Automatic metadata tagging (with filing and renaming) including artwork

MilaNet

The following script and helper function automate metadata tagging: show name, season number, episode number, episode description, and artwork in a square format suitable for iDevices. The script also converts everything to mp4 (transcoding only when necessary, which is rare; it checks the codecs itself), renames the files to a Plex-friendly format, and moves them to an appropriate location. I run it automatically on whatever shows up in my Downloads folder.
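
As a rough illustration of the "transcode only when necessary" decision, here is a minimal Python sketch. It assumes the text that ffmpeg prints for `ffmpeg -i FILE` has already been captured; the function name `choose_codecs` and the sample text are made up for this example and are not part of the script.

```python
def choose_codecs(probe_text):
    """Return (video_codec, audio_codec) values to pass to ffmpeg.

    probe_text is assumed to be the text ffmpeg prints for `ffmpeg -i FILE`.
    """
    # Copy H.264 video as-is; anything else gets re-encoded to H.264.
    video = "copy" if "h264" in probe_text else "h264"
    # Copy AAC or AC3 audio as-is; anything else gets re-encoded to AAC.
    audio = "copy" if ("aac" in probe_text or "ac3" in probe_text) else "aac"
    return video, audio


# Hypothetical probe text, shaped roughly like ffmpeg's stream summary.
sample = "Stream #0:0: Video: mpeg4, yuv420p\nStream #0:1: Audio: mp3, 44100 Hz"
print(choose_codecs(sample))
```

Like the script, this is a plain substring check, so a file whose name happens to contain "h264" or "aac" would also match.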

To make it work, you will need to install:
1. ffmpeg
2. tvnamer (which also installs the tvdb_api Python module)
3. AtomicParsley (currently the only metadata tagging utility that supports the artwork metadata)
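
Before running the script, it may help to confirm that all three tools are on your PATH. A small sketch, assuming Python 3.3 or later (for `shutil.which`); the function name `missing_tools` is just for illustration:

```python
import shutil

# The three command-line tools listed above.
REQUIRED_TOOLS = ["ffmpeg", "tvnamer", "AtomicParsley"]


def missing_tools(names=REQUIRED_TOOLS):
    """Return the names that cannot be found on the current PATH."""
    return [name for name in names if shutil.which(name) is None]


missing = missing_tools()
print("Missing tools:", ", ".join(missing) if missing else "none")
```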

This was developed on macOS; file locations may differ on other operating systems. The bash script is named convert_video and contains:

#!/bin/bash

if [ "$#" -eq 1 ]; then
    TR_TORRENT_DIR="$HOME/Downloads/"
    TR_TORRENT_NAME="$1"
    #If an argument is passed to the script, it sets the values of the variables
    #TR_TORRENT_DIR and TR_TORRENT_NAME; otherwise, the script inherits
    #them from transmission-daemon
fi

TR_DOWNLOADS="$TR_TORRENT_DIR/$TR_TORRENT_NAME"
DEST_DIR="/Volumes/Vault/" #destination for fully processed video files
ART_DIR="$HOME/Pictures/" #directory for album artwork

function process_file() {
    filename=$(basename "$1")
    filename="${filename%.*}"
    #capture the filename without the path
    cd "$HOME/Downloads" > /dev/null 2>&1 || return 1
    current_time=$(date "+%Y.%m.%d-%H.%M.%S")
    mkdir "$current_time"
    #create a working directory which we will delete later
    probe=$(ffmpeg -i "$1" 2>&1)
    #ffmpeg prints the stream information on stderr, so capture it once
    if echo "$probe" | grep -q "h264"; then
        video_codec="copy"
    else
        video_codec="h264"
    fi
    if echo "$probe" | grep -qE "aac|ac3"; then
        audio_codec="copy"
    else
        audio_codec="aac"
    fi
    #use ffmpeg to diagnose the codecs for sound and video inside the file
    ffmpeg -i "$1" -hide_banner -loglevel quiet -y \
        -c:v $video_codec -c:a $audio_codec \
        "$current_time/$filename.mp4"
    #process the file into the .mp4 format
    filename=$(tvnamer "$current_time/$filename.mp4" | \
        grep "New filename: " | cut -d: -f2- | cut -c 2-)
    #rename the file, and capture the new filename
    regex="(.*) S([0-9]{2})E([0-9]{2}) (.*)\.mp4"
    #use a regex to capture the show, season, episode, and title
    if [[ "$filename" =~ $regex ]]; then
        show_name=${BASH_REMATCH[1]}
        season_no=$((10#${BASH_REMATCH[2]}))
        episode_no=$((10#${BASH_REMATCH[3]}))
        episode_title=${BASH_REMATCH[4]}
        #assign show, season, episode, and title to variables
        tvdb_info="$(/usr/local/bin/gettvdbinfo.py "$show_name" $season_no $episode_no)"
        #get the episode synopsis and unique season ID from theTVdb
        season_id=${tvdb_info##*@}
        descr=${tvdb_info%@$season_id}
        #assign the values to variables
        if [ ! -e "$ART_DIR$show_name $season_no.jpg" ] ; then
            curl -s "http://squaredtvart.tumblr.com/search/$season_id" | \
            grep -oE 'src="[^"]*\.(png|jpe?g)"' | cut -d\" -f2 | sed 's/250/1280/' | \
            while read -r l; do curl -s "$l" -o "$ART_DIR$show_name $season_no.jpg"; done
        fi
        #if we don't already have it, get the squared tv art from the tumblr for that.
        AtomicParsley "$current_time/$filename" -H "$show_name" \
            -U "$season_no" -N "$episode_no" \
            -S "TV Show" --tracknum "$episode_no" \
            --title "$episode_title" \
            --artwork "$ART_DIR$show_name $season_no.jpg" \
            --description "$descr" --overWrite
        #use AtomicParsley to apply all the metadata tagging
    fi
    mkdir -p "$DEST_DIR$show_name"
    #make a directory for the show, if needed
    cp "$current_time/$filename" "$DEST_DIR$show_name/$filename"
    #copy it there (the working copy is deleted below)
    SetFile -a E "$DEST_DIR$show_name/$filename"
    #suppress display of the file extension
    rm -fr "$current_time"
    #clean up after yourself
}

if [ -d "$TR_DOWNLOADS" ]; then
    IFS=$(echo -en "\n\b")
    for f in $(find "$TR_DOWNLOADS"); do
        if ! [ -d "$f" ]; then
            case "$f" in
            *.mkv | *.mp4 | *.avi | *.m4v )
                    process_file "$f"
                    ;;
                    #just work on files with video extensions
            esac
        fi
    done;
    unset IFS
    #if the video data is in a folder, just do stuff to the actual video file
else
    process_file "$TR_DOWNLOADS"
    #otherwise just process it where it is
fi
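
For anyone who wants to check the filename pattern outside of bash, here is a minimal Python sketch of the same regex. It assumes tvnamer is configured to produce names of the form "Show S01E02 Title.mp4", which is what the script's regex expects; the function name `parse_episode_filename` and the sample filename are made up for illustration. Unlike the bash `=~` test, this version anchors the pattern at both ends.

```python
import re

# Same pattern as the bash script, anchored to the whole filename.
EPISODE_RE = re.compile(r"^(.*) S([0-9]{2})E([0-9]{2}) (.*)\.mp4$")


def parse_episode_filename(filename):
    """Return (show, season, episode, title), or None if the name does not match."""
    match = EPISODE_RE.match(filename)
    if match is None:
        return None
    show, season, episode, title = match.groups()
    # int() drops leading zeros and keeps "00" as 0 (useful for specials).
    return show, int(season), int(episode), title


print(parse_episode_filename("Example Show S01E02 Pilot.mp4"))
```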

The helper script is written in Python and is named gettvdbinfo.py. The bash script expects to find it at /usr/local/bin/gettvdbinfo.py. It looks like this:

#!/usr/bin/python
# gettvdbinfo.py
SINGLE_QUOTE_MAP = {
    0x2018: 39, 
    0x2019: 39,
    0x201A: 39,
    0x201B: 39,
    0x2039: 39,
    0x203A: 39,
}

DOUBLE_QUOTE_MAP = {
    0x00AB: 34,
    0x00BB: 34,
    0x201C: 34,
    0x201D: 34,
    0x201E: 34,
    0x201F: 34,
}

import sys, tvdb_api

def convert_smart_quotes(text):
    #replace curly quotes and guillemets with plain ASCII quotes
    return text.\
    translate(DOUBLE_QUOTE_MAP).\
    translate(SINGLE_QUOTE_MAP)

t = tvdb_api.Tvdb()
#season and episode numbers are integer keys
episode = t[sys.argv[1]][int(sys.argv[2])][int(sys.argv[3])]
ov = episode['overview']
spacer = "@"
sid = str(episode['seasonid'])

ov = convert_smart_quotes(ov)
ov += spacer
ov += sid
print(ov)
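
The helper prints the synopsis and the season ID joined by "@", and the bash script splits that string at the last "@" (so an "@" inside the synopsis is harmless). A minimal Python 3 sketch of both halves of that round trip follows; the names `QUOTE_MAP` and `split_tvdb_info` and the sample strings are made up for illustration, and only a few of the quote characters from the maps above are included.

```python
# A subset of the quote maps above: code point -> ASCII replacement.
QUOTE_MAP = {
    0x2018: 39,  # left single quote  -> '
    0x2019: 39,  # right single quote -> '
    0x201C: 34,  # left double quote  -> "
    0x201D: 34,  # right double quote -> "
}


def convert_smart_quotes(text):
    """Replace curly quotes with their plain ASCII equivalents."""
    return text.translate(QUOTE_MAP)


def split_tvdb_info(tvdb_info):
    """Split 'synopsis@season_id' at the LAST '@' into (synopsis, season_id)."""
    description, _, season_id = tvdb_info.rpartition("@")
    return description, season_id


# Hypothetical helper output: a synopsis that itself contains an '@'.
line = convert_smart_quotes("\u201cMeet me @ noon,\u201d she said.") + "@12345"
print(split_tvdb_info(line))
```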