Absolute Series Scanner (for anime mainly)

ZeroQIZeroQI Members Posts: 1,158 ✭✭✭

I couldn't follow how the official (or even third-party) scanners work [statements like "elif len(paths) > 0 and len(paths[0]) > 0" left me unable to follow the loops accurately], and I was having scanner-induced issues with my HTTP AniDB Metadata Agent:

   . year in series not passed to the metadata agent

   . series with a semicolon (steins;gate) truncated before the semicolon

   . subfolders or series grouped in parent folder not being handled

   . roman episode numbers not processed

   . no management for multiple series episodes in one folder

   . no log files like a metadata agent has

   . no skipped files list
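
Two of the points above, the year not reaching the agent and titles being cut at the semicolon, come down to how the folder name is parsed. A minimal sketch, assuming a trailing "(2011)" or "[2011]" marks the year; `split_title_and_year` and `YEAR_RE` are illustrative names, not taken from the scanner source:

```python
import re

# A four-digit year in round or square brackets at the very end of the name.
YEAR_RE = re.compile(r"\s*[\(\[](?P<year>(?:19|20)\d{2})[\)\]]\s*$")


def split_title_and_year(folder_name):
    """Return (title, year); year is an int, or None when no year is found."""
    match = YEAR_RE.search(folder_name)
    if match is None:
        return folder_name.strip(), None
    # Everything before the bracketed year is the title; ';' is left untouched.
    return folder_name[:match.start()].strip(), int(match.group("year"))


print(split_title_and_year("Steins;Gate (2011)"))
```

Because the title is sliced rather than split on punctuation, a semicolon inside the name survives.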


Therefore the only option left was to write a scanner along the lines of BABS, with source code I could read.

You can find it here: https://github.com/ZeroQI/Absolute-Series-Scanner/blob/master/Series/Absolute Series Scanner.py


The scanner works brilliantly for me (all points above are implemented), including the log generation.

I needed to grant full write access on the folder "/homes/plex" to the users group on my Synology box (XPEnology).


However, I need some assistance, if possible, to test the log functionality, find potential bugs, and get feedback.

It will (normally) create a "Plex Media Scanner Custom.log" file and also a "Plex Media Scanner Custom - Skipped files.log" file (for me in /homes/plex).


Can you tell me your OS version, the path to the logs, and any changes that were needed to enable write access?  Could you also attach both:

   . "Plex Media Scanner Custom.log"

   . "Plex Media Scanner Custom - Skipped files.log"


Does anybody know whether a scanner can be included in the unsupported appstore? If so, what format/naming does the repository need to have?



  • ZeroQIZeroQI Members Posts: 1,158 ✭✭✭
    edited July 2014

    With BABS OP-ED: 406 series, 8764 episodes recognised, split across 27 folders (0..9, A to Z) added one by one to the library

    With Absolute Series Scanner: 449 series (a bug left 2 series with no title; 450 after the bug correction), 8969 episodes recognised

    => most are automatically recognised movies

    Attached are the logs I got from the scanner...


       . Synology:  Logs folder "/volume1/Plex/Library/Application Support/Plex Media Server/Logs/" and  current folder "/volume1/homes/plex"

       . FreeNAS 9.2:  Logs folder "/usr/pbi/plexmediaserver-amd64/plexdata/Plex Media Server/Logs"

  • jotaro0010jotaro0010 Members, Plex Pass Posts: 20 Plex Pass

    I am using FreeNAS 9.2 and installed Plex through the plugins.

    The log directory is "/usr/pbi/plexmediaserver-amd64/plexdata/Plex Media Server/Logs"

    I created the directory /homes/plex/ and gave it 777 permissions.

    When I ran the scanner it didn't pick up any anime.

    I checked "/homes/plex" and "/usr/pbi/plexmediaserver-amd64/plexdata/Plex Media Server/Logs" but there were no log files.

  • ZeroQIZeroQI Members Posts: 1,158 ✭✭✭
    edited August 2015

    Thanks for the prompt feedback.

    I am unsure where the logs get created, or whether they get created at all (for example if write rights are missing): after first using a fixed path, I switched to the current directory, which I have no influence on, and it worked for me.

    Plex Media Scanner Custom.log will record the following each time the scanner is run. For my Synology, it gives:

    os.uname():   'LinuxNAS3.2.40#15 SMP Sun Nov 17 00:44:13 CET 2013x86_64'
    Sys.platform: 'linux2'
    os.name:      'posix'
    os.getcwd:    '/volume1/homes/plex'
    For FreeNAS 9.2:
    os.uname():   'FreeBSDplexmediaserver_19.1-RELEASEFreeBSD 9.2-RELEASE-p4 #0 r262572+17a4d3d: Wed Apr 23 10:09:38 PDT 2014     root@build3.ixsystems.com:/tank/home/jkh/build/9.2.1/freenas/os-base/amd64/fusion/jkh/9.2.1/freenas/FreeBSD/src/sys/FREENAS.amd64amd64'
    Sys.platform: 'freebsd9'
    os.name:      'posix'
    os.getcwd:    '/'
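
A header like the one above could be produced along these lines. This is only a sketch, not the scanner's own code: `environment_header` is a hypothetical helper, and it uses `platform.uname()` instead of `os.uname()` because the latter is not available on Windows.

```python
import os
import platform
import sys


def environment_header():
    """Return the lines a log could start with to identify the host."""
    return [
        "platform.uname(): %r" % (" ".join(platform.uname()),),
        "sys.platform:     %r" % (sys.platform,),
        "os.name:          %r" % (os.name,),
        "os.getcwd():      %r" % (os.getcwd(),),
    ]


for line in environment_header():
    print(line)
```

The current working directory is the interesting part: it is where a log opened with a relative filename ends up.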


    The scanner will go in ~/Library/Application Support/Plex Media Server/Scanners/Series/Absolute Series Scanner.py

    The folders "Scanners" and, inside it, "Series" will have to be created...



    I used the current folder to write the log because I could not know the absolute path for every OS, and I want this scanner to work everywhere. The log may therefore not be present on some OSes.
    If that is the case, put an absolute path with write access in the log function by replacing:
    with open(filename, 'a')
    with a version that uses an absolute path, and post the resulting file. It should look like this (again, on my Synology box; the path varies with the OS):
    with open("/volume1/Plex/Library/Application Support/Plex Media Server/Logs/" + filename, 'a') as file:
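
Instead of hard-coding one path, the log function could try several folders and keep the first one that accepts a write. A rough sketch under that assumption; `append_log` and its `candidate_dirs` parameter are hypothetical and not part of the scanner, and a temporary folder stands in for a real Logs folder:

```python
import os
import tempfile


def append_log(filename, message, candidate_dirs):
    """Append message to filename in the first writable candidate folder.

    Returns the full path that was used, or None if every folder failed.
    """
    for directory in candidate_dirs:
        path = os.path.join(directory, filename)
        try:
            with open(path, "a") as log_file:
                log_file.write(message + "\n")
            return path
        except (IOError, OSError):
            continue  # missing folder or no write access: try the next one
    return None


# Demo: the first candidate does not exist, so the second one is used.
demo_dir = tempfile.mkdtemp()
used_path = append_log(
    "Plex Media Scanner Custom.log",
    "scanner started",
    [os.path.join(demo_dir, "missing"), demo_dir],
)
print(used_path)
```

On a real install the candidates would be the Plex Logs folder for that OS followed by the current directory.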

    If you run into issues and logs are generated, post the following:

       . Logs/Plex Media Scanner.log that will show if my code is buggy

       . Plex Media Scanner Custom.log (it will include current working directory)
       . Plex Media Scanner Custom - Skipped files.log
    I made some amendments and will post the results file again. I like the look: one line per file entry with what it matched, plus the list of the files that didn't make the cut...
  • jotaro0010jotaro0010 Members, Plex Pass Posts: 20 Plex Pass

    I guess that when I downloaded the file through wget, a few characters in the comment on line 24 got changed to Unicode. Once I deleted the Unicode characters, it worked.

    BABS finds 319 Anime

    This script finds 313 Anime

    Nisekoi is scanned in as Ni Koi

    Blood+ scans in as Blood: The Last Vampire

    metadata can't be found for:


    Read Or Die TV



    None of the episodes in the skipped files log show up in Plex.

    I have attached the logs and I will update if i find anything else weird

  • ZeroQIZeroQI Members Posts: 1,158 ✭✭✭
    edited July 2014

    Thanks a lot for the update and for attaching the logs, I was looking forward to that.

    It is tricky to develop alone in your corner with no feedback...

    The current folder for the scanner is '/'. Were 'Plex Media Scanner Custom.log' and 'Plex Media Scanner Custom - Skipped files.log' both in the root folder, or were they in /usr/pbi/plexmediaserver-amd64/plexdata/Plex Media Server/Logs?
    Looking at the scanner logs, here are the series names: 'Ckbc', 'Read or Die Tv', 'Mnemosyne', 'Steins Gate', 'Nisekoi' (one of the corrected bugs). For "Blood+" the '+' gets removed, possibly by the Plex file cleanup; I need to make sure it stays and will test without the cleanup.

    All are valid titles or synonyms that should find a match, but the metadata is handled by the agent.

    I would need "com.plexapp.agents.hama.log" (or .1 to .5, depending on which one mentions the series above) to narrow down why it didn't catch them (better if done with debug enabled).

    I went through your skipped list, and you seem to use the AniDB Perl renamer, as all your naming is perfect.

    Since your series name matches in both the filename and the folder name, it should always take the folder as the series name.

    "Sayonara Zetsubou Sensei - Preface (recap episode)." is the only file that should be skipped with the new version, since "Preface" exists and cannot be matched to a special numbering or an episode number.

    I had made a mistake that caused "SE", "TS" and other terms to be removed even when they were only substrings of a longer word, and a recent update prevented movies from being detected. Also, the AniDB matching caught anything beginning with s, ED or OP, which was far from good... and I can see it misread most of the skipped files... That was corrected, but I forgot to upload it yesterday.
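
The substring problem can be illustrated with word boundaries. This is a sketch rather than the scanner's actual cleanup: the tag list is reduced to the two terms named above, and `naive_strip` and `strip_tags` are hypothetical names.

```python
import re

TAGS = ["SE", "TS"]  # only the two terms named in the post; the real list is longer

# \b anchors each tag to word boundaries so it cannot match inside a title.
TAG_RE = re.compile(r"\b(?:%s)\b" % "|".join(re.escape(tag) for tag in TAGS),
                    re.IGNORECASE)


def naive_strip(name):
    """Buggy variant: removes the tags wherever they occur, even inside words."""
    cleaned = re.sub("|".join(TAGS), " ", name, flags=re.IGNORECASE)
    return " ".join(cleaned.split())


def strip_tags(name):
    """Remove the tags only when they stand alone as whole words."""
    return " ".join(TAG_RE.sub(" ", name).split())


print(naive_strip("Nisekoi"))
print(strip_tags("Nisekoi - 01 TS"))
```

With the naive version a title such as Nisekoi loses its "se" and comes out as "Ni koi"; the boundary-anchored version leaves the title alone and only drops the standalone tag.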

    I have uploaded a newer version on GitHub, and I have also attached it here (remove ".txt"). Please try it out.

    I need a function to translate these Unicode and other characters accurately (like the 'O' in CØDE:BREAKER), but I am struggling to find one...

  • jotaro0010jotaro0010 Members, Plex Pass Posts: 20 Plex Pass
    edited July 2014

    I did a little digging for that Unicode problem and I think this looks promising. http://stackoverflow.com/questions/175240/how-do-i-convert-a-files-format-from-unicode-to-ascii-using-python

    Specifically the unicodedata.normalize function.

    The article from the comments of the answer looks good for the odd characters like Æ, Ð, Ø, Þ, æ, ð, ø, ß, and þ: http://effbot.org/zone/unicode-convert.htm

    You will probably have to add this line to the top of the script for Unicode to work:

    # -*- coding: utf-8 -*-
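
To make the suggestion concrete, here is a minimal sketch of that approach. `unicodedata.normalize` splits accented letters into a base letter plus a combining mark, but characters such as Ø have no decomposition, so they need an explicit table. The `EXTRA` table and `to_ascii` are illustrative, and the chosen replacements are an assumption, not an official transliteration.

```python
# -*- coding: utf-8 -*-
import unicodedata

# Characters that NFKD cannot decompose; the replacements are a judgment call.
EXTRA = {
    u"Ø": u"O", u"ø": u"o", u"Æ": u"AE", u"æ": u"ae", u"Ð": u"D",
    u"ð": u"d", u"Þ": u"Th", u"þ": u"th", u"ß": u"ss",
}


def to_ascii(text):
    """Return an ASCII approximation of a unicode string."""
    replaced = u"".join(EXTRA.get(char, char) for char in text)
    decomposed = unicodedata.normalize("NFKD", replaced)
    # Dropping non-ASCII bytes removes the combining accents left by NFKD.
    return decomposed.encode("ascii", "ignore").decode("ascii")


print(to_ascii(u"CØDE:BREAKER"))
```

Anything that is neither in the table nor decomposable is silently dropped, so the table has to cover every character the library actually contains.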

    I did not see the Logs in the root, "/home/plex", or the "/usr/pbi/plexmediaserver-amd64/plexdata/Plex Media Server/Logs" folders.

    I had to add this above line 122 to get the logs working.

    filename = "/home/plex/" + filename
    with open(filename, 'a')

    I also noticed the name for Zetsubo was wrong so I fixed that. I actually don't use the perl anidb renamer. I wrote one myself as a python learning exercise and I have been using it. I actually posted it on github if you are interested in taking a look at it. I wouldn't suggest using it as it is buggy and incomplete. I don't have the time right now to maintain it but someday I will rewrite it.

    Here is the link https://github.com/jotaro0010/pyanisort.

    Also, Nisekoi does work now. The series that had metadata problems before still have the same problems.

    I have attached the new logs. I didn't know which hama log contained the problem shows so I'll post all 6

  • ZeroQIZeroQI Members Posts: 1,158 ✭✭✭
    edited July 2014

    Running the Roman numeral handling before the other regexes was flawed, and not my finest programming hour.

    I moved it and all the problematic titles i got from your files seem to work now.
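
The ordering issue can be sketched as follows: try plain digits first and only fall back to Roman numerals for tokens that are strictly valid numerals. This is an illustration, not the code in the gist; `roman_to_int` and `episode_number` are hypothetical names.

```python
import re

ROMAN_VALUES = {"I": 1, "V": 5, "X": 10, "L": 50, "C": 100, "D": 500, "M": 1000}

# Strict pattern for 1..3999, so that ordinary words are rejected.
ROMAN_RE = re.compile(
    r"^M{0,3}(CM|CD|D?C{0,3})(XC|XL|L?X{0,3})(IX|IV|V?I{0,3})$")


def roman_to_int(token):
    """Return the value of a Roman numeral, or None if token is not one."""
    token = token.upper()
    if not token or not ROMAN_RE.match(token):
        return None
    total = 0
    for index, char in enumerate(token):
        value = ROMAN_VALUES[char]
        following = ROMAN_VALUES.get(token[index + 1:index + 2], 0)
        # A smaller numeral before a larger one is subtracted (IV = 4).
        total += -value if following > value else value
    return total


def episode_number(token):
    """Digits win; Roman numerals are only a fallback."""
    if token.isdigit():
        return int(token)
    return roman_to_int(token)


print(episode_number("XIV"))
```

Even the strict pattern accepts real words such as "MIX" (1009), which is one reason to run this check last and only on the token where an episode number is expected.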

    Uploaded to https://gist.github.com/ZeroQI/11b6036e16adb424b938 and attached to this post

    For the agent:

       . the supposedly optional data folders are in fact necessary: "XMLs" and all the poster folders seem to be missing, which generates a few errors. I need to see if I can make the caching optional

       . there is another bug I need to fix, but the error points to the wrong line... I will fix it once I have fixed the current issues in the scanner

    However, I couldn't find the series with missing metadata in the logs, as only 5 logs are kept and you have a sizeable collection of anime...

    Let me know if this new scanner works better for you (created test files from your logs just for that)...

  • jotaro0010jotaro0010 Members, Plex Pass Posts: 20 Plex Pass

    Sweet, it didn't skip any episodes this time; there was no skipped files log. I will post the other log in case you want to see it. I added 'All' to the beginning of the name to show it holds the logs for my entire anime collection.

    I realize I have a lot of anime so I copied the problem anime to their own folder called PlexTesting (The List: Blood+, CKBC, Kaiba, Kishin Heidan, Mnemosyne, Read Or Die TV, and Steins;Gate). I also added a couple that worked like Kaiba, Black Lagoon, and 3x3 Eyes. I deleted the existing hama and scanner logs and scanned the test folder to generate new logs that are hopefully easier to look at.  I will attach them with 'PlexTest' added to the beginning of the file name.

    Thanks for taking the time to make this. I am sure it will be great when finished :D.

  • ZeroQIZeroQI Members Posts: 1,158 ✭✭✭
    edited July 2014

    Thanks a lot for that, you are a star and my only tester so far :D

    Your naming convention is so clean that everything gets picked up, including endings and openings.
    The agent log shows "TypeError: not enough arguments for format string" after "getImagesFromTVDB - Item number: 2, posters: 0, Poster ids: ".
    I will now work on the agent part to debug these errors. They could be linked to caching, so I will release a new version to solve that.
    The official log, Plex Media Scanner.log, can be useful to check for errors (if there are any, they cut across the timestamps, so they are easy to spot):
       . if there is no error in there, no need to include it
       . if there is an error in there, then the scanner crashed and a bunch of things were not scanned...
  • jotaro0010jotaro0010 Members, Plex Pass Posts: 20 Plex Pass

    I deleted the existing Plex Media Scanner.log files and recreated the Plex library for the small plextest folder so I could be sure I checked the right Plex Media Scanner.log file for errors. There were none, and it somehow grabbed the data for Read or Die TV and Mnemosyne. Steins;Gate, CKBC, and Kishin Heidan are the only shows not grabbing metadata. I did not see any errors in the Plex Media Scanner.log file, but there were a few in the hama log, so I will post them anyway in case you can find something.

    Hopefully more people are willing to test this scanner for you. Even with the three metadata problems this scanner works better than BABS.

  • ZeroQIZeroQI Members Posts: 1,158 ✭✭✭
    edited July 2014

    Scanner: no crash, and the custom log seems to show everything is picked up with no issue.

    Agent issues:

    "TypeError: not enough arguments for format string" also got corrected some time ago if i remember well

    strptime: this error occurs because the time function is launched in concurrent threads, which forced me to add locking in the AniDB XML processing function. It will be slower, but it will respect the 2 s delay per request to limit temporary bans by AniDB. It also uses a local cache, so if banned it uses the last saved XML...
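
The locking described here can be sketched with a `threading.Lock` that serialises the calls and enforces a minimum gap between them. This is a minimal sketch assuming a 2-second interval; the `Throttle` class is hypothetical and not the agent's code, and the injectable `clock` and `sleep` arguments exist only so the behaviour can be checked without waiting.

```python
import threading
import time


class Throttle(object):
    """Run calls one at a time, at least min_interval seconds apart."""

    def __init__(self, min_interval, clock=time.time, sleep=time.sleep):
        self._lock = threading.Lock()
        self._min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last_call = None

    def call(self, func, *args, **kwargs):
        # The lock keeps concurrent threads from overlapping their requests.
        with self._lock:
            if self._last_call is not None:
                wait = self._min_interval - (self._clock() - self._last_call)
                if wait > 0:
                    self._sleep(wait)
            try:
                return func(*args, **kwargs)
            finally:
                self._last_call = self._clock()


# Demo with a fake clock so nothing actually sleeps.
now = [0.0]
slept = []


def fake_sleep(seconds):
    slept.append(seconds)
    now[0] += seconds


throttle = Throttle(2.0, clock=lambda: now[0], sleep=fake_sleep)
first = throttle.call(lambda: "first request")
second = throttle.call(lambda: "second request")
print(slept)
```

Every thread then goes through the same `Throttle` instance, which is why the whole scan gets slower but the requests stay spaced out.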


    [Errno 2] No such file or directory

       . /usr/pbi/plexmediaserver-amd64/plexdata/Plex Media Server/Plug-in Support/Data/com.plexapp.agents.hama/DataItems/XMLs/

       . /usr/pbi/plexmediaserver-amd64/plexdata/Plex Media Server/Plug-in Support/Data/com.plexapp.agents.hama/DataItems/TVDB/graphical/


       . [...]

    Go into "/usr/pbi/plexmediaserver-amd64/plexdata/Plex Media Server/Plug-in Support/Data/com.plexapp.agents.hama/DataItems/" and create the following folders in it:

    • "AniDB"
    • "Plex"
    • "TVDB/_cache/fanart/original"
    • "TVDB/fanart/original"
    • "TVDB/fanart/vignette"
    • "TVDB/graphical"
    • "TVDB/posters"
    • "TVDB/seasons"
    • "TVDB/seasonswide"
    • "TVDB/text"
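
Creating the folders by hand works, but the same list can be created in one go. A small sketch using `os.makedirs`; `create_data_folders` is a hypothetical helper, and a temporary folder stands in for the real "DataItems" path, which depends on the install:

```python
import os
import tempfile

SUBFOLDERS = [
    "AniDB",
    "Plex",
    "TVDB/_cache/fanart/original",
    "TVDB/fanart/original",
    "TVDB/fanart/vignette",
    "TVDB/graphical",
    "TVDB/posters",
    "TVDB/seasons",
    "TVDB/seasonswide",
    "TVDB/text",
]


def create_data_folders(data_items_dir):
    """Create every missing subfolder and return the paths that were created."""
    created = []
    for subfolder in SUBFOLDERS:
        path = os.path.join(data_items_dir, *subfolder.split("/"))
        if not os.path.isdir(path):
            os.makedirs(path)  # also creates the intermediate folders
            created.append(path)
    return created


# Demo: a throwaway folder stands in for the real DataItems folder.
demo_root = tempfile.mkdtemp()
first_run = create_data_folders(demo_root)
second_run = create_data_folders(demo_root)
print(len(first_run), len(second_run))
```

Running it a second time creates nothing, so it is safe to call on every start.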

    Also, I decommissioned "XMLs" in the source quite some time ago to split it per source, so you do have an older HAMA plugin.

    To reward those who look here, here is a fresh zip archive. It contains:
       . "Absolute Series Scanner". You may need to correct the Log function to point to an absolute path where you have write access; if you do, you will get a normal log and a skipped files log
       . "HTTP AniDB Metadata Agent", modified to use locking between threads and smaller log files, so I am more likely to find the issues
       . data folders + empty HTML report files (to update databases or see missing episodes)
    If files are not showing in Plex, the scanner alone is responsible.
    "Plex Media Scanner.log" will show if the scanner crashed, as the error will cut across the timestamps. If so, include this log and I will correct the scanner source code.
    If it didn't crash, send these two logs (you may need to set the path to a folder with write access):
       . Plex Media Scanner Custom.log
       . Plex Media Scanner Custom - Skipped files.log, which will list all files that didn't make the cut. Send that if you believe a file was skipped unjustly, and specify which files should have been processed
    If the series and files are showing but with no data, the metadata agent is responsible.
    Make sure you created the folders in the plugin data folder, and include this log with the symptoms:
       . "com.plexapp.agents.hama.log"
    This is for everybody but jotaro0010, as he gives me the right files: delete the logs before a new scan so I only have relevant information. Thank you jotaro0010, I made good progress thanks to you. Anybody is free to have a look at the code. If you only have C, C++ or even PHP experience, you will have no problem; I had never touched Python before taking over HAMA.
  • jotaro0010jotaro0010 Members, Plex Pass Posts: 20 Plex Pass

    Oh, I didn't realize I was using an old version of HAMA. When I updated using the zip you just posted, it found everything except Kishin Heidan on its first try. When I used fix incorrect match on Kishin Heidan, it found the metadata using the HAMA agent.

    I am glad I could help  :D. If there is anything else you want me to test for you feel free to pm me. I would be happy to help.

  • ZeroQIZeroQI Members Posts: 1,158 ✭✭✭
    Well, I would love to know how it scales compared to the BABS scanner. Have a look at the HTML logs: you even get a list of missing files!!! If you have ideas for improvement, like the collection setting or auto downloads, or have encountered bugs, I would like to know...
  • jotaro0010jotaro0010 Members, Plex Pass Posts: 20 Plex Pass

    Ok when you asked for the comparison to BABS  I noticed the number was smaller than my total anime collection. 

    BABS found 319

    Absolute Series Scanner found 327

    My total Anime 329

    Two shows get messed up, so I had to spend a bit of time figuring out which they were.

    The first show is "Ninin ga Shinobuden The Nonsense Kunoichi Fiction". It doesn't get scanned at all.

    The second is "Shinryaku! Ika Musume 2". This one is scanned in as "Shinryaku! Ika Musume".

    I put these two shows and the first season of Ika Musume in a Plex test folder, deleted the logs, and scanned the library in. The first show did not show up, and Ika Musume's second season combines with the first.

    It seems like the hama log was not made when I did all this so I will only be able to attach the Plex Custom and the Plex Scanner logs.

    Also I looked at the html file for the missing episodes and it seems like I just don't have those files. That list will be helpful when I go grab the ones I am missing. The only one that is wrong is the missing episodes for Dragon Ball and Dragon Ball Z as I decided to put those into season folders instead of absolute numbering. I will have the TVDB grab those series.

  • ZeroQIZeroQI Members Posts: 1,158 ✭✭✭
    edited July 2014

    Thanks for the logs. I noticed that when a scanner is modified, you need to recreate the library; otherwise the folder title can differ from what the new version would have given... or you have to manually correct the title and match the series.

    We may have a vocabulary issue there.

    "Ninin ga Shinobuden The Nonsense Kunoichi Fiction" scans properly (with the scanner), but as the series title in the episode filenames is not contained in the folder title, the scanner sticks with the filename title, so "Ninja Nonsense The Legend of Shinobu" is taken. That doesn't match properly, but you should be able to fix it by renaming the episodes to match the title, or by manually matching the series in Plex, searching for the folder name...

    For "Shinryaku! Ika Musume 2" the title is given properly by the scanner as per the log, so it must be an agent issue.

    Try to restart Plex and to create the log. It is possible that it merges it with the first one, so go and split the series. I would need the agent logs.

    If I can discover 100% of what you have, I will assume the scanner is good, as I want it to be the best it can be...

  • jotaro0010jotaro0010 Members, Plex Pass Posts: 20 Plex Pass

    I renamed the ninja show to 2x2 Shinobuden, which is a name listed in AniDB. It still doesn't scan in; I don't even see it in Plex, so I can't do anything with it there. I tried using BABS to scan the test folder, and it finds the show and gets the metadata. This seems like a scanner problem to me.

    I have logs that show info for Ika Musume, and I will post them. I can split the show and fix the incorrect match, but I figured you would want to try and debug what is wrong first.

  • ZeroQIZeroQI Members Posts: 1,158 ✭✭✭
    edited July 2014

    Hi, sorry for doubting you. I had to make sure: I didn't get any error and the custom log did look good, but the Plex Media Scanner.log indicated an error in my code:

    Jul 13, 2014 10:21:03 [0x807007400] DEBUG - Adding file for scanner: /mnt/Media/Videos/plex test/2x2 Shinobuden/poster.jpg
    Jul 13, 2014 10:21:03 [0x807007400] ERROR - Error in Python: Running scanner:
    Traceback (most recent call last):
      File "/usr/pbi/plexmediaserver-amd64/plexdata/Plex Media Server/Scanners/Series/Absolute Series Scanner.py", line 319, in Scan
        Log("show: '%s', year: '%s', season: '%s', ep: %s found using episode_re_search on cleaned string '%s' gotten from filename '%s' also ep_nb: '%s'" % (show, xint(year), xint(season), xint(episode), ep, filename), ep_nb)
    TypeError: not enough arguments for format string
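
The traceback shows seven `%s` placeholders but only six values inside the tuple, with `ep_nb` left outside the closing parenthesis. A reduced reproduction with three placeholders; `broken_message` and `fixed_message` are illustrative names and the template is shortened from the one in the log:

```python
TEMPLATE = "show: '%s', ep: %s, also ep_nb: '%s'"  # three placeholders


def broken_message(show, episode, ep_nb):
    # Bug: the tuple closes before ep_nb, so "%" only receives two values.
    return (TEMPLATE % (show, episode), ep_nb)


def fixed_message(show, episode, ep_nb):
    # Fix: every placeholder gets its value inside the same tuple.
    return TEMPLATE % (show, episode, ep_nb)


try:
    broken_message("2x2 Shinobuden", 1, "01")
    error_text = None
except TypeError as error:
    error_text = str(error)

print(error_text)
print(fixed_message("2x2 Shinobuden", 1, "01"))
```

Because the error only fires when that particular log line is reached, it can stay hidden until a file takes that code path.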

    Now for the series mismatch: it is named and detected correctly in the custom logs... In the normal scanner log:

    Jul 13, 2014 10:21:04 [0x807007400] DEBUG -  -> We found a local media item with rooted metadata in Shinryaku! Ika Musume

    It could be an issue with the series id, due to a previous bug in the source that may have been corrected since.

    I would need you to update to the new version and recreate the test library, as I believe this will solve it. If not, I will need the scanner, custom scanner and agent logs, as always.

    For the agent:

    2014-07-13 10:23:42,111 (80f417000) :  CRITICAL (storage:87) - Exception writing to /usr/pbi/plexmediaserver-amd64/plexdata/Plex Media Server/Plug-in Support/Data/com.plexapp.agents.hama/DataItems/TVDB/blank/195721.jpg (most recent call last):
      File "/usr/pbi/plexmediaserver-amd64/plexdata/Plex Media Server/Plug-ins/Framework.bundle/Contents/Resources/Versions/2/Python/Framework/components/storage.py", line 79, in save
        f = open(tempfile, mode)
    IOError: [Errno 2] No such file or directory: '/usr/pbi/plexmediaserver-amd64/plexdata/Plex Media Server/Plug-in Support/Data/com.plexapp.agents.hama/DataItems/TVDB/blank/._195721.jpg'

    The folder "blank" inside TVDB was missing. I corrected that.

  • jotaro0010jotaro0010 Members, Plex Pass Posts: 20 Plex Pass

    This update did fix the Ninja show, it successfully scanned in. However the first season of Ika Musume still scans in as the second season. I have posted the logs.

  • ZeroQIZeroQI Members Posts: 1,158 ✭✭✭
    edited July 2014

    That's weird, all the episode titles match series 2: http://anidb.net/perl-bin/animedb.pl?show=anime&aid=8294

    The series title is the one of series 2... it has '!?'

       . Series 1 main title: Shinryaku! Ika Musume

       . Series 2 main title: Shinryaku!? Ika Musume

    2014-07-13 19:30:45,773 (80636e400) :  DEBUG (__init__:596) - parseAniDBXml - AniDB title need no change: 'Shinryaku!? Ika Musume' original title: 'Shinryaku!? Ika Musume' metadata.title 'Shinryaku!? Ika Musume'
    They are scanned OK, scraped OK, labelled OK.
    The titles must match for them to be grouped, so it is possible that Plex removes '?' from the title and matches the two, but I cannot prevent that...
    I could replace '?' with another character ('!' ?), but that's not a nice solution...
    Thanks jotaro0010, I did quite a few fixes thanks to you. Does anybody else have feedback?
  • jotaro0010jotaro0010 Members, Plex Pass Posts: 20 Plex Pass

    This is just a wild guess, but the reason that season 2 gets picked may be that it is at the top of the list of possible options. When I do a fix incorrect match, Shinryaku!? Ika Musume is at the top and just underneath is Shinryaku! Ika Musume. If it picks the first option to grab the metadata for, then it would always pick the second season. Does this make sense?
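
Both guesses in this exchange can be illustrated in a few lines. This is purely a sketch: how Plex cleans titles and picks a candidate is not known from this thread, so `normalise` is an assumed stand-in for that cleanup and `pick_match` is a hypothetical selection rule.

```python
import re


def normalise(title):
    """Lowercase and drop punctuation (an ASSUMED stand-in for Plex's cleanup)."""
    return re.sub(r"[^a-z0-9 ]", "", title.lower()).strip()


def pick_match(wanted_title, candidates):
    """Prefer an exact title match; otherwise fall back to the first candidate."""
    for title in candidates:
        if title == wanted_title:
            return title
    return candidates[0] if candidates else None


# Order as described above: the second series is listed first.
candidates = ["Shinryaku!? Ika Musume", "Shinryaku! Ika Musume"]

print(normalise(candidates[0]) == normalise(candidates[1]))
print(pick_match("Shinryaku! Ika Musume", candidates))
```

Once punctuation is stripped the two titles are identical, so any rule that then takes the first candidate would land on the second series; comparing the untouched titles first avoids that.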

