[Release] HTTP Anidb Metadata Agent (HAMA)

Thanks for the response. The problem seems to have resolved itself now. Could it have been a server capacity issue?

Unlikely. I just had it happen to me, and it looks like the agent is basically being left running in a zombie state where it's not responding to requests from the main system. Killed the process and everything unblocked and started working again. I'll do some more tracing through and see if I can find out what's going on. Hard to reproduce though, since it seems to take time rather than being tied to specific input.

I'm loving this, but I'm having a problem with HAMA falsely recognizing TV shows as Movies. I have a few episodes of Karneval, for instance; when I try to load them as TV shows they don't show up, but when I have them set as movies and set the agent to HamaMovies it finds them all and condenses them into one movie. It also renames them all to Karneval (within Plex), so I don't know which is which.

Disregard, I'm an idiot. I had selected the wrong scanner.

I tried this with BABS but it would not work. My anime is set up as:

d:\Anime\Anime_name\anime_name - XX - ep_title [short_name_release_group].ext (e.g. D:\Anime\Escaflowne\Escaflowne - 26 - Eternal Love [G_P].mkv)

I don't have seasons since I used the AniDB mod scraper in XBMC, and it doesn't create seasons since AniDB doesn't have any.


This should work fine. Is Plex seeing the episodes and not populating metadata, or are you not even seeing the episodes?

If it's not seeing the episodes themselves then I can't help much - I'd suggest trying the BABS scanner thread. As far as I'm aware the file naming convention you're using should be fine.
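For reference, a naming convention like `Show - NN - Episode Title [Group].ext` can be matched with a regular expression along these lines (an illustrative sketch only, not the pattern BABS itself uses):

```python
import re

# Illustrative pattern for "Show - NN - Episode Title [Group].ext";
# this is a sketch, not the expression the BABS scanner actually uses.
EP_PATTERN = re.compile(
    r'^(?P<show>.+?) - (?P<ep>\d+) - (?P<title>.+?) \[(?P<group>[^\]]+)\]\.\w+$'
)

m = EP_PATTERN.match('Escaflowne - 26 - Eternal Love [G_P].mkv')
if m:
    print(m.group('show'), m.group('ep'), m.group('group'))
```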

If it's seeing the episodes but not matching them for pulling info, then can you locate the com.plexapp.agents.hama.log file in the [logs](http://wiki.plexapp.com/index.php/PlexNine_Tips_and_Tricks#Log_File_Locations) and attach it here and I'll take a look.


It's not seeing anything at all

Thanks for this scraper, works well for me and I finally have cover art for my anime :D

> It's not seeing anything at all

I just started getting a similar issue (must have picked up a system bundle update) and it looks like they changed something inside Plex and broke BABS. Fix here:

http://forums.plexapp.com/index.php/topic/31081-better-absolute-scanner-babs/?p=421065

Amazing job there... I will try to help, as I believe there are two XML files that could dramatically change the direction of this already great plug-in...

I believe that, due to the poster size restriction (680x1000), TheTVDB posters are better as the ratio is constant, but since the setting is changeable in Hama.bundle/Contents/DefaultPrefs.json, no problem:

[
{
"id": "GetTvdbFanart",
"label": "Attempt to fetch Fanart images from TVDB",
"type": "bool",
"default": "true"
},

{
"id": "GetTvdbPosters",
"label": "Attempt to fetch Poster images from TVDB",
"type": "bool",
"default": "true"
},

{
"id": "GetTvdbBanners",
"label": "Attempt to fetch series Banner images from TVDB",
"type": "bool",
"default": "true"
},

{
"id": "PreferAnidbPoster",
"label": "Prefer AniDB's poster over TVDB if TVDB lookup enabled",
"type": "bool",
"default": "true"
}
]
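Stray commas or word-processor quotes in DefaultPrefs.json can stop the settings page from showing, so it's worth validating the file after editing. Python's json module is a strict parser and catches both (a sketch on inline strings):

```python
import json

# A minimal valid version of one preference entry.
good = ('[{"id": "GetTvdbFanart", '
        '"label": "Attempt to fetch Fanart images from TVDB", '
        '"type": "bool", "default": "true"}]')
prefs = json.loads(good)
print(prefs[0]['id'])

# A trailing comma inside an object is not valid JSON; strict parsers reject it.
bad = '[{"id": "GetTvdbFanart",}]'
try:
    json.loads(bad)
except ValueError as e:
    print('invalid JSON:', e)
```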

I have 864 anime folders and some are not recognised despite correct names, but not that many (about 15%).

The removal of special characters accounts for some of the bad detections, but I need to look more in depth at the code... I would prefer a strict string comparison with the folder name, but maybe that's just me. Anyhow, you'll see where I am going...

Some time ago (before I used a Synology NAS with Plex integrated) I was playing with XBMC, and there was an AniDB plugin, and another one modded pretty much like you did... The guy created an XML to map AniDB and TheTVDB series and even map episodes together, to get the best of each database...

Here is the link to the thread: http://forum.xbmc.org/showthread.php?tid=64587&pid=1199733#pid1199733

Here is the XML format of anime-list-full.xml

<?xml version="1.0" encoding="utf-8"?>
<anime-list>

<anime anidbid="1" tvdbid="..." defaulttvdbseason="1">
<name>Seikai no Monshou</name>
<supplemental-info replace="true">
<studio>Sunrise</studio>
</supplemental-info>
</anime>

<anime anidbid="61" tvdbid="..." defaulttvdbseason="1">
<name>Chobits</name>
<mapping-list>
<mapping anidbseason="0" tvdbseason="0">;1-3;2-4;</mapping>
<mapping anidbseason="1" tvdbseason="0">;9-1;18-2;</mapping>
<mapping anidbseason="1" tvdbseason="1">;10-9;11-10;12-11;13-12;14-13;15-14;16-15;17-16;19-17;20-18;21-19;22-20;23-21;24-22;25-23;26-24;</mapping>
<mapping anidbseason="0" tvdbseason="1">;1-27;2-28;</mapping>
</mapping-list>
</anime>

[...]
</anime-list>

 
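The `;1-3;2-4;` strings pair an AniDB episode number with a TVDB episode number; a sketch of turning one into a lookup table (`parse_mapping` is a hypothetical helper, not part of the plug-in):

```python
def parse_mapping(mapping):
    """Turn a mapping string like ';1-3;2-4;' into {anidb_ep: tvdb_ep}."""
    table = {}
    for pair in mapping.strip(';').split(';'):
        anidb_ep, tvdb_ep = pair.split('-')
        table[int(anidb_ep)] = int(tvdb_ep)
    return table

print(parse_mapping(';1-3;2-4;'))  # {1: 3, 2: 4}
```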

On http://wiki.anidb.net/w/API there is an XML data dump [http://anidb.net/api/anime-titles.xml.gz], with XML looking like this:

<?xml version="1.0" encoding="UTF-8"?>
<animetitles>

<anime aid="1">
<title xml:lang="en" type="short">CotS</title>
<title xml:lang="en" type="official">Crest of the Stars</title>
<title xml:lang="de" type="official">Crest of the Stars</title>
<title xml:lang="fr" type="official">Crest of the Stars</title>
<title xml:lang="cs" type="official">Hvězdný erb</title>
<title xml:lang="x-jat" type="main">Seikai no Monshou</title>
<title xml:lang="x-jat" type="short">SnM</title>
<title xml:lang="zh-Hans" type="official">星界之纹章</title>
<title xml:lang="ja" type="official">星界の紋章</title>
</anime>

<anime aid="61">
<title xml:lang="ko" type="official">쵸빗츠</title>
<title xml:lang="ja" type="official">ちょびっツ</title>
<title xml:lang="en" type="official">Chobits</title>
<title xml:lang="x-jat" type="main">Chobits</title>
<title xml:lang="de" type="official">Chobits</title>
<title xml:lang="fr" type="official">Chobits</title>
<title xml:lang="es" type="official">Chobits</title>
<title xml:lang="it" type="official">Chobits</title>
<title xml:lang="ru" type="syn">Чобитс</title>
<title xml:lang="ru" type="official">Чобиты</title>
<title xml:lang="uk" type="official">Чобіти</title>
<title xml:lang="zh-Hans" type="syn">人形电脑天使</title>
</anime>

[...]
</animetitles>

We could copy both XML files locally, search anime-titles.xml for the titles, get the anime ID (aid), and use it to get the thetvdb.com mappings and thetvdb id, all with no internet queries at all. The scraping would then be much quicker without network access for AniDB (am I an idealist? no 2s delay, no ban...) and we'd get the best of thetvdb...
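The lookup chain described above can be sketched with Python's stdlib XML parser (`find_aid` and `find_tvdbid` are hypothetical helpers, the ids in the sample data are illustrative, and real matching would need the title cleansing the thread discusses rather than an exact comparison):

```python
import xml.etree.ElementTree as ET

# Sample data in the two files' formats; the ids here are illustrative only.
TITLES_XML = '''<animetitles>
  <anime aid="1"><title xml:lang="x-jat" type="main">Seikai no Monshou</title></anime>
</animetitles>'''
MAPPING_XML = '''<anime-list>
  <anime anidbid="1" tvdbid="74074" defaulttvdbseason="1"/>
</anime-list>'''

def find_aid(title, titles_root):
    """Exact-match a title against anime-titles.xml to get the AniDB id."""
    for anime in titles_root.findall('anime'):
        for t in anime.findall('title'):
            if t.text == title:
                return anime.get('aid')
    return None

def find_tvdbid(aid, mapping_root):
    """Resolve an AniDB id to a TVDB id via the anime-list-full.xml mapping."""
    for anime in mapping_root.findall('anime'):
        if anime.get('anidbid') == aid:
            return anime.get('tvdbid')
    return None

aid = find_aid('Seikai no Monshou', ET.fromstring(TITLES_XML))
print(aid, find_tvdbid(aid, ET.fromstring(MAPPING_XML)))
```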

BABS gives us the absolute numbering, so everything is there to make it happen. Link to BABS here:

http://forums.plexapp.com/index.php/topic/31081-better-absolute-scanner-babs/

I will see if I can find a way, but I've never done Python before, so it will be tricky. As the files are already there, that should greatly facilitate the job and concentrate all efforts on maintaining the mapping file.

Hope you all find the above information useful and that we can add this...

I agree on the TVDB posters being more consistent size, and personally I use them as well. The reason I don't set them as the default option in the preferences is that because of the asinine way that TVDB handles seasons in a lot of cases, you'll end up getting a poster for the wrong series, sometimes with completely different art etc.

The anidb titles XML is too big to efficiently parse locally, but that's actually how the name lookup is working at the moment - I'm running a database on my own server which has the anime titles XML parsed into a searchable database.

There's a couple of bugs in the TVDB lookup which causes it to miss some stuff, which I've fixed locally but haven't uploaded an updated version here yet as there's a couple of other things I want to fix or improve first. I'll have a look at this mapping file though, as it sounds pretty useful assuming it's kept up to date. However I think that I'll run into the same issue here - the XML will be too big and complex to parse locally. I may need to add it to the service I'm running and that would be complicated, so I'd rather think about ways to override the TVDB match instead in the rare cases where it misses the match. The other issue is that Plex metadata agents are kind of limited in the file I/O they can do. I thought about having a helper file inside the directory to override the TVDB stuff per-series or something, but I'm not sure yet whether it's possible.

The other thing to bear in mind with TVDB matching is that when we match, we look at all the alternative titles for the show as well, so often when you've got new seasons that have a different name (some random word or letter tacked onto the end etc) the match will fail unless TVDB's entry has been updated to accept that alternative title. For newer shows that can take a few weeks to happen, I've found.

Hi Atomicstrawberry. I was thinking about pre-processing the XML with a batch file to remove unneeded languages and shorts, but ended up doing it on the flat file... We are left with languages "en" and "x-jat": the "main" title (x-jat only), the "official" titles, and the "syn" titles in "en" and "x-jat".

Here is a batch file that takes the .dat file and recreates a 639KB file sorted alphabetically, taking 10 seconds.

@ECHO OFF

REM extract the |-delimited fields into variables %a %b %c %d and run the loop for each line
REM type 1=primary title (one per anime)
REM type 2=synonyms (multiple per anime)
REM type 3=shorttitles (multiple per anime)
REM type 4=official title (one per language)

REM Create empty file
Echo [titles.txt] file created empty
TYPE NUL>titles.txt

REM delims must be the last option: everything after "delims=" up to the closing quote is treated as a delimiter
for /F "tokens=1,2,3,4 skip=3 delims=|" %%a in (anime-titles.dat) do (
IF NOT "%%b"=="3" (
IF "%%c"=="x-jat" Echo %%a^|%%b^|%%c^|%%d >>titles.txt
IF "%%c"=="en" Echo %%a^|%%b^|%%c^|%%d >>titles.txt
)
)

Echo [titles.txt] file content processed
pause

Here is one that splits into primary titles, official titles, and synonyms.

We could even include it in the Python script as a variable if local file access is not possible...

@ECHO OFF

REM extract the |-delimited fields into variables %a %b %c %d and run the loop for each line
REM type 1=primary title    (one per anime, x-jat, 305KB)
REM type 2=synonyms         (multiple per anime, 166KB en, 80KB jp)
REM type 3=shorttitles      (multiple per anime)
REM type 4=official title   (one per language, mainly english, 4 in x-jat, 81KB)

REM Create empty files
Echo Files created empty
TYPE NUL>Primary.txt
TYPE NUL>Official.txt
TYPE NUL>Synonyms_jp.txt
TYPE NUL>Synonyms_en.txt

REM delims must be the last option: everything after "delims=" up to the closing quote is treated as a delimiter
for /F "tokens=1,2,3,4 skip=3 delims=|" %%a in (anime-titles.dat) do (
IF "%%b"=="1"   Echo %%a^|%%b^|%%c^|%%d >>Primary.txt
IF "%%b"=="4"  ( IF "%%c"=="x-jat" Echo %%a^|%%b^|%%c^|%%d >>Official.txt
                 IF "%%c"=="en"    Echo %%a^|%%b^|%%c^|%%d >>Official.txt
               )
IF "%%b"=="2"  ( IF "%%c"=="x-jat" Echo %%a^|%%b^|%%c^|%%d >>Synonyms_jp.txt
                 IF "%%c"=="en"    Echo %%a^|%%b^|%%c^|%%d >>Synonyms_en.txt
               )
)

Echo Files content processed
pause
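Since the output is destined for the Python agent anyway, the same filtering could be done in Python (a sketch assuming the `aid|type|lang|title` layout the REM comments above describe; `filter_titles` is a hypothetical helper):

```python
# Filter anime-titles.dat lines (aid|type|lang|title) the same way as the
# batch above: drop short titles (type 3) and keep only 'en' and 'x-jat'.
def filter_titles(lines):
    kept = []
    for line in lines:
        fields = line.rstrip('\n').split('|')
        if len(fields) != 4:
            continue  # skip comment/header lines
        aid, ttype, lang, title = fields
        if ttype != '3' and lang in ('en', 'x-jat'):
            kept.append((aid, ttype, lang, title))
    return kept

sample = [
    '# created: ...',
    '1|1|x-jat|Seikai no Monshou',
    '1|3|en|CotS',
    '1|4|cs|Hvezdny erb',
]
print(filter_titles(sample))
```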

Going through the docs at http://dev.plexapp.com/docs/ to see if there is an easy way of loading flat files...

How about including it in the source code as an array? I would be OK with a straight string comparison using primary, then official, then synonyms.

Looking at the processed AniDB data after the batch:

   . 305KB Primary: romaji titles (Amaenaide yo!!, Aa! Megami-sama! Sorezore no Tsubasa)

   . 081KB Official: English, less appealing to me (Ah My Buddha, Ah! My Goddess: Flights of Fancy)

   . 158KB Synonyms, English (Ah! My Buddha, Ah! My Goddess: Everyone Has Wings, Ah! My Goddess Everyone`s Wings)

   . 076KB Synonyms, romaji (Amaenaideyo!!, Ah!! My Goddess: Sorezore no Hane, Ah! My Goddess: Sorezore no Tsubasa)

 
I will try to understand the source so I can modify it to do a strict search in the order primary => official => synonyms. That way, by modifying the folder name you can reach 100% accuracy (if you rename your folders or named them properly), but I've never done Python so I'll take it slow... Using the "aid" number, we could match exactly the right tvdb entry with the XML mapping file and pull episode synopses, since the equivalence tables are included...

I hope that helps you move forward; the amount of data is certainly much smaller and could nearly be included in the Python script (will try that temporarily, but am not sure I can make it work)

Including it directly in the script means manually updating every week or so to make sure we have the latest titles. Additionally the search time will likely be pretty slow. If you want to add it in then go ahead, but I'm finding that with the fixes I mentioned I'm testing, I'm getting pretty decent match rates against TVDB. It's usually anidb matching that has issues, and in those rare cases it's easy to manually tell the agent which one to match (custom search with 'aid:')

If you want to add in some kind of offline local search thing or something then I'm happy for you to modify the source - I plan to get it into git before I do the next release so once I do so, feel free to make a fork. I think I'd still rather investigate methods for loading 'override' data / prefs from an XML in the series folder and using searches to do the matching, as this requires vastly less effort from me to maintain long-term :)

The workaround functionality works so beautifully :D

I am just frustrated that at times I search for a series, the name is correct, but it doesn't find it, yet with aid:xxxx it shows me the exact string I searched for; I want to fix that by any means... Putting it in the source would be dirty, I admit, but I can't find how to read flat files.


Example: http://anidb.net/a943 Yokoyama Mitsuteru Sangokushi, or "Romance of the Three Kingdoms". While the English title isn't bad, it's a synonym.

I corrected the code to consider the first synonym IF no title was found in the language considered.

Analysing the code, I was afraid of the main title being overwritten by the official one. For the series in question here, the synonym was the only English title available. After the correction it displayed the title correctly in the search window instead of nothing found, but the series display name is the main one, as expected after metadata download.

for title in titles:
  titleType = title.get('type')
  titleLang = title.get('lang')
  if titleType in ['main', 'official', 'syn']:
    if titleLang == "en" and englishTitle == "":
      englishTitle = title.text
    elif titleLang == "x-jat" and romajiTitle == "":
      romajiTitle = title.text
    elif titleLang == 'ja' and kanjiTitle == "":
      kanjiTitle = title.text
# assuming the list is in order: main, official, syn
# a main title in English doesn't exist, apart from:
# 4552|1|en|Whatever you searched for is NOT anime...
# 9899|1|en|Ninja Hattori-kun (2012)

I want no other files than the anime media folders themselves, but if I did, it would be a 680x1000 poster file with the aid as filename, and in the config file the preference for primary (romaji) or official (english)...

Loading an XML file with the series titles and the anidb/tvdb mapping would of course be ideal for the search, and is what I would do if feasible... On reflection, we could put it at the root of the web folder to read it locally, as the functions are already there to load it... Is that what you meant? Long term, that would allow picking data from both sources and avoiding the shortfalls of thetvdb without the dependence on your server, but I want to know the possible local performance impact, and to minimise the load on the server if possible...

I want the folder to match one of the entries (including synonyms), and if not, the folder has to be renamed.

That allows importing with no human interaction... Good luck with it, and I'll keep you posted if I make any other progress

title = re.sub(r'[()\.~+\-_!,:&]', ' ', title)
I think there is "\" twice. Should that be "/"?
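For what it's worth, those backslashes are escapes inside the character class (`\.` and `\-`), not a doubled character; the class replaces the literal characters `( ) . ~ + - _ ! , : &` and nothing else. A quick check:

```python
import re

# The escaped '.' and '-' inside [...] match those literal characters.
print(re.sub(r'[()\.~+\-_!,:&]', ' ', 'K-On!!'))       # 'K On  '
print(re.sub(r'[()\.~+\-_!,:&]', ' ', 'Steins;Gate'))  # ';' is not in the class, so unchanged
```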
 
I noticed a bug: when the search engine finds a Korean series with no title in the supported languages, it shows no results. The solution is to display the main title in that case. Also, I have added synonym titles to the detection, without naming them. I'm thinking about adding the shorts since they are unique, and some (Mai Otome Zwei) make a perfect match.

I have re-written this function to show the aid, the main title and the other supported languages; the code should withstand changes in the languages...
 
  def getMatchedTitles(self, titles):
    allTitles      = []
    officialTitles = ['' for i in range(len(LANGUAGE_PRIORITY)+1)]
    displayTitle   = ""

    # Fill official titles (and the main title at the end of the list),
    # add all (but short titles and duplicates) to allTitles
    for title in titles:
      if title.get('type') == 'main':
        officialTitles[len(LANGUAGE_PRIORITY)] = title.text
      if title.get('type') == 'official' and title.get('lang') in LANGUAGE_PRIORITY:
        officialTitles[LANGUAGE_PRIORITY.index(title.get('lang'))] = title.text
      if not title.get('type') == 'short' and not title.text in allTitles:
        allTitles.append(title.text)

    # Eliminate duplicates between official titles and the main title
    for x in range(0, len(LANGUAGE_PRIORITY)):
      for y in range(x+1, len(LANGUAGE_PRIORITY)+1):
        if officialTitles[x] == officialTitles[y]:
          officialTitles[x] = ''
          break

    # Create the display string
    displayTitle = aid.zfill(5) + ': ' + officialTitles[len(LANGUAGE_PRIORITY)]
    for language in LANGUAGE_PRIORITY:
      if not officialTitles[LANGUAGE_PRIORITY.index(language)] == '':
        displayTitle += '/' + officialTitles[LANGUAGE_PRIORITY.index(language)]

    # Log badly named folders (not main title, not official) in the logs, to do...
    if displayTitle == '':
      Log('Error in getMatchedTitles, empty title while processing aid: ' + aid)
      raise ValueError

    return displayTitle, allTitles

"global aid" need to be used before declaring aid for it to work 

Am trying to work on this plugin. Managed to do the local exact lookup, but the code is a bit crude.

Here is what I will try:

  . Local exact lookup in anime-titles.xml, then web lookup if no exact match (as I had some anime not found with the search URL despite a perfect title (Last Exile))

   . Load the anidb title XML locally instead of from a URL

   . Update automatically when outdated, OR update automatically every set period (2 weeks, since anime tend to be put up 2 weeks before airing)...

   . Use anime-list-full.xml to recover posters (step 1) and then episode info (step 2) if the anime id + thetvdb id are there; if only the anidb id is there, don't look for pics; if neither is there, use the current way of pulling pics?

     The file can be seen here, with explanations and series left to check: http://forum.xbmc.org/showthread.php?tid=64587&pid=1199733#pid1199733

      If more people use it, there would be more visibility and hopefully opportunities for it to be updated more...

I have added the AID in the search box results, to be able to check results better. Will release once stable, but I do not want to spoil Atomicstrawberry's thread... Please let me know what you think, Atomicstrawberry; I sent you a PM.

Edit: managed to do a strict search on all titles and, if that fails, split the title into keywords, search all that have at least one, and let the Levenshtein distance work wonders (after removing spaces and forbidden characters), so if you split a word when you shouldn't have, it won't matter...
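The keyword split plus distance scoring can be sketched like this (a plain-Python Levenshtein standing in for whatever scoring the fork actually uses; `best_match` is a hypothetical helper):

```python
def levenshtein(a, b):
    """Classic dynamic-programming edit distance."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (ca != cb)))   # substitution
        prev = cur
    return prev[-1]

def best_match(query, candidates):
    # Compare with spaces stripped, so a bad word split doesn't matter.
    key = query.replace(' ', '').lower()
    return min(candidates, key=lambda c: levenshtein(key, c.replace(' ', '').lower()))

print(best_match('last exile', ['Last Exile', 'Lost Universe', 'Ex-Driver']))
```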

I am forced, however, to load the anidb XML through a URL for the moment. Apparently you can use a file in the Resources directory with R(filename), but I failed to make it work...

> Am trying to work on this plugin. Managed to do the local exact lookup, but the code is a bit crude.
>
> Here is what I will try:
>
>   . Local exact lookup in anime-titles.xml, then web lookup if no exact match (as I had some anime not found with the search URL despite a perfect title (Last Exile))

I think this is actually a bug in my search, because if I search manually on the search engine site for Last Exile it works fine but I have encountered this exact issue (or more accurately, not matching Fam, The Silver Wing). I think I fixed it in my version but I've got some other stuff to clean up there (including an abortive attempt to map in cast information which doesn't seem to actually work properly in Plex or I'm not doing it right) before releasing.

>    . modify that to be local only and update automatically every set period (1 week, since anime tend to be put up 2 weeks before airing)...

Assuming you're using the Plex HTTP API to pull the file down locally, you can actually set it up to pull the XML file at most once a week, otherwise keep it cached. So no extra effort. If you check out my code you'll see that it caches the AniDB results for 24 hours for example. Any more frequent than that can actually get your client banned so please avoid doing that. :)
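Outside the Plex framework, the at-most-once-a-week refresh looks roughly like this (a generic sketch keyed on the cache file's age; `fetch_cached` and its arguments are hypothetical, not the Plex HTTP API, which does the equivalent internally when given a cache time):

```python
import os, time

WEEK = 7 * 24 * 3600

def fetch_cached(path, fetch, max_age=WEEK):
    """Return the cached file's contents, re-fetching only if older than max_age."""
    if not os.path.exists(path) or time.time() - os.path.getmtime(path) > max_age:
        data = fetch()  # e.g. download anime-titles.xml here
        with open(path, 'w') as f:
            f.write(data)
    with open(path) as f:
        return f.read()
```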

>    . use anime-list-full.xml to recover artwork (step 1) then episode info (step 2) if the anime id + thetvdb id are there; if only the anidb id is there, don't look for pics; if neither is there, use the current way of pulling pics.

I caution you to be careful when using TVDB episode info - the only thing you get there that isn't on AniDB is episode summaries, and because of the differences in the way TVDB indexes shows by season and so on there are many cases where you'll assign the wrong info to the wrong episode.

>      The file can be seen here, with explanations and series left to check: http://forum.xbmc.org/showthread.php?tid=64587&pid=1199733#pid1199733
>
>       If more people use it, there would be more visibility and hopefully opportunities for it to be updated more...
>
> I have added the AID in the search box results, to be able to check results better. Will release once stable, but I do not want to spoil Atomicstrawberry's thread... Please let me know what you think, Atomicstrawberry; I sent you a PM.

I still think that a better approach is to provide an extra 'info' file alongside the series folder and use that to hint info, rather than having to parse a huge XML file (worried about performance mainly). Perhaps once you've got your stuff working it can get incorporated in as an option.

Honestly it sounds more like you just want to use the TVDB search on its own. What are you actually wanting to get from AniDB? It might be you just basically want to extend the TVDB agent with a different search / match function that sets the title differently.

Hi, thanks for the answer. What I want is roughly the anidb info and tags, the covers from thetvdb, and the episode info from it [when the XML mapping file allows].

But foremost I want the biggest amount of recognised information, which for me starts with a correct title, which can be one of several available.

I want to use AniDB as it is the most complete and up to date, with nice tags, and the only one with an XML database available that I know of...

I have a working local search, and it solves most issues I have noticed with the values of recommended series. I limited the characters removed to those not allowed in filenames plus the tilde, for example, as [!-] are fairly common in anime names... Please check it; I will PM it to you now...

Hi Atomic Strawberry. I sent you a few PMs but got no reply... Here is the latest version of my fork, with working local lookup (both XML files go in Hama.bundle/Contents/Resources) and no internet query at all for the search part. There is also anidb-to-tvdb mapping for fanart, posters and banners, and it should support multiple languages through the language_order variable. Since it maps directly to thetvdb, there are many more posters showing up now... I have good folder recognition accuracy now, and to me the speed is good, as I've optimised the code a few times.

Please review it; you will notice I axed a few functions and tried to keep the code readable and simple...

I spent more than a full week coding, but as I am not fluent in Python, there are a few things I'm missing:

   . How to write to the Resources folder so I can update the XML files

   . How to decompress gz files
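On the gz question, for what it's worth, Python's standard library handles this directly (a minimal round-trip sketch; a real file like anime-titles.xml.gz would be read the same way with gzip.open):

```python
import gzip

# Round-trip demo: compress some bytes, then decompress them back.
compressed = gzip.compress(b'<animetitles/>')
print(gzip.decompress(compressed))  # b'<animetitles/>'
```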

###############################################################################################
# Plex HTTP Anidb Metadata Agent (HAMA) plugin v0.4 By Atomicstrawberry                       #
#     http://forums.plexapp.com/index.php/topic/66918-release-http-anidb-metadata-agent-hama/ #
# Forked by ZeroQI v4.0 2013-08-04                                                            #
###############################################################################################

  ### Functionality change
  #   . Local (anime-titles.xml) Title XML Title & Keyword parsing: search main title or any language in the language order.
  #   . Local (anime-list-full.xml) theTVDB lookup table through mapping XML, WAY more covers (no more title search)
  #   . Versatile search: can manage a colon replaced by a double tilde, incorrectly split words, or a missing dash, without renaming the folder
  #   . International language support, not only english.
  #   . Reduced the number of functions and improved overall source code clarity
  #
  ### Bug improvement
  #   . Fork: commas in "DefaultPrefs.json" prevented settings from showing in Settings > Tab: Plex Media Server > Sidebar: Agents > Tab: Movies/TV Shows > Tab HamaTV
  #   . Search page couldn't find some anime despite exact title (could be gotten around using leading backslash '\').
  #
  ### Roadmap
  #   . To Do: gettvdbId with Local xml lookup table
  #               - "studio" metadata update
  #               - Episode mapping translation
  #   . To Do: Local file lookup
  #               - "aidxxxxx.jpg" cover with AnimeID as title
  #               - "AniDB.ID"     file containing id
  #               - "AniDB.NFO"    containing XML from anidb; when refreshing, or when the option is enabled, save it locally?

  ### Flow of information and functions
  #
  #searchByName   (results, lang,   origTitle, year )
  #    xmlElementFromFile(ANIDB_ANIME_TITLES, ANIDB_ANIME_TITLES_URL)
  #  ###aid:xxxx
  #    getMainTitle      (titles)
  #  ###Exact match
  #    cleanse_title     (origTitle)
  #    getMainTitle      (titles)
  #  ###keyword match
  #    splitByChars      (origTitle, SPLIT_CHARS)
  #    cleanse_title     (origTitle)
  #    getScore          (a, b)
  #    getMainTitle      (titles)
  #
  #parseAnimeXml
  #    getResultFromAnidb(ANIDB_HTTP_API_URL + animeID)
  #    searchInTVDB      (self, metadata):
  #    getImagesFromTVDB (metadata, tvdbSeriesId)

### Preferences declared in "DefaultPrefs.json", accessible in Settings > Tab: Plex Media Server > Sidebar: Agents > Tab: Movies/TV Shows > Tab HamaTV ###
#Prefs['GetTvdbFanart'    ] "id": "GetTvdbFanart",     "label": "Attempt to fetch Fanart images from TVDB",               "type": "bool", "default": "true"
#Prefs['GetTvdbPosters'   ] "id": "GetTvdbPosters",    "label": "Attempt to fetch Poster images from TVDB",               "type": "bool", "default": "true"
#Prefs['GetTvdbBanners'   ] "id": "GetTvdbBanners",    "label": "Attempt to fetch series Banner images from TVDB",        "type": "bool", "default": "true"
#Prefs['PreferAnidbPoster'] "id": "PreferAnidbPoster", "label": "Prefer AniDB's poster over TVDB if TVDB lookup enabled", "type": "bool", "default": "false"
##Prefs['
##Prefs['

import os, os.path, re, time, datetime
# Functions used per module: os (read), re (sub, match), time (sleep), datetime (datetime).
# Unused modules: urllib, types, hashlib , unicodedata, Stack, utils

networkLock = Thread.Lock()

LANGUAGE_PRIORITY         = [ 'x-jat', 'en']
FILTER_CHARS              = "\\/:*?<>|~- "
SPLIT_CHARS               = ";,.~-" #Space is implied
HTTP.CacheTime            = CACHE_1HOUR * 26
SECONDS_BETWEEN_REQUESTS  = 2

### AniDB and TVDB URL and path variable definition,  ##################################################################################################################
ANIDB_DOMAIN              = 'anidb.net'
ANIDB_HTTP_API_PORT       = '9001'
ANIDB_HTTP_API_BASE_URL   = 'http://api.%s:%s' % (ANIDB_DOMAIN, ANIDB_HTTP_API_PORT)
ANIDB_HTTP_API_URL        = '%s/httpapi?request=anime&client=hama&clientver=1&protover=1&aid=' % (ANIDB_HTTP_API_BASE_URL)
ANIDB_SEARCH_URL          = 'http://anidbsearch.pinkubentobox.com/index.php?task=search&query=' #Similar to http://anisearch.outrance.pl/ ?
ANIDB_PIC_BASE_URL        = 'http://img7.anidb.net/pics/anime/'
ANIDB_HTTP_CLIENT_NAME    = 'hama'
ANIDB_HTTP_CLIENT_VER     = '1'

ANIDB_ANIME_TITLES_URL    = 'http://127.0.0.1/anime-titles.xml' #'http://anidb.net/api/anime-titles.xml.gz'
ANIDB_ANIME_TITLES        = 'anime-titles.xml'
ANIDB_MAPPING             = 'anime-list-full.xml'
ANIDB_MAPPING_URL         = 'http://127.0.0.1/anime-list-full.xml' #'https://raw.github.com/ScudLee/anime-lists/master/anime-list-full.xml'
ANIDB_MAPPING_URL2        = 'https://raw.github.com/ScudLee/anime-lists/master/anime-list.xml'

TVDB_API_KEY              = 'A27AD9BE0DA63333'   #http://thetvdb.com/?tab=apiregister
TVDB_SEARCH_URL           = 'http://thetvdb.com/api/GetSeries.php?seriesname=%s&language=en'
TVDB_BANNERS_URL          = 'http://thetvdb.com/api/%s/series/%%s/banners.xml' % TVDB_API_KEY
TVDB_IMAGES_URL           = 'http://thetvdb.com/banners/%s'

### List of AniDB category names useful as genres. The 1st variable marks 18+ categories; the 2nd will actually cause a flag to appear in Plex ######################
RESTRICTED_GENRE_NAMES    = [ '18 Restricted', 'Pornography' ]
RESTRICTED_CONTENT_RATING = "NC-17"
GENRE_NAMES               = [

  ### Audience categories - all useful but not used often ############################################################################################################
  'Josei', 'Kodomo', 'Mina', 'Seinen', 'Shoujo', 'Shounen',
  
  ### Elements - many useful #########################################################################################################################################
  'Action', 'Martial Arts', 'Swordplay', 'Adventure', 'Angst', 'Anthropomorphism', 'Comedy', 'Parody', 'Slapstick', 'Super Deformed', 'Detective', 'Ecchi', 'Fantasy',
  'Contemporary Fantasy', 'Dark Fantasy', 'Ghost', 'High Fantasy', 'Magic', 'Vampire', 'Zombie', 'Harem', 'Reverse Harem', 'Henshin', 'Horror', 'Incest',
  'Mahou Shoujo', 'Pornography', 'Yaoi', 'Yuri', 'Romance', 'Love Polygon', 'Shoujo Ai', 'Shounen Ai', 'Sci-Fi', 'Alien', 'Mecha', 'Space Travel', 'Time Travel', 
  'Thriller', 'Western',                                             
      
  ### Fetishes. Leaving out most porn genres #########################################################################################################################
  'Futanari', 'Lolicon', 'Shotacon', 'Tentacle', 'Trap', 'Reverse Trap',
  
  ### Original Work - mainly useful ##################################################################################################################################
  'Game', 'Action Game', 'Dating Sim - Visual Novel', 'Erotic Game', 'RPG', 'Manga', '4-koma', 'Movie', 'Novel',
  
  ### Setting - most of the places aren't genres, some Time stuff is useful ##########################################################################################
  'Fantasy World', 'Parallel Universe', 'Virtual Reality', 'Hell', 'Space', 'Mars', 'Space Colony', 'Shipboard', 'Alternative Universe', 'Past', 'Present', 'Future',
  'Historical', '1920s', 'Bakumatsu - Meiji Period', 'Edo Period', 'Heian Period', 'Sengoku Period', 'Victorian Period', 'World War I', 'World War II',
  'Alternative Present',
  
  ### Themes - many useful ###########################################################################################################################################
  'Anti-War', 'Art', 'Music', 'Band', 'Idol', 'Photography', 'Christmas', 'Coming of Age', 'Conspiracy', 'Cooking', 'Cosplay', 'Cyberpunk', 'Daily Life', 'Earthquake',
  'Post-War', 'Post-apocalypse', 'War', 'Dystopia', 'Friendship', 'Law and Order', 'Cops', 'Special Squads', 'Military', 'Airforce', 'Feudal Warfare', 'Navy',
  'Politics', 'Proxy Battles', 'Racism', 'Religion', 'School Life', 'All-boys School', 'All-girls School', 'Art School', 'Clubs', 'College', 'Delinquents', 
  'Elementary School', 'High School', 'School Dormitory', 'Student Council', 'Transfer Student', 'Sports', 'Acrobatics', 'Archery', 'Badminton', 'Baseball', 
  'Basketball', 'Board Games', 'Chess', 'Go', 'Mahjong', 'Shougi', 'Combat', 'Boxing', 'Judo', 'Kendo', 'Muay Thai', 'Wrestling', 'Cycling', 'Dodgeball', 'Fishing',
  'Football', 'Golf', 'Gymnastics', 'Horse Riding', 'Ice Skating', 'Inline Skating', 'Motorsport', 'Formula Racing', 'Street Racing', 'Rugby', 'Swimming', 'Tennis',
  'Track and Field', 'Volleyball', 'Steampunk', 'Summer Festival', 'Tragedy', 'Underworld', 'Assassin', 'Bounty Hunter', 'Mafia', 'Yakuza', 'Pirate', 'Terrorist',
  'Thief'
]

### Common words which add noise to searches, filtered out of the keyword list below ####################################################################################
FILTER_SEARCH_WORDS = [                                                                                          # Lowercase only
    'a',  'of', 'an', 'the', 'motion', 'picture', 'special', 'oav', 'tv', 'eternal', 'final', 'last',            # En words
    'to', 'wa', 'ga', 'no',                                                                                      # Jp particles
    'le', 'la', 'un', 'les', 'nos', 'vos', 'des', 'ses'                                                          # Fr particles
]

### main metadata agent ################################################################################################################################################
class HamaCommonAgent:

  ### Local XML AniDB lookup ###
  def searchByName(self, results, lang, origTitle, year=""):
  
    Log.Debug("SearchByName (%s,%s,%s,%s)" % (results, lang, origTitle, str(year) ))
    tree = self.xmlElementFromFile(ANIDB_ANIME_TITLES, ANIDB_ANIME_TITLES_URL)                          #Get the xml title file into a tree, #from lxml import etree #doc = etree.parse('content-sample.xml')
    Log.Debug("SearchByName - %s loaded" % ANIDB_ANIME_TITLES)                                          #Check logs to see loading time
    
    ### aid:xxxxx Fetch the exact series XML from AniDB.net (caching it) using the anime-id ###
    if origTitle.startswith('aid:'):
      animeId              = str(origTitle[4:])                                                         #Get string after "aid:" which is 4 characters
      Log.Debug( "SearchByName - aid: %s" % animeId)
      langTitle, mainTitle = self.getMainTitle( tree.xpath("/animetitles/anime[@aid='%s']/*" % animeId) )              #extract titles from the Anime XML element tree directly #1: langTitle, mainTitle = self.getMainTitle(self.getResultFromAnidb( ANIDB_HTTP_API_URL + animeId ).xpath('/anime/titles/title'))     
      Log.Debug( "SearchByName - aid: %s %s (%s)" % (animeId, langTitle, mainTitle) )
      results.Append(MetadataSearchResult(id=animeId, name=langTitle, year=None, lang=Locale.Language.English, score=100))
      return 
    
    ### Local exact search ###
    Log('SearchByName - XML exact search - Trying to match: ' + origTitle)
    cleansedTitle = self.cleanse_title (origTitle)
    elements      = list(tree.iterdescendants())    #from lxml import etree; tree = etree.parse(ANIDB_ANIME_TITLES) folder missing?; #To-Do: Save to local (media OR cache-type folder) XML???  
    for title in elements:
      if title.get('aid'):                                                                    #is an anime tag (not title tag) in that case ###
        aid = title.get('aid')
      else:
        if title.get('{http://www.w3.org/XML/1998/namespace}lang') in LANGUAGE_PRIORITY or title.get('type')=='main':
          sample = self.cleanse_title (title.text)
          if cleansedTitle == sample :                                                        #Should i add "origTitle.lower()==title.text.lower() or" ??
            Log.Debug("SearchByName - Local exact search - aid: %s %s" % (aid,title.text))
            langTitle, mainTitle = self.getMainTitle(title.getparent())                       #Title according language order selection instead of main title
            results.Append(MetadataSearchResult(id=aid, name=langTitle, year=None, lang=Locale.Language.English, score=100))
            return
    
    ### local keyword search ###
    matchedTitles  = [ ]
    words          = [ ]
    temp           = ""
    for word in self.splitByChars(origTitle, SPLIT_CHARS):
      word = self.cleanse_title (word)
      if not word=="" and word not in FILTER_SEARCH_WORDS:                                    #Special characters scrubbed result in empty word matching all
        words.append (word)
        temp += "'%s', " % word
    Log.Debug("SearchByName - XML Keyword search -  Trying to match: '%s' with Keywords: %s " % (origTitle, temp) )

    if len(words)==0 or len( self.splitByChars(origTitle, SPLIT_CHARS) )==1 :                 #single-word title, already tested by the exact search above
      return None
    
    for title in elements:
      if title.get('aid'):                                                                    #is an anime tag in that case
        aid = title.get('aid')
      else: 
        if title.get('{http://www.w3.org/XML/1998/namespace}lang') in LANGUAGE_PRIORITY or title.get('type')=='main':
          sample = self.cleanse_title (title.text)
          for word in words:
            if word in sample:
              Log.Debug("SearchByName - XML Keyword search - keyword '%s' matched '%s'" % (word, sample) )
              index  = len(matchedTitles)-1
              if index >=0 and matchedTitles[index][0] == aid:                                #same series id as the previous match
                if title.get('type') == 'main':
                  matchedTitles[index][1] = aid.zfill(5) + ' ' + title.text                   #use main title as display title since it passed the match as well
                if not title.text in matchedTitles[index][2]:                                 #if title not already added
                  matchedTitles[index][2].append(title.text)                                  #append title to allTitles list
              else:
                matchedTitles.append([aid, aid.zfill(5) + ' ' + title.text, [title.text] ])   #new insertion (not necessarily main title)
    
    if len(matchedTitles)==0:
      return None

    ### calculate scores + Build results ###
    for match in matchedTitles:
      scores = []
      for title in match[2]:     #Calculate distance without space and characters not allowed for files (added tilde when used WRONGLY as separator like MIKE)
        scores.append(self.getScore( self.cleanse_title(title), cleansedTitle ))
        Log.Debug("SearchByName - Score: %d for %s %s" % (scores[-1], match[0], title) )
      bestScore = max(scores)
      results.Append(MetadataSearchResult(id=match[0], name=match[1], year=None, lang=Locale.Language.English, score=bestScore))
    results.Sort('score', descending=True)
	
  ### Import XML file from 'Resources' folder into an XML element ######################################################################################################
  def xmlElementFromFile (self, filename, url):
    if False:                                                                                 #To-Do: re-fetch from 'url' when the bundled copy is out of date
      string = HTTP.Request(url, cacheTime=CACHE_1HOUR * 24 * 7).content
    else:
      Log.Debug('xmlElementFromFile (%s, %s) %s' % (filename, url, R(filename) ) )
      string = Resource.Load(filename) #no option to write
    return XML.ElementFromString(string) 

  ### Get the Levenshtein distance score in percent between two strings ################################################################################################
  def getScore(self, a, b):
    return int(100 - (100 * float(Util.LevenshteinDistance(a,b)) / float(max(len(a),len(b))) ))
    #To-Do: LongestCommonSubstring(first, second) use that to modify results???

  ### Strip FILTER_CHARS from a title and lowercase it for comparisons #################################################################################################
  def cleanse_title(self, title):
    return title.translate(None, FILTER_CHARS).lower()

  ### Split a string on a list of separator chars ######################################################################################################################
  def splitByChars(self, string, separators=SPLIT_CHARS):
    for separator in separators:
      string = string.replace(separator, ' ')
    return string.split()

  ### Pull down the XML for a given anime ID. Don't worry about caching, the HTTP system does that #####################################################################
  def getResultFromAnidb(self, url):
    global lastRequestTime
    if 'lastRequestTime' not in globals():                          #initialise once only, else the 2s throttle below never engages across calls
      lastRequestTime = datetime.datetime.utcfromtimestamp(0)
    try:
      networkLock.acquire()
      tries = 2
      while tries:
        delta = datetime.datetime.utcnow() - lastRequestTime;
        if delta.seconds < SECONDS_BETWEEN_REQUESTS:                  #according to their documentation, requests closer than 2 secs apart will get you banned. :(
          time.sleep(SECONDS_BETWEEN_REQUESTS - delta.seconds)
        result = None
        try:
          lastRequestTime = datetime.datetime.utcnow()
          result          = HTTP.Request(url, headers={'Accept-Encoding':''}, timeout=60)
        except Exception, e:
          Log("getResultFromAnidb(" + url + ") - Error: " + str(e))   #not every exception carries a .code attribute
          return None;
        if result != None:
          return XML.ElementFromString(result);
        tries -= 1
  
    finally:
      networkLock.release()
      
    return None;

  ### extract the series / movie title #################################################################################################################################
  def getMainTitle(self, titles):
    
    Log.Debug ("getMainTitle (%d titles)" % len(titles) )
    officialTitles = ["" for i in range(len(LANGUAGE_PRIORITY)+2)] #title order as per LANGUAGE_PRIORITY, then the chosen main title (index len(LANGUAGE_PRIORITY)), then the original title (index len(LANGUAGE_PRIORITY)+1)
    for title in titles:
      titleLang = title.get('{http://www.w3.org/XML/1998/namespace}lang')   # some weirdness caused by the fact the XML uses an 'xml:lang' attrib here #
      titleType = title.get('type')
      #Log.Debug("getMainTitle - Type: " + titleType + " Lang: " + titleLang + " Title: " + title.text)
      
      if title.get('type') == 'main':
        officialTitles [len(LANGUAGE_PRIORITY)+1] = title.text
        if titleLang in LANGUAGE_PRIORITY:
          officialTitles [LANGUAGE_PRIORITY.index( titleLang )] = title.text
      
      if title.get('type') == 'official' and titleLang in LANGUAGE_PRIORITY:
        officialTitles [LANGUAGE_PRIORITY.index( titleLang )] = title.text
    for language in LANGUAGE_PRIORITY:
      if not officialTitles [LANGUAGE_PRIORITY.index(language)] == '' :
        officialTitles [len(LANGUAGE_PRIORITY)] = officialTitles [LANGUAGE_PRIORITY.index(language)]
        break
      
    if officialTitles [len(LANGUAGE_PRIORITY)] == '':
      if not officialTitles [ len(LANGUAGE_PRIORITY)+1 ]:
        Log.Debug("getMainTitle - Probably didn't gather /anime/title/title")
        raise ValueError
      else:
        officialTitles [ len(LANGUAGE_PRIORITY) ] = officialTitles [ len(LANGUAGE_PRIORITY)+1 ] #default to main title 
    Log.Debug('getMainTitle - Return main: ' + officialTitles[len(LANGUAGE_PRIORITY)] + ' OrigTitle: ' + officialTitles[len(LANGUAGE_PRIORITY)+1] )
    return officialTitles[len(LANGUAGE_PRIORITY)], officialTitles[len(LANGUAGE_PRIORITY)+1]

  ### Parse the AniDB anime XML ########################################################################################################################################
  def parseAnimeXml(self, metadata, media, force, movie):
    
    Log.Debug("parseAnimeXml (%s, %s, %s, %s)" % (metadata, media, force, movie) )
    anime          = self.getResultFromAnidb( ANIDB_HTTP_API_URL + metadata.id ).xpath('/anime')[0] ### Pull down the XML for a given anime ID. Don't worry about caching, the HTTP system does that ###
    getElementText = lambda el, xp : el.xpath(xp)[0].text if el.xpath(xp) and el.xpath(xp)[0].text else ""   # helper: empty string when the node is absent or empty
    
    ### Posters
    preferAnidbPoster = Prefs['PreferAnidbPoster'];
    picture = getElementText(anime, 'picture')
    if picture != "":
      posterUrl                   = ANIDB_PIC_BASE_URL + picture;
      metadata.posters[posterUrl] = Proxy.Media(HTTP.Request(posterUrl).content, sort_order=(1 if preferAnidbPoster else 99))
      Log("parseAnimeXml - Posters - Getting AniDB picture from url: %s", posterUrl)
    
    ### TheTVDB lookup for posters, fanarts, banners and episode info ###
    getPosters = Prefs['GetTvdbPosters'];
    getFanart  = Prefs['GetTvdbFanart' ];
    getBanners = Prefs['GetTvdbBanners'];
    if (getPosters or getFanart or getBanners) and not movie:                           # TVDB doesn't index 18+ anime or movies; note restrictedContent isn't computed until the genre section below (fixed in the corrected snippet further down)
      tvdbSeriesId = self.getTvdbIdFromAnimeId(metadata.id)                               # Search for the TVDB ID
      Log.Debug("parseAnimeXml - TVDB - Series Id: %s" % tvdbSeriesId);
      self.getImagesFromTVDB(metadata, tvdbSeriesId)                                      # getImagesFromTVDB

    ### Start date ###
    startDate      = getElementText(anime, 'startdate')                                     # get start date if any
    Log.Debug("Start Date: %s" % startDate)
    if startDate != "":
      metadata.originally_available_at = Datetime.ParseDate(startDate).date()
      if movie:
        metadata.year = metadata.originally_available_at.year
    else:
      metadata.originally_available_at = None

    ### Title ###
    try:
      main, orig = self.getMainTitle(anime.xpath('/anime/titles/title'))
      metadata.title          = main
      #metadata.sort_title     = orig   #http://forums.plexapp.com/index.php/topic/25584-setting-metadata-original-title-and-sort-title-still-not-possible/
    except:
      Log.Debug("parseAnimeXml - No title")
      raise ValueError
    Log.Debug("parseAnimeXml - Chosen title: '%s' original title: '%s'" % (metadata.title, metadata.original_title))

    ### Summary ###
    try:                                                                     
      metadata.summary = re.sub(r'http://anidb\.net/[a-z]{2}[0-9]+ \[(.+?)\]', r'\1', getElementText(anime, 'description')) #- Remove wiki-style links to staff, characters etc
    except Exception, e:
      Log.Debug("Exception: " + str(e))
      pass
    
    ### Ratings
    rating = getElementText(anime, 'ratings/permanent')                       
    if rating:
      metadata.rating = float(rating)
     
    ### Category -> Genre mapping ###
    temp              = ""
    genres            = {}
    restrictedContent = False
    for category in anime.xpath('categories/category'):                         
      weight = category.get('weight')
      name = getElementText(category, 'name')
      if name in GENRE_NAMES:
        genres[name] = int(weight)
        temp += name + " "
      if name in RESTRICTED_GENRE_NAMES:
        Log.Debug(name + " in restricted genres, marking as 18+")
        restrictedContent       = True
        temp                   += name + "(18+) "
    sortedGenres = sorted(genres.items(), key=lambda x: x[1],  reverse=True)   # sort genre list
    if len(sortedGenres) > 5:                                                  # take top 5 only [To Do: Ask why]
      del sortedGenres[5:]
    
    temp = ""
    metadata.genres.clear()
    for genre in sortedGenres:
      metadata.genres.add(genre[0])
      temp += "%s (%s) " % (genre[0], str(genre[1])) 
    Log.Debug("parseAnimeXml - Categories - Genres (Weight): " + temp)
      
    if restrictedContent:
      metadata.content_rating = RESTRICTED_CONTENT_RATING
    
    ### Creator data. Aside from the animation studio, none of this maps to Series entries, so save it for episodes ###
    writers   = []
    directors = []
    producers = []
    temp      = ""
    for creator in anime.xpath('creators/name'):
      
      nameType = creator.get('type')
      if nameType == "Animation Work":                       # Studio
        metadata.studio = creator.text;
        temp           += "Studio: %s, " % creator.text
        
      if "Direction" in nameType:                            # Direction, Animation Direction, Chief Animation Direction, Chief Direction
        directors.append(creator.text)
        temp           += "%s is director, " % creator.text
        
      if nameType == "Series Composition":                   # Series Composition is basically a producer role 
        producers.append(creator.text)
        temp           += "%s is producer, " % creator.text
        
      if nameType == "Original Work":                        # original creator of whatever this was adapted from might be an artist for manga, but 'writers' is the best we can map to
        writers.append(creator.text)
        temp           += "%s is writer, " % creator.text
        
      if "Script" in nameType or "Screenplay" in nameType:   #Script writer
        writers.append(creator.text)
        temp           += "%s is writer" % creator.text
    Log.Debug("parseAnimeXml - Categories - Creator data: " + temp)

    if movie: ### Movies specific - Movies have these on the top level ###
      metadata.original_title = orig  #http://forums.plexapp.com/index.php/topic/25584-setting-metadata-original-title-and-sort-title-still-not-possible/
      
      metadata.writers.clear()
      for writer in writers:
        metadata.writers.add(writer)

      metadata.producers.clear()
      for producer in producers:
        metadata.producers.add(producer)

      metadata.directors.clear()
      for director in directors:
        metadata.directors.add(director)
    else: ### Season Specific ###
      for season in media.seasons:
        Log.Debug("parseAnimeXml - TVDB - Setting season poster")
        for posterUrl in metadata.posters.keys():
          metadata.seasons[season].posters[posterUrl] = Proxy.Media(HTTP.Request(posterUrl).content, sort_order=(1 if preferAnidbPoster else 99))

      ### Episodes Specific ###
      numEpisodes   = 0;
      totalDuration = 0;
      for episode in anime.xpath('episodes/episode'):
        epNum     = episode.xpath('epno')[0]
        epNumType = epNum.get('type')
        epNumVal  = epNum.text
        Log.Debug("Episode type: " + epNumType + " Num: " + epNumVal)
        if epNumType == "1":           # normal episode
          season = 1
        elif epNumType == "2":         # specials are prefixed with S
          season = 0
          if epNumVal[0] == 'S':
            epNumVal = epNumVal[1:]       
          #TODO: map OPs, CMs etc to season 2+?
          #########################################
          #Type         # Episode number          #
          #########################################
          #OPs/EDs  "C" # Season 0 Episodes 101-199
          #Trailers "T" # Season 0 Episodes 201-299
          #Parodies "P" # Season 0 Episodes 301-399
          #Other    "O" # Season 0 Episodes 401-499
          #########################################
        else:
          Log.Debug("Skipping episode as it is not a special or a normal episode.")
          continue

        if not (season in media.seasons and epNumVal in media.seasons[season].episodes):
          Log("parseAnimeXml - Skipping episode " + str(epNumVal) + " in season " + str(season) + " as it is not in the media collection.")
          continue
        episodeObj = metadata.seasons[season].episodes[epNumVal]
      
        ### Writers, etc... ###
        Log.Debug("Setting writers etc.")
        episodeObj.writers.clear()
        for writer in writers:
          episodeObj.writers.add(writer)
        episodeObj.producers.clear()
        for producer in producers:
          episodeObj.producers.add(producer)
        episodeObj.directors.clear()
        for director in directors:
          episodeObj.directors.add(director)
        try:
          rating = getElementText(episode, 'rating')
          if rating != "":
            episodeObj.rating = float(rating)
        except:
          pass # rating is optional
      
        ### Duration ###
        duration = getElementText(episode, 'length')
        if duration != "":
          episodeObj.duration = int(duration) * 60 * 1000  # AniDB stores in minutes, Plex in millisecs
          #Log.Debug("Duration: %s" % str(episodeObj.duration))
          if season == 1:
            numEpisodes += 1
            totalDuration += episodeObj.duration
          
        ### turn the YYYY-MM-DD airdate in each episode into a Date ###
        airdate = getElementText(episode, 'airdate')
        if airdate != "":
          match = re.match("([1-2][0-9]{3})-([0-1][0-9])-([0-3][0-9])", airdate)
          if match:
            try:
              episodeObj.originally_available_at = datetime.date(int(match.group(1)), int(match.group(2)), int(match.group(3)))
              #Log.Debug("Airdate: " + str(episodeObj.originally_available_at))
            except ValueError, e:
              Log.Debug("parseAirDate - Date out of range: " + str(e))
            
        ### Get the correct title ###
        episodeObj.title,temp = self.getMainTitle (episode.xpath('title[@lang]'))         #xpath is relative to the episode node, so 'title[@lang]', not 'episode/title[@lang]'
        if not episodeObj.title :
          episodeObj.title  = epNum.text # deliberately using epNum.text as for specials it's still prefixed with S
          #Log.Debug("Episode Title: %s" % episodeObj.title)

      ### Final post-episode cleanup: store the average episode duration ###
      if numEpisodes:
        metadata.duration = totalDuration / numEpisodes

  ### Get the tvdbId from the AnimeId #######################################################################################################################
  def getTvdbIdFromAnimeId(self, animeId ):
  
    tree = self.xmlElementFromFile(ANIDB_MAPPING, ANIDB_MAPPING_URL)             #treeRoot = XML.ElementFromURL(ANIDB_MAPPING_URL, cacheTime=CACHE_1HOUR * 24 * 7);
    for anime in tree.iterchildren('anime'):                                     #for anime in matches.xpath('/anime-list/anime')
      if animeId == anime.get("anidbid"):
        Log.Debug('gettvdbId('+animeId+') Found tvdbId: ' + anime.get("tvdbid"))
        # defaulttvdbseason = anime.get("defaulttvdbseason")
        #  mapping-list ;1-3;2-4;
        # studio
        return anime.get("tvdbid") # return anime.get("tvdbid"), Studio , [episodes mapping]
    Log.Debug('gettvdbId('+animeId+') found no corresponding tvdbId')
    return None
 
  ### Attempt to get the TVDB's image data #############################################################################################################################
  def getImagesFromTVDB(self, metadata, tvdbSeriesId):

    if tvdbSeriesId is None:
      Log("No TVDB series id")
      return

    # check prefs
    getPosters        = Prefs['GetTvdbPosters'   ];
    getFanart         = Prefs['GetTvdbFanart'    ];
    getBanners        = Prefs['GetTvdbBanners'   ];
    preferAnidbPoster = Prefs['PreferAnidbPoster'];
    Log.Debug('getImagesFromTVDB - Prefs - Get posters/Fanart/Banners: %s/%s/%s  - Prefer AniDB: %s' %(getPosters,getFanart,getBanners,preferAnidbPoster))

    # don't bother with the full zip, all we need is the banners 
    bannersXml = XML.ElementFromURL(TVDB_BANNERS_URL % tvdbSeriesId, cacheTime=(CACHE_1HOUR * 24))
    num = 0
    for banner in bannersXml.xpath('Banner'):
      num += 1
      bannerType = banner.xpath('BannerType')[0].text
      bannerPath = banner.xpath('BannerPath')[0].text
      bannerLang = banner.xpath('Language'  )[0].text

      try:
        bannerType2 = banner.xpath('BannerType2')[0].text
      except:
        bannerType2 = None

      proxyFunc = Proxy.Preview   # thumbnail version is optional for some banners, we always want to use preview if we can!
      try:
        bannerThumb = banner.xpath('ThumbnailPath')[0].text
      except:   # no thumbnail set so fall back to downloading the full image        
        Log.Debug('getImagesFromTVDB - Error getting thumbnail, falling back to full file.')
        bannerThumb = bannerPath 
        proxyFunc   = Proxy.Media

      bannerRealUrl  = TVDB_IMAGES_URL % bannerPath
      bannerThumbUrl = TVDB_IMAGES_URL % bannerThumb

      if bannerLang != 'en':   # this is an english-only metadata agent so skip non-english images.
        continue

      if bannerType == 'fanart' and getFanart and bannerRealUrl not in metadata.art: # download the images

        try: 
          metadata.art[bannerRealUrl] = proxyFunc(HTTP.Request(bannerThumbUrl).content, sort_order=num)
          Log.Debug('getImagesFromTVDB - Got fanart %s' % bannerRealUrl)
        except:
          pass

      elif bannerType == 'poster' and getPosters and bannerRealUrl not in metadata.posters:

        try:    # note: sort order +1 because anidb's poster is default.
          metadata.posters[bannerRealUrl] = proxyFunc(HTTP.Request(bannerThumbUrl).content, sort_order=(num + 1 if preferAnidbPoster else num))
          Log.Debug('getImagesFromTVDB - Got poster %s' % bannerRealUrl)
        except:
          pass

      elif bannerType == 'series' and getBanners and bannerRealUrl not in metadata.banners:

        try:
          metadata.banners[bannerRealUrl] = proxyFunc(HTTP.Request(bannerThumbUrl).content, sort_order=num)
          Log.Debug('getImagesFromTVDB - Got banner %s' % bannerRealUrl)
        except:
          pass

      elif bannerType == 'season' and bannerType2 is not None:
        # because we're using AniDB as our base, every show has one 'season' even if it's a sequel to another show,
        #  so map the season banners from TVDB into the normal poster list and let the user figure it out if they want to tweak!
        if bannerType2 == 'season' and getPosters and bannerRealUrl not in metadata.posters:
          metadata.posters[bannerRealUrl] = proxyFunc(HTTP.Request(bannerThumbUrl).content, sort_order=(num + 1 if preferAnidbPoster else num))
          Log.Debug('Got series poster %s' % bannerRealUrl)
        elif bannerType2 == 'seasonwide' and getBanners and bannerRealUrl not in metadata.banners:
          metadata.banners[bannerRealUrl] = proxyFunc(HTTP.Request(bannerThumbUrl).content, sort_order=num)
      else:
        Log.Debug('getImagesFromTVDB - Skipping banner %s (%s, lang %s) because already have it, or type invalid' % (bannerPath, bannerType, bannerLang))

##########################################################################################################################
# TV-specific agent 
##########################################################################################################################
class HamaTVAgent(Agent.TV_Shows, HamaCommonAgent):
  name             = 'HamaTV'
  languages        = [ Locale.Language.English, ]
  accepts_from     = ['com.plexapp.agents.localmedia', 'com.plexapp.agents.opensubtitles']
  primary_provider = True
  fallback_agent   = False
  contributes_to   = None
  
  def search(self, results, media, lang, manual):
    self.searchByName(results, lang, media.show, media.year)

  def update(self, metadata, media, lang, force):
    self.parseAnimeXml(metadata, media, force, False)

##########################################################################################################################
# Movie-specific agent
##########################################################################################################################
class HamaMovieAgent(Agent.Movies, HamaCommonAgent):
  name             = 'HamaMovies'
  languages        = [ Locale.Language.English, ]
  accepts_from     = ['com.plexapp.agents.localmedia', 'com.plexapp.agents.opensubtitles']
  primary_provider = True
  fallback_agent   = False
  contributes_to   = None

  def search(self, results, media, lang, manual):
    self.searchByName(results, lang, media.name, media.year)
 
  def update(self, metadata, media, lang, force):
    self.parseAnimeXml(metadata, media, force, True)
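
For anyone who wants to poke at the matching logic outside of Plex, the core of searchByName (cleanse the title, split it into keywords, score candidates by Levenshtein distance) can be sketched in plain standalone Python. The FILTER_CHARS/SPLIT_CHARS values below are simplified stand-ins, not the agent's real constants:

```python
# Standalone sketch of the search scoring above; the constant values here
# are simplified stand-ins, not the agent's real FILTER_CHARS/SPLIT_CHARS.
FILTER_CHARS = "%&*"          # characters stripped before comparison (illustrative)
SPLIT_CHARS  = ";:-_[](){}+"  # characters treated as word separators (illustrative)
FILTER_SEARCH_WORDS = {'a', 'of', 'an', 'the', 'no', 'wa'}

def cleanse_title(title):
    # strip filtered characters and lowercase, like cleanse_title() above
    return ''.join(c for c in title if c not in FILTER_CHARS).lower()

def split_by_chars(string, separators=SPLIT_CHARS):
    # replace every separator with a space, then split on whitespace
    for sep in separators:
        string = string.replace(sep, ' ')
    return string.split()

def levenshtein(a, b):
    # classic dynamic-programming edit distance
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j-1] + 1, prev[j-1] + (ca != cb)))
        prev = cur
    return prev[-1]

def score(a, b):
    # percentage similarity, mirroring getScore() above
    return int(100 - 100.0 * levenshtein(a, b) / max(len(a), len(b)))

keywords = [w for w in (cleanse_title(w) for w in split_by_chars("The Vision of Escaflowne"))
            if w and w not in FILTER_SEARCH_WORDS]
print(keywords)                                           # ['vision', 'escaflowne']
print(score(cleanse_title("Escaflowne"), "escaflowne"))   # 100
```

Handy for checking why a given filename does or doesn't match a title in anime-titles.xml.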

Hi again. ScudLee did a catalog mapping XML to group anime: https://github.com/ScudLee/anime-lists/raw/master/anime-movieset-list.xml. Working on that to have the catalog entries filled automatically!

The search is now entirely local on my fork as I find the performance pretty good, and the anidbid-to-tvdbid mapping is already functional.
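
The mapping lookup itself is nothing fancy and can be tried standalone with xml.etree. The XML below is a hypothetical fragment shaped like ScudLee's anime-list file, showing only the two attributes the agent actually reads (anidbid, tvdbid); the id values are made up:

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment in the shape of ScudLee's anime-list mapping file;
# the attribute values here are illustrative, not real mappings.
MAPPING_XML = """
<anime-list>
  <anime anidbid="1" tvdbid="70973" defaulttvdbseason="1"/>
  <anime anidbid="23" tvdbid="70801" defaulttvdbseason="1"/>
</anime-list>
"""

def tvdb_id_from_anime_id(tree_root, anime_id):
    # same walk as getTvdbIdFromAnimeId() above: first matching anidbid wins
    for anime in tree_root.findall('anime'):
        if anime.get('anidbid') == anime_id:
            return anime.get('tvdbid')
    return None

root = ET.fromstring(MAPPING_XML)
print(tvdb_id_from_anime_id(root, '23'))   # 70801
print(tvdb_id_from_anime_id(root, '999'))  # None
```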

If anybody knows how to write to the Resources folder, I'm all ears. I'm using "string = Resource.Load(filename)" and it works brilliantly, but there is no write equivalent, and I struggle to find plugin folder path variables as well...
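
Outside the Resource API, plain Python can resolve a path and write a file there; whether the Plex plugin sandbox actually permits this is exactly the open question above, so treat this as a sketch of the mechanics only (demonstrated in a temp directory, and the 'Resources' subfolder name is taken from the layout described in this thread):

```python
import os, tempfile

# Sketch only: resolve a path under a base directory and save data there.
# Whether Plex's plugin sandbox allows writing next to the bundle is an
# open question (see the post above); here we just use a temp directory.
def save_resource(base_dir, filename, data):
    path = os.path.join(base_dir, 'Resources', filename)
    if not os.path.isdir(os.path.dirname(path)):
        os.makedirs(os.path.dirname(path))
    with open(path, 'wb') as f:
        f.write(data)
    return path

base = tempfile.mkdtemp()
p = save_resource(base, 'anime-titles.xml', b'<animetitles/>')
print(os.path.exists(p))  # True
```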

Edit: I did a bad copy-paste and the TVDB poster lookup disappeared... I will edit the above post with the corrected source when done...

    ### Posters AniDB.net + TheTVDB.com ###
    preferAnidbPoster = Prefs['PreferAnidbPoster'];
    getPosters        = Prefs['GetTvdbPosters'];
    getFanart         = Prefs['GetTvdbFanart' ];
    getBanners        = Prefs['GetTvdbBanners'];
    if getElementText(anime, 'picture') != "":   #If AniDB poster exists
      posterUrl                   = ANIDB_PIC_BASE_URL + getElementText(anime, 'picture')
      metadata.posters[posterUrl] = Proxy.Media(HTTP.Request(posterUrl).content, sort_order=(1 if preferAnidbPoster else 99))
      Log("Getting AniDB picture from url: %s", posterUrl)
    
    if (getPosters or getFanart or getBanners) and not restrictedContent and not movie:   # TVDB doesn't index movies, nor 18+ anime
      tvdbSeriesId = self.anidbTvdbMapping(metadata.id)                                   # Search for the TVDB ID from the animeId
      Log.Debug("parseAnimeXml - TVDB - Doing TVDB lookup with series id: %s" % tvdbSeriesId);
      if tvdbSeriesId == None:
        Log("[anime-list-full.xml] Missing entry. Please update entry for anidb ID: %s on https://github.com/ScudLee/anime-lists/blob/master/anime-list-todo.xml" % metadata.id);
      else:
        self.getImagesFromTVDB(metadata, tvdbSeriesId)                                    # getImagesFromTVDB(self, metadata, tvdbSeriesId):
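
The "prefer AniDB poster" switch in the snippet above reduces to a tiny sort-order rule. Pulled out as pure functions (the 1/99 values and the +1 offset for TVDB posters are the ones used in the code above):

```python
def anidb_poster_sort_order(prefer_anidb):
    # AniDB's poster sorts first (1) when preferred, last (99) otherwise
    return 1 if prefer_anidb else 99

def tvdb_poster_sort_order(prefer_anidb, position):
    # shift TVDB posters down one slot when the AniDB poster is preferred,
    # matching the (num + 1 if preferAnidbPoster else num) logic above
    return position + 1 if prefer_anidb else position

print(anidb_poster_sort_order(True), tvdb_poster_sort_order(True, 1))    # 1 2
print(anidb_poster_sort_order(False), tvdb_poster_sort_order(False, 1))  # 99 1
```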

Hi Atomicstrawberry. I have made progress on the fork I did of your source.

To be fair, I am waiting on you to create it on Git to fork it officially, as I want to respect the astonishing work you did on the source and don't want to spoil your thread... Should I open a thread here with my fork while I wait for you to create your git repository?

Here is what I did so far:

New features

   . Search part entirely local (anime-titles.xml file located in Hama.bundle\Contents\Resources)

   . Matching the theTVDB.com ID from the AniDB.net ID through mapping file (anime-list-full.xml located in Hama.bundle\Contents\Resources)

   . using Studio from mapping file as often missing from AniDB.net

   . Episode summary downloaded from theTVDB.com in english only

   . Separate language order selection for the series name and the episode titles
   . Edit 2013-08-13: TheTVDB.com episode link integrated in the summary. Working on the series link and AniDB links now
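
The "separate language order selection" boils down to the getMainTitle() logic in the source above: remember the best title per language, then take the first non-empty slot in priority order, falling back to the main title. A standalone sketch (the title tuples below are made-up examples, and it keeps the first title seen per language rather than reproducing every detail of the original):

```python
# Standalone sketch of language-priority title selection, mirroring getMainTitle().
# The title data and the priority list below are illustrative examples.
LANGUAGE_PRIORITY = ['en', 'x-jat', 'ja']

def pick_title(titles):
    """titles: list of (lang, type, text). Returns (display_title, main_title)."""
    slots = {}
    main = ''
    for lang, ttype, text in titles:
        if ttype == 'main':
            main = text
        if ttype in ('main', 'official') and lang in LANGUAGE_PRIORITY:
            slots.setdefault(lang, text)   # keep the first title seen per language
    for lang in LANGUAGE_PRIORITY:
        if lang in slots:
            return slots[lang], main
    return main, main                      # fall back to the main title

titles = [('x-jat', 'main', 'Tenkuu no Escaflowne'),
          ('en', 'official', 'The Vision of Escaflowne'),
          ('ja', 'official', '天空のエスカフローネ')]
print(pick_title(titles))  # ('The Vision of Escaflowne', 'Tenkuu no Escaflowne')
```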
 

Improvements

   . Changed theTVDB.com picture function to reflect how the XML behaves, removing most unnecessary "thumbnail not available" logs.

   . Changed theTVDB.com picture function to put season posters on seasons only

   . Reduced the number of functions: searchByName and parseAnimeXml are directly called by the agent now
   . Commented source code. I believe it to be highly readable in Notepad now

   . Normalised logging: we can now see all skipped files and all cached queries; network requests are kept to the strict minimum possible...

   . Imported the movie boolean from the AniDB.net XML
   . Commented some file formats in the source for clarity when reading
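
On the "minimum network requests" point: per the comment in getResultFromAnidb(), AniDB bans clients that hit the HTTP API more often than once every couple of seconds, which is what the networkLock/lastRequestTime dance enforces. The throttle can be sketched standalone (2-second gap as in SECONDS_BETWEEN_REQUESTS; the fetch callable here is a stand-in for the real HTTP request):

```python
import time, threading

SECONDS_BETWEEN_REQUESTS = 2
_lock = threading.Lock()
_last_request = [0.0]   # monotonic timestamp of the previous request

def throttled(fetch, *args):
    # serialize requests and enforce a minimum gap between them,
    # like getResultFromAnidb() does with networkLock/lastRequestTime
    with _lock:
        wait = SECONDS_BETWEEN_REQUESTS - (time.monotonic() - _last_request[0])
        if wait > 0:
            time.sleep(wait)
        _last_request[0] = time.monotonic()
        return fetch(*args)

calls = []
throttled(calls.append, 'first')    # runs immediately
t0 = time.monotonic()
throttled(calls.append, 'second')   # sleeps until the 2s gap has elapsed
elapsed = time.monotonic() - t0
print(calls)                        # ['first', 'second']
```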
 

Bugs

   . AniDB poster no longer downloads systematically

   . Changed DefaultPrefs.json (couldn't open settings)