Trying to get a very basic Scanner/MetaData-Agent setup to work

… so I didn't know in what subforum to post it.. ;)
Hi,

I have some media that I would like to organise as fake "TV Shows" and for which I'd like to retrieve the (easily available) meta-data. (Each file has a 4-digit ID in its name which I can use as a parameter for a URL to load a website where I can parse all the data I need.)
I studied many of the existing Python scripts and got some basic understanding of how it works. I'm not too fluent in Python, but have no problem reading and understanding it.
What I have not managed so far, however, is to understand the path a file takes through PMS in enough detail to even know where to begin debugging when my code does not behave as I would expect from comparing it to the pre-installed agents/scanners; it's just too complex. There seems to be no documentation of the process, the objects I need to work with, the methods I have to implement, and so on.

So, in a nutshell, what I was trying to achieve so far:
Write a scanner that simply accepts every file and then passes the file name on to the meta-data agent. There, after some regexing and website-parsing, I want to set episode title, season and episode number, and maybe description and thumbnail.

Neither scanner nor meta-data agent needs to be very smart, as they won't have to handle generic content and it is ensured beforehand that every file will have the ID in its filename, which is everything I need.
From how I understood the scanners/agents I studied so far, it seems to me that this should be doable in about 30 to 50 lines of code altogether. But I was not able to get the scanner working - at least I think so. I also was not able to figure out how to "create" a new TV episode object in the meta-data agent and fill it with any sample data just for testing purposes.
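
For the "regexing" part, here is a minimal sketch of pulling the 4-digit ID out of a file name and turning it into a lookup URL. The names `BASE_URL`, `extract_id` and `build_url` are made up for illustration, and the URL is a placeholder, since the real site is not named here.

```python
import os
import re

# Hypothetical URL pattern; the real website is not named in this thread.
BASE_URL = "http://example.com/item/{id}"

# Exactly four digits that are not part of a longer run of digits.
ID_PATTERN = re.compile(r"(?<!\d)(\d{4})(?!\d)")


def extract_id(path):
    """Return the first standalone 4-digit ID in the file name, or None."""
    # Drop the directory and the extension so only the bare name is searched.
    name = os.path.splitext(os.path.basename(path))[0]
    match = ID_PATTERN.search(name)
    return match.group(1) if match else None


def build_url(path):
    """Return the lookup URL for a file, or None if the name has no ID."""
    file_id = extract_id(path)
    return BASE_URL.format(id=file_id) if file_id else None
```

Note that any other standalone 4-digit number in the name (a year, for example) would match as well, so this only holds as long as the ID is the only such number.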

Any help would be greatly appreciated!

Welcome, I’m impressed with your enthusiasm to experiment!



The scanners are responsible for looking at files and turning them into media objects. So you’d do something like this to add an episode:



    tv_show = Media.Episode(show, season, ep, None, year)
    tv_show.parts.append(file)
    mediaList.append(tv_show)




Your best bet is to take an existing scanner and modify it, most likely.

You're absolutely right, there is very little documentation right now, we're working on that!


As long as the community and your level of participation in it makes you feel heard and respected this is much easier to accept than with many other projects. ;)

And indeed I already made quite some progress, funnily enough by simply doing something again that I already tried a week ago.. I stripped down the TV Series scanner to this:


    import re, os, os.path
    import Media, VideoFiles, Stack

    # Look for episodes.
    def Scan(path, files, mediaList, subdirs):

      # Scan for video files.
      VideoFiles.Scan(path, files, mediaList, subdirs)

      # Accept every file whose name contains 'test'.
      for i in files:
        file = os.path.basename(i)
        if 'test' in file:

          # Hard-coded sample data: every match becomes show 'Test', S01E01.
          show = 'Test'
          season = 1
          episode = 1

          tv_show = Media.Episode(show, season, episode, '', '')
          tv_show.parts.append(i)

          mediaList.append(tv_show)

      # Stack the results.
      Stack.Scan(path, files, mediaList, subdirs)



Works fine, and from this I'll easily be able to add the stuff I actually want the scanner to do.
In case anyone ever stumbles upon the same problem as I did - this code looks for any files (in any subfolders) with the string 'test' in their filename and adds them as Show 'Test', S01E01.
Not sure what exactly I did wrong the last time; I think I might even just have been confused by cached results. Renaming your test file to something new after every modification to the scanner does help a lot. :D


Now I'll try and see if I can get my TestMetaDataAgent to work as well. One question beforehand - is it possible to modify an episode's season number via meta data agent, or is this set in stone by the scanner?
If it is set in stone, I'd have to start grabbing data from the website (which contains that information) in the scanner already.


The partitioning of responsibility is that the scanner picks the structural elements (season, episode #) and the metadata agent fills in all the, well, metadata :)

So you cannot have an agent modify season# or episode#, at least not at the moment.

That was by no means a feature request, just figuring out what's possible and how the system works. ;)

My scanner is now working as I want it to. As I said, I am gathering data from the web in there already. I'm using urllib2 and something like

    usock = urllib2.urlopen(myURL)
    myHTML = usock.read()
    usock.close()


as the objects that I found being used in meta-data agents (HTML/XML with e.g. XML.ElementFromURL().xpath....) are not accessible in scanners - did I assume that correctly or was I doing something wrong?
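
A slightly more defensive variant of that fetch, as a sketch only: it closes the socket even if `read()` raises, and falls back to the Python 3 import so it can be tried outside PMS. `fetch_html` and `fake_opener` are made-up names, and whether `contextlib` is available and behaves the same way inside the scanner runtime is an assumption.

```python
from contextlib import closing
import io

try:
    from urllib2 import urlopen          # Python 2 name, as in the snippet above
except ImportError:
    from urllib.request import urlopen   # Python 3 equivalent


def fetch_html(url, opener=urlopen):
    """Read the whole response body and always close the connection."""
    # closing() calls .close() on the response even if read() raises.
    with closing(opener(url)) as sock:
        return sock.read()


def fake_opener(url):
    """Stand-in for urlopen so the function can be tried without a network."""
    return io.BytesIO(b"<html><title>Test</title></html>")
```

Passing `opener=fake_opener` makes it possible to check the parsing code against canned HTML without hitting the website on every scan.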


Now I also got most of the meta-data agent working, i.e. the part where I extract the data I want from the corresponding website.
It seems that I can't get the agent to actually put any meta-data into the system, however. I looked at the TVRage agent and figured that basically I will need this:


    def Start():
      HTTP.CacheTime = CACHE_1HOUR #* 24

    class TheTestAgent(Agent.TV_Shows):
      name = 'The Test Agent'
      languages = [Locale.Language.English]
      primary_provider = True

      def search(self, results, media, lang):
        results.Append(MetadataSearchResult(id = 'null', score = 100))

      def update(self, metadata, media, lang):
        metadata.title = 'test'


Do I need to implement other methods than search and update because it is a primary_provider as opposed to TVRage? I tried stripping down TheTVDB-agent, but failed due to its complexity.
From reading both of those agents I would have expected my code to set every episode title to 'test'..
The "implementation" of search is taken from Local Media Assets iirc, as there it is the same as with my case - no need for probabilities when trying to match, as I always have the correct unique ID in my filenames.

So my questions are basically:
a] How do I access the object currently being updated with meta-data? (something like media.episode.season to get the season number of the current episode)
b] How do I set this object's meta-data?


Am I even correct when assuming that for each file the update-method is called once and that's pretty much it? Because when thinking about it, I wondered at what point of that process the meta-data for a season would be delivered... I seem to be missing something. ;)


You are correct, the scanners use a stripped down Python runtime which doesn't have the full Framework available to it.



The only two methods an agent needs to implement are the two you've mentioned.



The media.XXX stuff is the media information, where you get access to filenames, etc.:


    for s in media.seasons:
      for e in media.seasons[s].episodes:
        for i in media.seasons[s].episodes[e].items:
          for part in i.parts:
            XXX




and the metadata.XXX stuff is the metadata model you write to.


    # Show.
    metadata.title = 'Series Name'
    metadata.summary = 'Summary'

    # Season.
    metadata.seasons[season_num].summary = 'Season summary'

    # Episode.
    metadata.seasons[season_num].episodes[episode_num].title = 'The First Episode'






You set an object's metadata via the metadata hierarchy, as shown above.



The update() function is called once per movie, show, artist, or album.
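
Since update() sees the whole show at once, the two hierarchies can be walked together. Below is a rough sketch of that idea which runs outside PMS: `Node` is a made-up stand-in for the framework's media and metadata objects, `fill_titles` and the `titles` dictionary are hypothetical, and converting the season and episode keys with `int()` is an assumption about the key types.

```python
from collections import defaultdict


class Node(object):
    """Hypothetical stand-in for a media/metadata object (not the Plex API)."""

    def __init__(self):
        # Entries are created on first access, like metadata.seasons[n] above.
        self.seasons = defaultdict(Node)
        self.episodes = defaultdict(Node)


def fill_titles(metadata, media, titles):
    """Copy titles keyed by (season, episode) into the metadata hierarchy."""
    for s in media.seasons:
        for e in media.seasons[s].episodes:
            # Assumption: keys can be converted to integers for the lookup.
            title = titles.get((int(s), int(e)))
            if title is not None:
                metadata.seasons[s].episodes[e].title = title


# Stand-in data: one show with a single file scanned as S01E01.
media = Node()
media.seasons[1].episodes[1]  # touching the key creates the entry
metadata = Node()
fill_titles(metadata, media, {(1, 1): "The First Episode"})
```

In a real agent the body of `fill_titles` would sit inside update(), with `titles` built from whatever the website lookup returns.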

Brilliant, that clears lots of things up, thank you very much.



I’ve got yet another problem/question, though. This code (I checked if it compiles, so no indentation problems etc.) for testing purposes…



    import re

    def Start():
      HTTP.CacheTime = CACHE_1HOUR #* 24

    class TheTestAgent(Agent.TV_Shows):
      name = 'The Test Agent'
      languages = [Locale.Language.English]
      primary_provider = True

      def search(self, results, media, lang):
        print 'test'
        # The id must be a string; a bare null is an undefined name in Python.
        results.Append(MetadataSearchResult(id = 'null', score = 100))

      def update(self, metadata, media, lang):
        print 'test2'



Now when I run
**~/Library/Application\ Support/Plex/Plex\ Media\ Server.app/Contents/MacOS/Plex\ Media\ Scanner --force --refresh --section 12**
with 12 of course being the ID of my section where I selected the agent from the sample code above, shouldn't I see test or test2 in the output that follows?

At first I tried having
*metadata.seasons[1].episodes[1].title = 'The First Episode'*
as the only line in update(), which did not seem to do anything, despite my test file being S01E01, so I tried to find out what/if any methods are triggered at all.

Still haven’t gotten further than described in my last posting… I’m also happy getting help from people other than elan. :wink:

It seems to me there is just some minor detail I’m not yet getting right, but I can’t get it to work.



That should be enough, assuming that the media in that section is "matched" to that agent.

If you scan from scratch, you should see "test" and "test2" and if you simply force a refresh, you should see "test2".

You need to be running PMS from the Terminal to see this output, of course, since "print" doesn't go to the log file.
