Scraping media URL from website

myPlex does it, why can't I?
I'm having some issues scraping the asset URL from a website.

However, if I use the 'Plex It!' bookmarklet, it scrapes the site fine and finds the video.

Is there some way to see either a) the steps the bookmarklet takes, or b) the actual final URL the bookmarklet arrives at, so I can mimic the functionality in my plugin?

FYI, the site in question is: http://penny-arcade.com/patv/episode/weddings

The real beauty of the system is that you don’t need to. If you’re using the 2.1 version of the plugin framework, as described on dev.plexapp.com/docs, the code that handles the bookmarklet will also handle the videos for your plugin. You just need to write the plugin code to handle navigation.



I've been poking through the documentation, and I guess I'm confused as to what I actually need to give the PMS to serve up the content.

I have a menu system set up that lists all the shows, then their seasons, then the individual episodes. Right now I have the episode information stored in EpisodeObjects, but I'm unsure of how to proceed. Every URL I've tried to assign to the EpisodeObject.url parameter has failed, and even when I fed it a direct .mp4 URL, Plex didn't like it.

How do I tell Plex "find the video on this page"? Is that what site configurations are for?

I could be mistaken, but it sounds to me like the video on the page is being found by the generic Fallback URL Service, and I thought it should match for your plugin as well. If the pages include the direct URL for the mp4, then all you should need is a simple URL service that matches URLs for the site you’re working on and returns the mp4 URL. There are some good tutorial-style blog posts available on the dev.plexapp.com site.
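To make the "match the site's URLs and return the mp4 URL" idea concrete, here is a rough standalone sketch in plain Python 3. It is not plugin framework code: the pattern, the function names, and the assumption that the page source contains a literal .mp4 link are all hypothetical, and a real URL service would declare its pattern in the service's own configuration.

```python
import re

# Hypothetical pattern for PATV episode pages (an assumption, not taken
# from any existing service).
PATV_EPISODE_PATTERN = re.compile(
    r"^https?://(www\.)?penny-arcade\.com/patv/episode/[^/?#]+/?$"
)

# Naive matcher for a direct .mp4 link somewhere in the page source.
MP4_URL_PATTERN = re.compile(r"https?://[^\s\"'<>]+\.mp4")


def matches_patv_episode(url):
    """Return True if the URL looks like a PATV episode page."""
    return PATV_EPISODE_PATTERN.match(url) is not None


def find_mp4_url(page_html):
    """Return the first direct .mp4 URL found in the HTML, or None."""
    found = MP4_URL_PATTERN.search(page_html)
    return found.group(0) if found else None
```

If the page embeds a third-party player instead of linking the file directly, `find_mp4_url` returns None and a different lookup is needed.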

That makes sense to me – the fallback URL service should just do all the magic. However, here’s what I get:


2012-07-01 10:55:05,763 (26dc) :  DEBUG (core:337) - No service found for URL 'http://feeds.gawker.com/~r/kotaku/full/~3/2EtRMj7UFH8/angry-birds-observe-easter-too-but-its-weird'
2012-07-01 10:55:05,763 (26dc) :  DEBUG (core:337) - No matching services found for 'http://feeds.gawker.com/~r/kotaku/full/~3/2EtRMj7UFH8/angry-birds-observe-easter-too-but-its-weird'
2012-07-01 10:55:05,765 (26dc) :  DEBUG (servicekit:145) - There are 1 fallback services
2012-07-01 10:55:05,765 (26dc) :  DEBUG (core:337) - Loading service code for Fallback (URLServiceRecord)
2012-07-01 10:55:05,852 (26dc) :  DEBUG (prefskit:163) - Loading prefs for <Framework.policies.servicepolicy.ServicePolicy object at 0x051CB430>
2012-07-01 10:55:05,854 (26dc) :  INFO (core:337) - No user preferences file exists
2012-07-01 10:55:05,858 (26dc) :  CRITICAL (core:337) - Function named 'NormalizeURL' couldn't be found in the current environment
2012-07-01 10:55:05,858 (26dc) :  WARNING (core:337) - Unable to normalize URL 'http://feeds.gawker.com/~r/kotaku/full/~3/2EtRMj7UFH8/angry-birds-observe-easter-too-but-its-weird'
2012-07-01 10:55:05,861 (26dc) :  DEBUG (core:337) - No service found for URL 'http://feeds.gawker.com/~r/kotaku/full/~3/2EtRMj7UFH8/angry-birds-observe-easter-too-but-its-weird'
2012-07-01 10:55:05,861 (26dc) :  CRITICAL (core:337) - Exception when constructing response (most recent call last):
  File "C:\Users\Patryk\AppData\Local\Plex Media Server\Plug-ins\Framework.bundle\Contents\Resources\Versions\2\Python\Framework\components\runtime.py", line 874, in construct_response
    resultStr = self._core.data.xml.to_string(result._to_xml(context=context))
  File "C:\Users\Patryk\AppData\Local\Plex Media Server\Plug-ins\Framework.bundle\Contents\Resources\Versions\2\Python\Framework\api\objectkit.py", line 310, in _to_xml
    if urlservice._media_objects_function_for_url_is_deferred(url):
  File "C:\Users\Patryk\AppData\Local\Plex Media Server\Plug-ins\Framework.bundle\Contents\Resources\Versions\2\Python\Framework\api\servicekit.py", line 87, in _media_objects_function_for_url_is_deferred
    return self._media_objects_function_for_service_is_deferred(service)
  File "C:\Users\Patryk\AppData\Local\Plex Media Server\Plug-ins\Framework.bundle\Contents\Resources\Versions\2\Python\Framework\api\servicekit.py", line 90, in _media_objects_function_for_service_is_deferred
    return self._function_for_service_is_deferred(service, 'MediaObjectsForURL')
  File "C:\Users\Patryk\AppData\Local\Plex Media Server\Plug-ins\Framework.bundle\Contents\Resources\Versions\2\Python\Framework\api\servicekit.py", line 93, in _function_for_service_is_deferred
    ret = self._core.services._function_in_service_is_deferred(f_name, service, context=self._context)
  File "C:\Users\Patryk\AppData\Local\Plex Media Server\Plug-ins\Framework.bundle\Contents\Resources\Versions\2\Python\Framework\core.py", line 1021, in _function_in_service_is_deferred
    return service.host.function_is_deferred(fname, self._create_service_context(service.host, context))
AttributeError: 'NoneType' object has no attribute 'host'

2012-07-01 10:55:05,861 (26dc) :  DEBUG (core:337) - Unable to handle response type: <class 'Framework.modelling.objects.VideoClip'>
2012-07-01 10:55:05,865 (26dc) :  DEBUG (core:337) - Response: 500


It looks like it is trying to call the NormalizeURL function of the Fallback service, and fails because the Fallback service doesn't have a NormalizeURL function! What's weird is that the documentation states that this function is optional...

The error in the log about being unable to normalize the URL is not a problem, just a notification; the NormalizeURL() function is optional. The error you’re seeing from the Fallback TestURLs occurs because those test URLs are more of a goal that one of the other channel devs is trying to reach than examples of sites that already work (not exactly the idea behind test URLs in general, but that’s neither here nor there at the moment). After a quick look at the webpage’s source, it seems that the Fallback service is grabbing the URL for the video hosted on blip.tv from the iframe. On my system, the Fallback service matches the URL and grabs the video and metadata fine with a small test plugin I keep for just such occasions.
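For illustration, pulling an embedded player URL out of an iframe could look roughly like the following. This is a standalone Python 3 sketch using only the standard library, not the actual Fallback service code; the class name, function name, and sample markup are made up for the example.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class IframeSrcCollector(HTMLParser):
    """Collect the src attribute of every <iframe> in a page."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        if tag == "iframe":
            src = dict(attrs).get("src")
            if src:
                self.sources.append(src)


def find_embedded_player(page_html, host="blip.tv"):
    """Return the first iframe src served from host (or a subdomain), else None."""
    collector = IframeSrcCollector()
    collector.feed(page_html)
    for src in collector.sources:
        netloc = urlparse(src).netloc.lower()
        if netloc == host or netloc.endswith("." + host):
            return src
    return None


# Hypothetical markup standing in for an episode page.
sample_page = '<div><iframe src="http://blip.tv/play/abc123.html" width="640"></iframe></div>'
player_url = find_embedded_player(sample_page)
```

The URL found this way is only the embedded player page; resolving it to a playable media file is a separate step that this sketch does not attempt.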

I have to assume at this point that there’s an error in your code somewhere. If you copy the relevant function into a code-block here, I’d be happy to help you track it down.



Re: Your previous question about site configurations… They are used (only when unavoidable) to lock onto a flash/silverlight video in a virtual browser window when it’s not possible to grab the actual video file (mp4, flv, or otherwise).

Thanks for your help, here’s the code:



def episode_menu(season_dict):
  """Constructs the episode menu for a single season."""
  oc = ObjectContainer(view_group='List', title1=season_dict['title'])

  # Rather than load each page here, can we just return some kind of a callback that
  # will get the page when we go to play the video???  Probably if we set the key and ratings_key...

  for episode_dict in season_dict['episodes']:
    oc.add(EpisodeObject(url=episode_dict['url'],
                         title=episode_dict['title'],
                         thumb=episode_dict['thumb']))

  return oc



The full code is here: https://github.com/spamminator/patv.bundle

Here the URL is a website such as: http://penny-arcade.com/patv/episode/pax-2011

I'm wondering if I'm using the wrong object type? Should I instead be returning a list of VideoClipObjects? MediaObjects?

Could you share a snippet of your test plug-in as an example?

What sort of error are you getting when you try to play a video with your current plugin? I don’t see any reason that shouldn’t work. Does anything show up in the plugin log?



Regarding object type, the only real difference between EpisodeObjects and VideoClipObjects or MovieObjects is the type of metadata that can be associated with them. For example, show, season, and (episode) index are all specific to EpisodeObject. From the standpoint of video playback, they are functionally the same.



I’m not at my computer ATM, so I can’t upload my test plugin. All of the official plugins are hosted publicly on GitHub, so feel free to browse them for insight (github.com/plexinc-plugins). A few fairly straightforward ones to look at for reference are Devour, GiantBomb, and Freakonomics. The last one uses TrackObjects for podcasts, but the mechanics are all pretty similar.

Plug-in log:



2012-07-02 08:04:36,674 (131c) :  DEBUG (core:337) - Calling function 'episode_menu'
2012-07-02 08:04:36,684 (131c) :  DEBUG (core:337) - No service found for URL 'http://penny-arcade.com/patv/episode/the-pilot-part-1'
2012-07-02 08:04:36,686 (131c) :  DEBUG (core:337) - No matching services found for 'http://penny-arcade.com/patv/episode/the-pilot-part-1'
2012-07-02 08:04:36,687 (131c) :  DEBUG (servicekit:145) - There are 1 fallback services
2012-07-02 08:04:36,717 (131c) :  DEBUG (core:337) - Loading service code for Fallback (URLServiceRecord)
2012-07-02 08:04:36,828 (131c) :  DEBUG (prefskit:163) - Loading prefs for <Framework.policies.servicepolicy.ServicePolicy object at 0x03278A90>
2012-07-02 08:04:36,828 (131c) :  INFO (core:337) - No user preferences file exists
2012-07-02 08:04:36,834 (131c) :  CRITICAL (core:337) - Function named 'NormalizeURL' couldn't be found in the current environment
2012-07-02 08:04:36,834 (131c) :  WARNING (core:337) - Unable to normalize URL 'http://penny-arcade.com/patv/episode/the-pilot-part-1'
2012-07-02 08:04:36,835 (131c) :  DEBUG (core:337) - No service found for URL 'http://penny-arcade.com/patv/episode/the-pilot-part-1'
2012-07-02 08:04:36,894 (131c) :  CRITICAL (core:337) - Exception when constructing response (most recent call last):
  File "C:\Users\Patryk\AppData\Local\Plex Media Server\Plug-ins\Framework.bundle\Contents\Resources\Versions\2\Python\Framework\components\runtime.py", line 874, in construct_response
    resultStr = self._core.data.xml.to_string(result._to_xml(context=context))
  File "C:\Users\Patryk\AppData\Local\Plex Media Server\Plug-ins\Framework.bundle\Contents\Resources\Versions\2\Python\Framework\api\objectkit.py", line 374, in _to_xml
    el = Framework.modelling.objects.ModelInterfaceObjectContainer._to_xml(self, context)
  File "C:\Users\Patryk\AppData\Local\Plex Media Server\Plug-ins\Framework.bundle\Contents\Resources\Versions\2\Python\Framework\modelling\objects.py", line 351, in _to_xml
    root = Container._to_xml(self, context)
  File "C:\Users\Patryk\AppData\Local\Plex Media Server\Plug-ins\Framework.bundle\Contents\Resources\Versions\2\Python\Framework\modelling\objects.py", line 124, in _to_xml
    self._append_children(root, self._objects, context)
  File "C:\Users\Patryk\AppData\Local\Plex Media Server\Plug-ins\Framework.bundle\Contents\Resources\Versions\2\Python\Framework\modelling\objects.py", line 130, in _append_children
    el = obj._to_xml(context)
  File "C:\Users\Patryk\AppData\Local\Plex Media Server\Plug-ins\Framework.bundle\Contents\Resources\Versions\2\Python\Framework\api\objectkit.py", line 310, in _to_xml
    if urlservice._media_objects_function_for_url_is_deferred(url):
  File "C:\Users\Patryk\AppData\Local\Plex Media Server\Plug-ins\Framework.bundle\Contents\Resources\Versions\2\Python\Framework\api\servicekit.py", line 87, in _media_objects_function_for_url_is_deferred
    return self._media_objects_function_for_service_is_deferred(service)
  File "C:\Users\Patryk\AppData\Local\Plex Media Server\Plug-ins\Framework.bundle\Contents\Resources\Versions\2\Python\Framework\api\servicekit.py", line 90, in _media_objects_function_for_service_is_deferred
    return self._function_for_service_is_deferred(service, 'MediaObjectsForURL')
  File "C:\Users\Patryk\AppData\Local\Plex Media Server\Plug-ins\Framework.bundle\Contents\Resources\Versions\2\Python\Framework\api\servicekit.py", line 93, in _function_for_service_is_deferred
    ret = self._core.services._function_in_service_is_deferred(f_name, service, context=self._context)
  File "C:\Users\Patryk\AppData\Local\Plex Media Server\Plug-ins\Framework.bundle\Contents\Resources\Versions\2\Python\Framework\core.py", line 1021, in _function_in_service_is_deferred
    return service.host.function_is_deferred(fname, self._create_service_context(service.host, context))
AttributeError: 'NoneType' object has no attribute 'host'

2012-07-02 08:04:36,894 (131c) :  DEBUG (core:337) - Unable to handle response type: <class 'Framework.modelling.objects.MediaContainer'>
2012-07-02 08:04:36,901 (131c) :  DEBUG (core:337) - Response: 500

Could it be some missing/wrong settings in Info.plist?

I doubt that it’s an issue in the plist, but I’ll have another look at your github repo to minimize the chance of overlooking something there. Is the issue currently that the list of videos won’t load or that it loads but specific videos won’t play? Check to make sure you have the latest version of the Services.bundle (Channel Directory > More > Check For Updates).

It might be worth adding a try/except block inside the for loop that adds videos to the list, just to see whether any of them work. I honestly can’t see a reason right now why it shouldn’t be working for you.
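As a rough sketch of that idea in plain Python 3: the plain list below stands in for the ObjectContainer, and `build_item` is a hypothetical stand-in for constructing an EpisodeObject. None of these names come from the plugin framework.

```python
def add_episodes_safely(container, episodes, build_item):
    """Add one item per episode, recording failures instead of aborting."""
    failures = []
    for episode in episodes:
        try:
            container.append(build_item(episode))
        except Exception as exc:  # deliberately broad: this is a diagnostic aid
            failures.append((episode.get("url"), repr(exc)))
    return failures


def build_item(episode):
    """Hypothetical stand-in for building an EpisodeObject from a dict."""
    if not episode.get("url"):
        raise ValueError("missing url")
    return {"url": episode["url"], "title": episode["title"]}


items = []
failed = add_episodes_safely(
    items,
    [
        {"url": "http://penny-arcade.com/patv/episode/weddings", "title": "Weddings"},
        {"url": "", "title": "Broken"},
    ],
    build_item,
)
```

Keep in mind that the traceback in the log above is raised in construct_response, while the finished container is being serialized, so a loop like this would only catch errors raised while the items themselves are being built.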

I’ll give it a try tonight. With the current code, when I click on a season to load the episode menu, nothing happens – just the log output.


There seems to be some kind of bug in the plugin support framework. I cloned your code from github and cannot get it to work even though my test plugin still works by using a little cheat that's not intended for use in plugins. I'll try to get some help tracking down the bug.

Awesome, thanks!


Any news on the bug?


Apparently, the problem is due to the fact that the videos are hosted on Blip.tv. There's something about the way the PlexIt code interacts with Blip videos that works for the Queue but not for plugins. Honestly, it's a bit of a mystery to me, but the people with the skills to fix it are aware of the issue and it's on the to-do list. That's about the best I can do at the moment.

Cool – thanks for your help!
