Hints for first plugin

Unfortunately, this code generates the same error, so I think I must be doing something wrong:







rtmpClip = content2.xpath('//div[@class="jw-player"]//param[@name="flashvars"]')[0].get('value')

File "/Users/flubr/Library/Application Support/Plex Media Server/Plug-ins/Framework.bundle/Contents/Resources/Versions/2/Python/Framework/bases.py", line 142, in

getitem = lambda x, y: x.getitem(y),

IndexError: list index out of range



def EenVideos(sender, pageUrl):
    dir = MediaContainer(title2=sender.itemTitle)
    content = HTML.ElementFromURL(pageUrl, errors='ignore')
    for video in content.xpath('//div[@id="videoArea"]/div/ul/li'):
        image = video.xpath("a/img")[0].get('src')
        title = video.xpath("h5/a")[0].text
        title = title.split(" - ")[1]
        link = video.xpath("h5/a")[0].get('href')
        link = ROOT_URL + link
        Log(link)

        content2 = HTML.ElementFromURL(link)
        summary = content2.xpath("//div[@id='videoZoneContainer']/div/p/a")[0].text
        rtmpClip = content2.xpath('//div[@class="jw-player"]//param[@name="flashvars"]')[0].get('value')
        Log(rtmpClip)

        dir.Append(RTMPVideoItem(url="rtmp://vrt.flash.streampower.be/een", clip="2011/02/ogom_110216_huisdokter_Website_EEN", width=640, height=360, live=False, title=title, summary=summary, thumb=image))

    return dir



with e.g.:
pageUrl = [http://www.een.be/mediatheek/tag/168](http://www.een.be/mediatheek/tag/168)
link = [http://www.een.be/mediatheek/405910](http://www.een.be/mediatheek/405910)

My Firefox XPath tool still gives this:



So it seems that the XPath in Plex doesn't return anything? How can this be?

It could be that the page is modified by JavaScript.

Try doing a Log(HTTP.Request(pageUrl).content) at the beginning of your function and see whether the HTML that Plex sees is the same as what a browser would see. I'm not entirely sure the JavaScript snippets are executed in Plex…
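
To illustrate why the XPath can come back empty, here is a minimal sketch that uses only the Python 3 standard library, not the Plex framework. The sample page and the ParamFinder class are made up for illustration; the real een.be markup may well differ. It parses the raw HTML the way a client without a JavaScript engine sees it:

```python
from html.parser import HTMLParser

# Hypothetical page: the player markup only exists as a string inside a script.
# A browser adds it to the DOM when the script runs; the raw HTML never has it.
RAW_HTML = """
<html><body>
<div class="jw-player" id="player"></div>
<script type="text/javascript">
  document.getElementById("player").innerHTML =
      '<param name="flashvars" value="file=clip.flv">';
</script>
</body></html>
"""


class ParamFinder(HTMLParser):
    """Collects the value of every <param name="flashvars"> element in the markup."""

    def __init__(self):
        HTMLParser.__init__(self)
        self.values = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "param" and attrs.get("name") == "flashvars":
            self.values.append(attrs.get("value"))


finder = ParamFinder()
finder.feed(RAW_HTML)
found = finder.values  # script contents are not parsed as elements
print(found)
```

The word "flashvars" is present in the downloaded text, but only inside the script, so an element query has nothing to match. That is the situation where logging the raw response helps.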

Thanks, this solved my issue!

I was able to extract the RTMP clip from the JavaScript. In some cases I get redirected to the wrong video, but I think this is an error in the site’s source code, since these wrong redirections also occur in my browser. I sent an email to the webmaster and hope it will be resolved :)



BTW, in the meantime I adapted the “deredactie.be” (news) plugin from Sander. I made the “Sporza” (sports news) plugin from it and submitted it for the app store. It’s also on the “unsupported plugins” page.

Hi,

I’m having some trouble adding some text as a summary. I tried this command for the HTML code below:


summary = inhoud.xpath('//div[@class="text"]/p')[0].text



<div class="text">
<p><strong>Tomas liet zich vandaag inspireren door de indrukwekkende speech van Khaddafi om al zijn eisen duidelijk te maken aan het volk.</strong></p>
<p>Het staat vast dat kolonel Tomas niet zal wijken voor opstandige burgers en dat er erg repressieve maatregelen zullen volgen wanneer zijn wensen niet worden vervuld.&nbsp;</p>
 </div>




However, I think the "strong" part causes some trouble. The "strong" parts only make the text bold. This isn't necessary in Plex, so I was wondering if anyone knew a way to just extract all the text. Sometimes one word in the middle of the text is made "strong", so the structure isn't fixed.
Thanks in advance!

try .text_content() instead of .text



Thanks, it worked!

nice!

I wrote this up for myself, but it's probably worth putting in a forum post.



Three ways to extract text from an XML tree, all with slightly different semantics.



1.


e.xpath("a")[0].text

reads the text property of lxml.etree._Element (http://lxml.de/api/lxml.etree._Element-class.html#text)



2.


e.xpath("a")[0].text_content()

calls text_content on HtmlMixin (http://lxml.de/api/lxml.html.HtmlMixin-class.html)



3.


e.xpath("a/text()")[0]

uses xpath to extract the text



Jonny
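
To make the differences concrete, here is a small sketch. It uses the standard library's xml.etree.ElementTree as a stand-in for lxml (an assumption made so it runs without Plex or lxml installed): both share the same .text/.tail model, "".join(itertext()) plays the role of text_content(), and the list built for case 3 mimics what an XPath text() query returns. The sample markup is made up.

```python
import xml.etree.ElementTree as ET

p = ET.fromstring("<p>Hello <strong>big</strong> world</p>")

# 1. .text: only the text before the first child element
first_text = p.text

# 2. text_content() in lxml: all text with the markup dropped
#    (itertext() is the standard-library analogue used here)
all_text = "".join(p.itertext())

# 3. XPath text(): the element's own text nodes, children's text excluded
direct_text = [t for t in [p.text] + [child.tail for child in p] if t]

# When the paragraph starts with a child element, .text is None
bold_first = ET.fromstring("<p><strong>Bold intro.</strong></p>")
bold_text = bold_first.text
bold_all = "".join(bold_first.itertext())

print(repr(first_text), repr(all_text), direct_text, bold_text, repr(bold_all))
```

The last case is the one from the summary question earlier in the thread: the paragraph opens with a strong element, so .text has nothing to return while the full-text variant still works.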

Thanks, that's a clear summary.



The plug-in I’m writing right now is working almost completely, but I still have 2 questions:


  1. Is it possible to ignore this kind of error:



    rtmpClip = inhoud.xpath('//div[@class="embed"]/script[@type="text/javascript"]')[0].text

    File "/Users/flubr/Library/Application Support/Plex Media Server/Plug-ins/Framework.bundle/Contents/Resources/Versions/2/Python/Framework/bases.py", line 142, in

    getitem = lambda x, y: x.getitem(y),

    IndexError: list index out of range

    Normally I should get a list of videos. The rtmpclips are being fetched from some pages. But sometimes one of these pages doesn’t contain a video (error in the website construction), which results in the error message above. The result is that none of the videos are displayed in the list (e.g. if 14 videos have been fetched correctly but one failed because there’s an error on the site, none of the 15 videos are displayed in the list)


  2. The plugin is quite slow since there is a lot of data to be fetched. I can extract a list of URLs. These URLs lead to new pages that contain the rtmpClips. This means that if I want a list of 15 videos, Plex has to go through 15 pages to fetch all the rtmpClips. Is there a way to do this faster, e.g. by fetching the rtmpClip only when you open a video from the list?



    Here’s my code:

import datetime, re, pickle

PLUGIN_PREFIX   = "/video/stubru"
MEDIA           = "http://www.stubru.be/media/results/taxonomy_F_44"
ROOT_URL        = "http://www.stubru.be"
####################################################################################################
def Start():
  Plugin.AddPrefixHandler(PLUGIN_PREFIX, MainMenu, "Stubru", "icon-default.png", "art-default.jpg")
  Plugin.AddViewGroup("Details", viewMode="InfoList", mediaType="items")
  MediaContainer.art = R('art-default.jpg')
  MediaContainer.title1 ="Stubru"
  DirectoryItem.thumb=R("icon-default.png")


#####################################
def MainMenu():
    dir = MediaContainer()
    dir.Append(Function(DirectoryItem(Videos, title="Meest recent"), pageUrl = MEDIA))
    for item in HTML.ElementFromURL(MEDIA).xpath('//div[@id="sidebar-left-inner"]/div/div/div/div/div[2]/div/ul/li/span/a'):
        title = item.text
        link = ROOT_URL + item.get('href')
        dir.Append(Function(DirectoryItem(Videos, title=title), pageUrl = link))
    return dir

#####################################
def Videos(sender, pageUrl):
    dir = MediaContainer(title2=sender.itemTitle)
    content = HTML.ElementFromURL(pageUrl, errors='ignore')
    for video in content.xpath('//div[@id="content-middle"]//ul/li/div//div[@class="content-inner"]'):
        image = video.xpath('div/div/div/div[@class="field-item odd"]/a/img')[0].get('src')
        title = video.xpath("div[2]/h2/a")[0].text
        subtitle = video.xpath('div[2]/div[@class="info"]/span')[0].text
        link = ROOT_URL + video.xpath("div[2]/h2/a")[0].get('href')
#        Log(HTTP.Request(link).content)

        inhoud = HTML.ElementFromURL(link)
        summary = inhoud.xpath('//div[@class="text"]/p')[0].text_content()
        rtmpClip = inhoud.xpath('//div[@class="embed"]/script[@type="text/javascript"]')[0].text
        rtmpClip = rtmpClip.split('.flv"')[0]
        rtmpClip = rtmpClip.split('file: "')[1]
#        Log(rtmpClip)

        dir.Append(RTMPVideoItem(url="rtmp://vrt.flash.streampower.be:80/stubru/", clip=rtmpClip, live=False, title=title, subtitle=subtitle, summary=summary, thumb=image))

    return dir
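
As a side note on the two split() calls: they raise an IndexError of their own when the script text has no file: "….flv" entry. A possible alternative, sketched here with the standard re module, returns None instead. The sample script text is an assumption; only the file: "….flv" shape is taken from the split() calls, and extract_clip is a hypothetical helper name.

```python
import re

# Assumed shape of the inline player script (made up for illustration)
SCRIPT_TEXT = 'jwplayer("player").setup({ file: "2011/02/some_clip.flv", width: 640 });'

# Captures everything between `file: "` and `.flv"`
CLIP_PATTERN = re.compile(r'file:\s*"([^"]+)\.flv"')


def extract_clip(script_text):
    """Return the clip name without the .flv extension, or None if it is absent."""
    if not script_text:
        return None
    match = CLIP_PATTERN.search(script_text)
    return match.group(1) if match else None


clip = extract_clip(SCRIPT_TEXT)
missing = extract_clip("var nothing = true;")
print(clip, missing)
```

The caller can then test for None and skip that video rather than losing the whole list.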


It's the loading of the images that certainly takes a long time.



I usually use a separate function, and in the RTMPVideoItem do:



thumb = Function(getThumb,image)



The getThumb is as follows:



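def getThumb(url):
    try:
        # Download the image only when this thumbnail is actually requested
        return DataObject(HTTP.Request(url).content, 'image/jpeg')
    except:
        # Fall back to a bundled icon resource if the download fails
        return R(icon)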

For the first problem, check the length of the xpath return before using it



result = inhoud.xpath('//div[@class="embed"]/script[@type="text/javascript"]')
if len(result) > 0:
    rtmpClip = result[0].text
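
Inside the loop, the same check can be used to skip just the broken page and keep the rest of the list. A rough sketch in plain Python, where first_or_none is a hypothetical helper and the nested lists are stand-ins for the XPath result of each page:

```python
def first_or_none(results):
    """Return the first XPath match, or None when the result list is empty."""
    return results[0] if results else None


# Stand-in data: each inner list plays the role of one page's XPath result.
pages = [["clip_a"], [], ["clip_c"]]

clips = []
for result in pages:
    clip = first_or_none(result)
    if clip is None:
        continue  # this page has no video: skip it, keep the others
    clips.append(clip)

print(clips)
```

The comparison is written as "is None" on purpose, since with real lxml elements a plain truth test on the element itself can be misleading.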




For the second, you need to use a video redirect function. Logically, instead of extracting the RTMP clip URL while building the list, you delegate that to a function that gets called when the video item is selected. The PBS plugin uses this approach:

https://github.com/jonnywray/PBS.bundle/blob/master/Contents/Code/__init__.py

See line 138 for how to use it and 151 on for the function that is called (obviously you'll have to replace the logic in this function by your own to extract the rtmp clip URL).

Jonny
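
The idea in plain Python, as a sketch only: this is not the Plex framework API (the PBS code linked above shows the real calls), and resolve_clip, build_list and the links are hypothetical stand-ins. Each list entry carries a callable that does the expensive lookup, and nothing is fetched until an entry is selected.

```python
from functools import partial

fetch_log = []  # records which detail pages were actually fetched


def resolve_clip(link):
    """Hypothetical stand-in for fetching the detail page and extracting the clip."""
    fetch_log.append(link)
    return "clip-for-" + link


def build_list(links):
    """Build the menu without touching any detail page."""
    return [{"title": link, "play": partial(resolve_clip, link)} for link in links]


menu = build_list(["/video/1", "/video/2", "/video/3"])
fetched_while_listing = list(fetch_log)  # nothing fetched yet

selected_clip = menu[1]["play"]()  # the user picks the second video
print(fetched_while_listing, fetch_log, selected_clip)
```

Building the list costs one page load instead of one per video; the price is that anything taken from the detail page, such as the summary, is not available in the list.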

@pierre: Thanks for your suggestion, but I don’t think this is the problem since it’s also slow when I don’t load any images. The problem is that every time the loop is being executed, another URL is being opened.



@Jonny Wray: Thanks, your solution for the first question seemed to solve the issue. I’ll try to implement your second solution when I’ve got time. I’ll let you know if it worked (but I suppose it will).

Jonny is the man !



Sorry, I just reread my message from earlier and it does not make any sense at all !! I was trying to type it on my iphone with autocorrect on, trying to conceal it from the wife who is starting to hate the fact that I spend so much time on the forum !!!



Maybe we should write your wife a plugin to keep her busy while you're spending time on the forum helping us :D

well I’ve done that already … but she doesn’t watch that much TV … Plus I was really being an ass, replying to the forum at the dinner table …

Well I usually get something along the lines of ‘you talking to your Plex boyfriends again’.



flubr: the solution I posted is exactly to solve that problem of hitting another URL within a loop. It’s a common issue so there’s a general solution. Shouldn’t be too hard, just move your rtmp clip extraction code to another function.



Jonny

This way the code gets executed when you try to play the video rather than when the list is built. Same idea with the thumbnails.

I’ve actually wondered about that but keep forgetting to ask. With thumbnails you do actually need the URL content (the image) in the list, so what does using the Function approach do? Load them in parallel in an asynchronous manner, or does it load when you visit that list item (but then you need multiple images in a list)?



I’ve actually never used this approach for thumbs in my plugins so have never quite understood what it actually does.


In most cases it’s faster and makes sense to find/build the final url to a video in a “PlayVideo” function that gets executed when a user clicks a certain item. Usually this is done for speed, as the opening of additional pages is only done when it’s needed.

However, looking at the piece of code that was posted above, it doesn’t matter when the final video URL is built, because the HTML page that is used for the summary also contains the video info:



for video in content.xpath('//div[@id="content-middle"]//ul/li/div//div[@class="content-inner"]'):
    image = video.xpath('div/div/div/div[@class="field-item odd"]/a/img')[0].get('src')
    title = video.xpath("div[2]/h2/a")[0].text
    subtitle = video.xpath('div[2]/div[@class="info"]/span')[0].text
    link = ROOT_URL + video.xpath("div[2]/h2/a")[0].get('href')
#    Log(HTTP.Request(link).content)

    inhoud = HTML.ElementFromURL(link) ### <-- Open the webpage containing the info we want,...
    summary = inhoud.xpath('//div[@class="text"]/p')[0].text_content() ### <-- ...extract the summary text...
    rtmpClip = inhoud.xpath('//div[@class="embed"]/script[@type="text/javascript"]')[0].text ### <-- ...and lets grab the video info too while we are here
    rtmpClip = rtmpClip.split('.flv"')[0]
    rtmpClip = rtmpClip.split('file: "')[1]
#    Log(rtmpClip)