RegEx

Okay, so I have exposed the RTMPE MP4s for tvland.com, but I need a regex to get it to work. The website I am trying to scrape is



http://www.tvland.com/prime/fullepisodes/everybody_loves_raymond/



and my regex code looks like this



def eplist(sender, pageUrl):
    dir = MediaContainer(title2=sender.itemTitle)
    content = XML.ElementFromURL(pageUrl, True)
    page = HTTP.Request(pageUrl)
    for item in page:
      titleUrl=re.compile('index.jhtml?episodeId="(.+?)",').findall(item)
      Log(titleUrl)
      thumb=re.compile('thumbNail: "(.+?)",').findall(item)
      title=re.compile('title: "(.+?)",').findall(item)
      dir.Append(WebVideoItem(titleUrl, title=title, thumb=thumb))
    return dir



Any regex wizards out there want to give me a hint as to why none of my regex variables have any values? Once this is done Plex will have great syndication episodes.

Hey! 2 things to start with:

1. You can delete the line with content = XML.ElementFromURL(pageUrl, True), as it’s not used anywhere else.

2. The for-loop can’t work as written: looping over page does not hand you one episode entry per item, so the regexes never get a whole episode to match against.

Needs to be tested, but I think this is what you’re looking for:



import re
...
...
...
def eplist(sender, pageUrl):
    dir = MediaContainer(title2=sender.itemTitle)
    page = HTTP.Request(pageUrl)

    # Split the page into one chunk per episode entry, then search inside each chunk.
    episodes = re.compile(r'\{url:.+?\}', re.DOTALL).findall(page)
    for ep in episodes:
        episodeId = re.search(r'index\.jhtml\?episodeId=([0-9]+)",', ep).group(1) # returns the episode id, for example: '26141'
        thumb = re.search(r'thumbNail: "(.+?)",', ep).group(1) # returns: '/images/etcetc.jpg'
        title = re.search(r'title: "(.+?)",', ep).group(1) # returns the title, for example: 'The Break Up Tape'
        dir.Append(WebVideoItem(episodeId, title=title, thumb=thumb))

    return dir




The episode URL and thumb still need some sort of base URL prepended, as the values returned from the regexes do not include a domain/path.


base = re.search(r'meta name="URL" content="(.+?)"', page).group(1) # returns: 'http://www.tvland.com/prime/fullepisodes/everybody_loves_raymond/'
...
...
...
dir.Append(WebVideoItem(base + 'index.jhtml?episodeId=' + episodeId, title=title, thumb='http://www.tvland.com' + thumb))
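
To try the patterns outside Plex, here is a minimal sketch using only the standard re module. The sample page text, the parse_episodes helper, and the dictionary keys are made up for illustration; the real markup on tvland.com may differ, so treat this as a way to test the regexes rather than as finished plugin code.

```python
import re

# Hypothetical page excerpt (an assumption); the real tvland.com markup may differ.
page = '''
<meta name="URL" content="http://www.tvland.com/prime/fullepisodes/everybody_loves_raymond/">
{url: "/prime/fullepisodes/everybody_loves_raymond/index.jhtml?episodeId=26141",
 thumbNail: "/images/etcetc.jpg",
 title: "The Break Up Tape",
}
'''

ID_RE = r'index\.jhtml\?episodeId=([0-9]+)",'
THUMB_RE = r'thumbNail: "(.+?)",'
TITLE_RE = r'title: "(.+?)",'


def parse_episodes(page):
    """Return a list of dicts with absolute url, thumb and title (hypothetical helper)."""
    base_match = re.search(r'meta name="URL" content="(.+?)"', page)
    if base_match is None:
        return []
    base = base_match.group(1)

    episodes = []
    # One chunk per episode entry, from '{url:' up to the first closing brace.
    for ep in re.findall(r'\{url:.+?\}', page, re.DOTALL):
        fields = [re.search(pattern, ep) for pattern in (ID_RE, THUMB_RE, TITLE_RE)]
        if not all(fields):
            continue  # skip entries that do not have all three values
        episode_id, thumb, title = (m.group(1) for m in fields)
        episodes.append({
            'url': base + 'index.jhtml?episodeId=' + episode_id,
            'thumb': 'http://www.tvland.com' + thumb,
            'title': title,
        })
    return episodes


for episode in parse_episodes(page):
    print(episode['title'], episode['url'], episode['thumb'])
```

Checking each match before calling group(1) means an entry with a missing field is skipped instead of raising an AttributeError.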



I'm no wizard, but this is what I noticed:
You need to escape the "literal ?" as "\?" AND you have an unneeded quotation mark.
Like this:


titleUrl=re.compile('index.jhtml\?episodeId=(.+?)",').findall(item)
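
To see the difference, here is a quick check against a made-up line shaped like the page data (the sample string is an assumption, not copied from the site):

```python
import re

# Hypothetical fragment shaped like one line of the page's episode data.
sample = 'url: "/prime/fullepisodes/everybody_loves_raymond/index.jhtml?episodeId=26141",'

# Original pattern: "l?" means "an optional l", and the quote after "=" is not in the data.
broken = re.compile('index.jhtml?episodeId="(.+?)",').findall(sample)

# Corrected pattern: literal "?" escaped, stray quotation mark removed.
fixed = re.compile(r'index\.jhtml\?episodeId=(.+?)",').findall(sample)

print(broken)  # []
print(fixed)   # ['26141']
```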



I would just try getting the "url" the same way you get the "title" and the "thumb" though.

      titleUrl=re.compile('url: "(.+?)",').findall(item)
      Log(titleUrl)
      thumb=re.compile('thumbNail: "(.+?)",').findall(item)
      title=re.compile('title: "(.+?)",').findall(item)




Then you can use string manipulation if you only need the vid#.
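
For example, something along these lines would pull the number out of the URL without another regex. The helper name and the sample URLs are made up for illustration, and it assumes the id always follows "episodeId=":

```python
def episode_id_from_url(url):
    """Return the text after 'episodeId=' (up to any following '&'), or None."""
    _, found, tail = url.partition('episodeId=')
    if not found:
        return None  # the marker is not in this URL
    return tail.split('&', 1)[0]


print(episode_id_from_url('/prime/fullepisodes/everybody_loves_raymond/index.jhtml?episodeId=26141'))  # 26141
print(episode_id_from_url('/prime/index.jhtml'))  # None
```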

If you are still having issues with the regex, I have a TVLand plugin that currently works as a beta version. I began working on it back in March but have not had any time recently to continue. So far the plugin can play the full episodes; I haven’t worked on getting any of the clips to play yet. When I get home this afternoon I will upload the progress I have made, and maybe you can incorporate it into yours and continue development. I just don’t have the time.
