Cache interaction question

jwray (Posts: 1,268; Members, Plex Pass)
Hi,

I'm scraping some icon JPEGs for the freecaster plugin I'm working on. The problem is that the icons are on the destination pages, so I have to scrape 10 different pages to extract the 10 icons. This works, but it is slow. I know these icons are cached after the first time they are used, so is there a way for me to detect that in my code and avoid scraping the child pages every time that section is accessed?

My current code is:

# Excerpt from the body of the menu function (hence the trailing return).
dir = MediaContainer(title2=breadcrumb)
# One menu entry per channel link in the site's navigation menu.
for item in XML.ElementFromURL(FREECASTER_URL, True, encoding="iso-8859-1").xpath('//div[@id="navigation_menu"]/ul/li//a'):
    title = item.get('title').replace('channel', '')
    url = item.get('href')
    if url != '/live':
        thumb = None
        # Slow part: fetch the child page just to read its image_src link.
        thumbElement = XML.ElementFromURL(FREECASTER_URL + url, True, encoding="iso-8859-1").xpath('//link[@rel="image_src"]')
        if thumbElement:
            thumb = thumbElement[0].get('href')

        dir.Append(Function(DirectoryItem(SortedChannel, title=title, summary=None, thumb=thumb), breadcrumb=title, url=url))
return dir

What I'm trying to avoid is repeating the XML.ElementFromURL(FREECASTER_URL + url, True, encoding="iso-8859-1").xpath('//link[@rel="image_src"]') call once the images have been cached.

Maybe creating the DirectoryItem without the thumb, checking to see if the thumb is set and scraping if not? Not sure of the code needed to do that.

thanks
Jonny

Comments

  • jam, Plex Dev Team (Posts: 4,303; Members, Plex Employee, Plex Pass, Plex Ninja)
    The easiest way is to use the cacheTime argument in XML.ElementFromURL(). Just set it to something really high & the page will be fetched from the HTTP cache instead of being downloaded every time.

    One thing you can do is implement the UpdateCache() method in your plug-in. This gets called periodically by the framework (and immediately after your plug-in starts). You can use it to go out & fetch information that would usually take a long time to gather, then store it somewhere. You can cache pages simply by calling HTTP.Request() for each page in UpdateCache(), or you can do something a little more complicated and split the logic up a bit: use UpdateCache() to fetch the thumb URLs & store them in the plug-in's dictionary, then read the URLs back from the dictionary when responding to requests. It takes a bit more effort to implement, but once it's done it makes browsing within plug-ins pretty much instantaneous.
  • jwray (Posts: 1,268; Members, Plex Pass)
    Thanks Jam, that helps a lot. I'm going back and forth on actually using the thumbs from the site (they are not really a good size), but from your explanation I can already see other places where caching will help.
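
The split-logic approach jam describes might look roughly like the sketch below. It is not framework code: update_cache, get_thumb, scrape_thumb and the plain cache dict are hypothetical stand-ins for UpdateCache(), the menu handler, the XML.ElementFromURL(...).xpath(...) lookup and the plug-in's dictionary, chosen so the pattern can be run outside Plex. The get_thumb helper also covers the "scrape only if the thumb is not set yet" idea from the original question.

```python
def update_cache(urls, scrape_thumb, cache):
    """Prefetch step: scrape every child page once and store its thumb URL.

    In a plug-in this work would live in UpdateCache(); here `cache` is a
    plain dict standing in for the plug-in's dictionary (an assumption).
    """
    for url in urls:
        cache[url] = scrape_thumb(url)


def get_thumb(url, scrape_thumb, cache):
    """Request step: return the stored thumb, scraping only on a cache miss.

    A stored value of None (page has no thumb) also counts as cached, so a
    page without an icon is not scraped again on every request.
    """
    if url not in cache:
        cache[url] = scrape_thumb(url)
    return cache[url]


if __name__ == "__main__":
    scraped = []

    def demo_scrape(url):
        # Hypothetical scraper: records each call and returns a made-up URL.
        scraped.append(url)
        return "http://example.invalid/thumbs" + url + ".jpg"

    demo_cache = {}
    update_cache(["/bmx", "/surf"], demo_scrape, demo_cache)
    get_thumb("/bmx", demo_scrape, demo_cache)   # served from the cache
    get_thumb("/snow", demo_scrape, demo_cache)  # cache miss, scraped once
    print(scraped)
```

In the menu code, the per-item XML.ElementFromURL(FREECASTER_URL + url, ...) call would then be replaced by a lookup like get_thumb(url, ...), so the child pages are only fetched when their thumb is not yet known.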