How to pass an HTML/XML fragment

NonCerealizableObjectError
I want to pass a fragment of an HTML page, which I created with XML.ElementFromURL and then ran an xpath on (so it's an lxml.html.HtmlElement), as an argument to a callback function wrapped in a DirectoryItem object, and I get a NonCerealizableObjectError.


def MainMenu():
    dir = MediaContainer(viewGroup="InfoList")

    htmlPage = XML.ElementFromURL(URL, isHTML=True)
    menuItems = htmlPage.xpath('//ul[@id="menu"]/li')

    for menuItem in menuItems:
        category = getCategoryDetails(menuItem)
        dir.Append(Function(category, item=menuItem, subPageURL=subPageURL))
    return dir



There is obviously no problem passing menuItem (the lxml.html.HtmlElement) to getCategoryDetails, a function I wrote and which returns a DirectoryItem object, but the callback function wrapped into this DirectoryItem does not like the menuItem as a parameter:

NonCerealizableObjectError: Object of class/type '<class 'lxml.html.HtmlElement'>' cannot be cerealized! Use cerealizer.register to extend Cerealizer support to other classes.



I am fairly new to Python and the Plex plugin framework (but not new to object-oriented programming) and need some guidance on how I could pass an lxml.html.HtmlElement object and/or how to 'cerealize' it.
TIA and Cheers,
alex
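
The error message suggests that the framework serializes callback arguments and only accepts types it knows how to serialize. A minimal sketch of that idea follows, assuming Python's standard json and xml.etree.ElementTree modules as stand-ins (this is not Cerealizer or the Plex framework itself, and can_serialize is a hypothetical helper): a parsed element object is rejected, while a plain string passes.

```python
import json
import xml.etree.ElementTree as ET

# Stand-in for the parsed menu item. In the plugin this would be an
# lxml.html.HtmlElement; here the stdlib Element plays that role.
menu_item = ET.fromstring('<li><a href="/news">News</a></li>')


def can_serialize(value):
    """Hypothetical helper: True if json (the stand-in serializer) accepts the value."""
    try:
        json.dumps(value)
        return True
    except TypeError:
        return False


element_ok = can_serialize(menu_item)  # a parsed element object
string_ok = can_serialize("/news")     # a plain string
```

Under these assumptions only the string survives, which matches the observation that title, subtitle and URL strings can be passed to a callback while the element itself cannot.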

dir.Append(Function(category, item=menuItem, )subPageURL=subPageURL))



You need to move subPageUrl outside of Function(), as it does not have any subPageUrl item. This will allow the next function call to be:



def category(sender, subPageUrl):



Harley,
Thanks, I think you are pointing me in the right direction. I am travelling for business with my Windoze machine right now, do not have my MBP with the code with me, and cannot try out your suggestions.
'category' is an object (a DirectoryItem one) returned by the getCategoryDetail function, something like:

def getCategoryDetail(menuItem):
    category = DirectoryItem(SubMenu, None)
    category.title = menuItem.xpath('some-xpath-string')
    category.subtitle = menuItem.xpath('some-other-xpath-string')
    ...
    return category



So I think the dir.Append should more look like:

dir.Append(Function(category), item=menuItem, subPageURL=subPageURL))



and the DirectoryItem callback (SubMenu in my case) more like

def SubMenu(sender, item, subPageURL):



do you agree?
I understand that my main mistake was that I forgot to set the correct bracketing around the DirectoryItem, but if subPageURL belongs outside the brackets, so does menuItem, correct?
TIA and Cheers,
alex



Hello Alex!

I’m not exactly sure what you are trying to do, but I think you need to rethink the structure of building the menus. Unless you need to work with variable function names, this is roughly how I think your code should look:



URL = 'http://www.example.com'
...
...

def MainMenu():
    dir = MediaContainer(viewGroup="InfoList")

    htmlPage = XML.ElementFromURL(URL, isHTML=True)
    menuItems = htmlPage.xpath('//ul[@id="menu"]/li')

    for menuItem in menuItems:
        title = menuItem.xpath('./a')[0].text
        subtitle = menuItem.xpath('./p')[0].text
        subpage_url = menuItem.xpath('./a')[0].get('href')

        dir.Append(Function(DirectoryItem(SubMenu, title=title, subtitle=subtitle), url=subpage_url))

    return dir


def SubMenu(sender, url):
    dir = MediaContainer(viewGroup="InfoList", title2=sender.itemTitle)

    anotherHtmlPage = XML.ElementFromURL(url, isHTML=True)
    ...
    ...



Thanks sander1,
Regardless of me rethinking the structure of building my menus (the reason I am picking out the title and subtitle in a separate function is to reuse it from other menus, and I do not think the way I am doing *this part* is the problem ...), you seem to have missed that my main issue is how to pass the lxml.html.HtmlElement menuItem (hence the name of this topic).
I still have not had a chance to test it (still travelling ...) but I am pretty positive that Harley's tip (pointing out that I have my parentheses set incorrectly ...) will resolve that for me.
Thanks again and Cheers,
alex

Check out my abc.com half plugin. It does what you are asking about, I think. Videos don’t work in this plugin, fair warning. It’s the app’s issue, not yours.

Harley,
I looked at your example, but although it has a lot of xpath'ed HTML elements, you never pass any of those, specifically not to a PMS callback function.

Anyways, I took all the unrelated crap out of my code to focus on the passing-an-xml-fragment issue and it looks like this now:
URL = "http://www.something.com"
def MainMenu():
    dir = MediaContainer(viewGroup="InfoList")

    htmlPage = XML.ElementFromURL(URL, isHTML=True)
    menuItems = htmlPage.xpath('//ul[@id="menu"]/li')

    for menuItem in menuItems:
        title = menuItem.xpath('./a/span')[0].text
        subtitle = menuItem.xpath('./a')[0].get("title")
        subPageURL = URL + menuItem.xpath('./a')[0].get("href")
        dir.Append(Function(DirectoryItem(SubMenu, title=title, subtitle=subtitle), item=menuItem, subPageURL=subPageURL))

    return dir

def SubMenu(sender, item, subPageURL):
    ...



and I still get the NonCerealizableObjectError:

NonCerealizableObjectError: Object of class/type '<class 'lxml.html.HtmlElement'>' cannot be cerealized! Use cerealizer.register to extend Cerealizer support to other classes.



I have an [ugly] workaround by globalizing the parameter[s] that I would need to pass, but I'd rather pass them directly.
Any help appreciated.
TIA and Cheers,
alex

Is there a specific site you are looking at, that you could share?

I know it helps me wrap my head around it, when I can see the whole picture.







Can you store the (title, subtitle, & url) in a tuple and pass the tuple?


I am obviously completely unable to explain myself ...
It is _not_ about passing title, subtitle and url, but about the XML fragment I created from the web page with xpath. Unlike title, subtitle and url, this element (it is called menuItem in my example above) is not a string, but an instance of an lxml.html.HtmlElement object (which I assume is the issue).
The 'menuItem' HTML fragment contains HTML elements with title, description and potentially submenu info, and eventually video item/stream URLs and a description of the video file. Depending on the menu level the plugin is on, I need certain data. I do not want to scrape all video data on a higher level already (as it's possibly never used or needed) and I do not want to go through the for [menuItem in menuItems] loop on every level.

I can not only share the url of the web site, but the whole plugin. It's in the 'Unsupported Plugins' section on the Wiki and you can get it from [here](http://wiki.plexapp.com/images/b/b1/ORFtvthek.bundle.0.6.zip).
[It's the on demand web site of the Austrian TV, so everything on the web site is in German ...]

There must be easier ways to code this than I did. I am new to Python and maybe things need to be done completely differently than I think they should ...
Any help highly appreciated.

I understand what you are trying to do, it’s just that I think that the “default” Plex functions don’t offer this out of the box. But… you can try it like this:



...
...
from lxml import etree


    ...
    ...
    for menuItem in menuItems:
        Log( etree.tostring(menuItem) )
        ...
        ...



Aah, that looks promising - is there also a function to de-serialize it on the other side though?
Or, thinking about it, maybe that's not even necessary: I can run an xpath on a string, can't I?




Yup,

something = XML.ElementFromString(input, isHTML=True).xpath('//a[@class="example"]')[0].text


should do it :rolleyes:
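
A minimal sketch of the round trip discussed above: serialize the element to a string before passing it, then parse it again inside the receiving function. It assumes Python's standard xml.etree.ElementTree as a stand-in for the framework's XML.StringFromElement and XML.ElementFromString, and the markup and the names serialize and sub_menu are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical page markup standing in for the result of XML.ElementFromURL.
page = ET.fromstring(
    '<ul id="menu">'
    '<li><a class="example" href="/news">News</a></li>'
    '<li><a class="example" href="/sport">Sport</a></li>'
    '</ul>'
)


def serialize(element):
    """Turn a parsed element into a plain string that is safe to pass around."""
    return ET.tostring(element, encoding="unicode")


def sub_menu(item):
    """Receiving side: rebuild the element from the string and query it again."""
    element = ET.fromstring(item)
    link = element.find("./a[@class='example']")
    return link.text, link.get("href")


first_item = page.findall("./li")[0]
payload = serialize(first_item)    # a str, not an element object
title, href = sub_menu(payload)
```

Only the string crosses the function boundary, so the receiving side can still pick out whatever data its menu level needs without the fragment being scraped up front.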

sander1,
That sounded like a good idea at first - unfortunately this seems to kill the PMS with 'Segmentation fault'.
When I restarted the PMS and tried again (running the plugin with above statement, that is ...), I get the same error followed by ': 'NoneType' object is not callable' which makes it seem to me that something (most likely the lxml.etree package ...) is missing. So I am assuming that I will have to install it - I just wanted to be careful to not override stuff, that is used by PMS; PMS obviously uses lxml.html.HtmlElement.
Again, I am very new to Python and have no idea how the internals work with regard to packages.
Any hints appreciated.
TIA and Cheers,
alex

I just noticed, that the XML class also has methods like ‘StringFromElement’ and this is obviously using the lxml.etree methods that you mention.

I will give that a whirl and report back …

I do not get this to work and need some help here.

I would be tremendously grateful, if somebody could run the attached plugin and tell me what I am doing wrong.

Start the ORF TVThek plugin and select any of the first four menu options (don’t worry about the German …). This will kill the PMS with ‘Segmentation fault’.



In the function CategoryMenu I am trying to serialize the HtmlElement categoryItem to a string with



strCategoryItem = XML.StringFromElement(categoryItem, encoding="utf8", method="html")


I tried Sander's etree.tostring idea but it has the same effect (actually I think that XML.StringFromElement is using lxml.etree.tostring anyways ...). I also tried StringFromElement without the encoding and method parameter and I also tried method="xml".

In the function ShowsInCategory I try to make the serialized html fragment an HtmlElement again with

categoryItem = XML.ElementFromString(item)



The actual error (the PMS dies with a Segmentation fault) seems to appear when the callback function passing the serialized element (the str output from the XML.StringFromElement) is called. I do not have any other imports than the PMS ones (PMS, PMS.Objects, PMS.Shortcuts).
Any help appreciated.

I upgraded my bundle now using v2 of the PMS framework - same Segmentation fault.

I attached the changed plugin as well as the last plugin log.

Any help, hints and tips on what I am possibly doing wrong and/or what I could and should do differently (and better) are highly appreciated.

Thanks!

alex

Hey, I don’t know if you have a solution for your problem, but I ran into something similar. It seemed to me that XML.StringFromElement did not correctly serialize an HTML snippet I had (I think the element had several children at the top (several roots?), and StringFromElement only converted the top one and eventually got confused).



I ended up doing something like this: (pseudo code off the top of my head, don’t copy and paste …)



el = XML.ElementFromURL(blabla, True).xpath(....)[0]  # xpath returns a list, so take the first match

serialized = ''
for e in el.iterchildren():
    # serialize each child separately and concatenate the results
    serialized = serialized + XML.StringFromElement(e).strip()

Myfunc(....., SubHTML = str(serialized))
...

def Myfunc(SubHTML=None):
    # rebuild an element from the serialized string
    passedElement = XML.ElementFromString(SubHTML, True)
    ....
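
The "several roots" guess can be illustrated with a short sketch. It assumes Python's standard xml.etree.ElementTree as a stand-in, with hypothetical markup; lxml's HTML parser is more lenient, so this is not necessarily what happened inside the framework. A strict XML parser rejects a string with more than one top-level element, and wrapping the concatenated children in a single container element (the `<root>` tag here is an arbitrary choice) makes it parseable again.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment whose children are serialized one by one.
container = ET.fromstring('<div><h3>Title</h3><p>Description</p></div>')

# Serialize each child separately and concatenate, as in the pseudo code.
serialized = "".join(
    ET.tostring(child, encoding="unicode").strip() for child in container
)

# The concatenated string has two top-level elements, which a strict
# XML parser refuses to accept.
try:
    ET.fromstring(serialized)
    parsed_directly = True
except ET.ParseError:
    parsed_directly = False

# Wrapping the children in one container element restores a single root.
wrapped = ET.fromstring("<root>" + serialized + "</root>")
tags = [child.tag for child in wrapped]
```

Whether a wrapper is needed depends on the parser: the point is only that a serialized fragment should be checked for a single top-level element before it is parsed again on the receiving side.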


After picking this up again (after almost 3 years, I know ...) I ran into the same issue.

The good thing -  the HTML.StringFromElement and HTML.ElementFromString methods work just fine now.

Bottom line: you still cannot pass an lxml.html object, but if you serialize it to a string with the above methods (and then de-serialize it in the receiving function), you can pass a parsed HTML/XML document or fragment as a string.

Cheers,

alex
