xpath / CDATA

dominiqueD · November 4, 2012, 7:58pm

trouble with cdata

Hi all,

I'm trying to set up a simple plugin and I'm having trouble with CDATA.
The information comes from an XML file, and some of the summary text is not available.


	URL="http://www.europe1.fr/podcasts/revue-de-presque.xml"
	data=HTML.ElementFromURL(URL,encoding="utf-8")
	item=0
	for item in data.xpath('//item'):
		title	= item.xpath('title')[0].text
		pudDate	= item.xpath('pubdate')[0].text
		url = item.xpath('enclosure')[0].get('url')
		summary = str(item.xpath('summary')[0].text)
		print summary
	return dir

Sometimes item.xpath('summary')[0].text returns None instead of the content.
I'm not sure, but I think the problem is with UTF-8 (French characters such as éàè).

You can find the full code here: https://github.com/whoo/Cantelou.bundle.git

Do you have any idea how to get the full summary inside the CDATA?
Thanks ;)

mikedm139 · November 4, 2012, 8:15pm

It’s less likely that the CDATA is the problem, than the HTML tags inside the CDATA. Try using:


summary = item.xpath('./summary/text()')

It should return a list of strings, split wherever a tag appears in the XML. You can put the string back together again using Python's .join(), like so:


summary = ''.join(item.xpath('./summary/text()'))
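
To illustrate why .text can come back as None when tags sit inside an element, here is a minimal sketch. It assumes Python 3 and the standard library's ElementTree rather than the Plex framework's parser, and the sample element is made up. ElementTree's XPath subset has no text() step, so itertext() is swapped in; unlike text(), it also picks up text nested inside child tags.

```python
import xml.etree.ElementTree as ET

# Hypothetical element: the content starts with a child tag, so there is
# no text between <summary> and <b>.
summary = ET.fromstring('<summary><b>Revue</b> de presque</summary>')

# .text only holds the text before the first child element.
leading_text = summary.text

# itertext() yields every text fragment in document order; join them back up.
full_text = ''.join(summary.itertext())
```

Here leading_text is None while full_text holds the whole sentence, which is the same symptom and the same kind of fix as above.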

dominiqueD · November 4, 2012, 8:42pm

Thanks for your quick answer.

I’ve tried both solutions,

but item.xpath('./summary/text()') still returns nothing when there are special characters such as “éàè”.

sander1 · November 4, 2012, 9:47pm

Hi!

You could parse the XML file as XML and use ‘itunes’ namespace to get to the summary element. It’ll look something like this:


url = "http://www.europe1.fr/podcasts/revue-de-presque.xml"
data = XML.ElementFromURL(url) # Parse as xml

for item in data.xpath('//item'):
    title = item.xpath('./title')[0].text.strip()
    pubDate = item.xpath('./pubDate')[0].text
    url = item.xpath('./enclosure')[0].get('url')

    # Use the itunes namespace to grab the summary element
    summary = item.xpath('./itunes:summary', namespaces={'itunes':'http://www.itunes.com/dtds/podcast-1.0.dtd'})[0].text
    # Strip out HTML tags
    summary = String.StripTags(summary).strip()
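
If you want to try the namespace lookup outside of Plex, a rough equivalent can be sketched with the standard library. This assumes Python 3 and xml.etree.ElementTree instead of the framework's XML helper; SAMPLE_FEED is a made-up feed (not the real europe1.fr data), and strip_tags is a hypothetical regex stand-in for String.StripTags.

```python
import re
import xml.etree.ElementTree as ET

# Namespace URI as used in the thread; the prefix is only a local alias.
NS = {'itunes': 'http://www.itunes.com/dtds/podcast-1.0.dtd'}

# Hypothetical sample feed with accented characters inside CDATA.
SAMPLE_FEED = """<rss xmlns:itunes="http://www.itunes.com/dtds/podcast-1.0.dtd" version="2.0">
  <channel>
    <item>
      <title> Revue du jour </title>
      <pubDate>Sun, 04 Nov 2012 08:00:00 +0100</pubDate>
      <enclosure url="http://example.com/episode.mp3" type="audio/mpeg"/>
      <itunes:summary><![CDATA[<p>Résumé de l'émission</p>]]></itunes:summary>
    </item>
  </channel>
</rss>"""


def strip_tags(text):
    """Crude stand-in for String.StripTags: drop anything that looks like a tag."""
    return re.sub(r'<[^>]+>', '', text)


def extract_items(feed_xml):
    """Return one dict per <item>, reading the summary via the itunes namespace."""
    root = ET.fromstring(feed_xml)
    items = []
    for item in root.iter('item'):
        summary_el = item.find('itunes:summary', NS)
        # Fall back to an empty string if the element or its text is missing.
        raw = summary_el.text if summary_el is not None and summary_el.text else ''
        items.append({
            'title': item.findtext('title', default='').strip(),
            'pub_date': item.findtext('pubDate', default=''),
            'url': item.find('enclosure').get('url'),
            'summary': strip_tags(raw).strip(),
        })
    return items


episodes = extract_items(SAMPLE_FEED)
```

Parsed as XML, the CDATA section ends up as ordinary text on the element, so the accented characters come through and only the HTML tags need stripping.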

dominiqueD · November 4, 2012, 10:10pm

Thanks for your answer.

It’s working fine now.

I’ve changed:


	<<data=HTML.ElementFromURL(URL,encoding=None)
	>>data=XML.ElementFromURL(URL,encoding=None)

And I used the namespace to reach the special tags.


	for item in data.xpath('//item'):
		summary= item.xpath('t:summary',namespaces={'t':'http://www.itunes.com/dtds/podcast-1.0.dtd'})[0].text
		keyword= item.xpath('t:keywords',namespaces={'t':'http://www.itunes.com/dtds/podcast-1.0.dtd'})[0].text
		pubDate=item.xpath('pubDate')[0].text
		url= item.xpath('enclosure')[0].get('url')
		title=item.xpath('title')[0].text.strip()
		summary="[%s]\n\n%s \n%s\nkeywords:%s "%(title,summary.strip(),pubDate,keyword)
		title=title.strip()
		dir.Append(TrackItem(url,title,"info","Rubrique",summary=summary,art=R(ICON)))
	return dir

My code is available on GitHub:
https://github.com/whoo/Cantelou.bundle.git

mikedm139 · November 14, 2012, 10:53pm

If you're interested in making your channel available in the Plex Channel Directory, there are instructions [here](http://wiki.plexapp.com/index.php/App_Store_Submission) and you can file a ticket for review on the [lighthouse project.](https://plexapp.lighthouseapp.com/projects/31804-plex-plug-ins/overview)
