Need help with XPaths

I'm working on my second plugin, and it will be the first one to use XPath. Can someone please help me?



Here is an example page



http://www.nfl.com/videos/new-orleans-saints



If I wanted to get the titles (Bush’s future in New Orleans; No joke: Bush’s tweets offend fans; Saints’ draft evaluation) from that page, what XPath would I use?

Same question for the corresponding URL, subtitle, and image.



If you could also explain how you figured out the XPath, that would be greatly appreciated. XPath is the one thing I'm really stuck on. I tried the Firefox XPath plugin, but it doesn’t really help much.

Hello John!



You can loop over:


//div[@id="videos-list"]/ul[@class="list-items"]/li


Then, relative to each list item, use **.//h3** for the title, **.//h3/a** for the link, **.//p[@class="date"]** for the date, **.//p[last()]** for the summary and **.//a/img** for the thumbnail:


    # Loop over every video <li>; the inner XPaths start with "." so they are relative to that item
    for video in HTML.ElementFromURL("http://www.....", errors='ignore').xpath('//div[@id="videos-list"]/ul[@class="list-items"]/li'):
        # .text is None when the title sits inside the <a>; read it from './/h3/a' in that case
        title = video.xpath('.//h3')[0].text
        url = video.xpath('.//h3/a')[0].get('href')
        date_text = video.xpath('.//p[@class="date"]')[0].text
        summary = video.xpath('.//p[last()]')[0].text
        thumb_url = video.xpath('.//a/img')[0].get('src')  # was './//a/img', which is not valid XPath

        ...
        ...
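
If you want to try the loop without the plugin framework, here is a rough, self-contained sketch. It makes several assumptions: the markup in `SAMPLE` is a made-up, well-formed stand-in (the real page will differ), `parse_videos` is a hypothetical helper name, and it uses Python's standard-library `xml.etree.ElementTree` instead of `HTML.ElementFromURL`. ElementTree only supports a subset of XPath and wants a leading `.` in front of `//`, but the relative expressions are the same ones as above.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed stand-in for the video list markup.
SAMPLE = """
<html><body>
  <div id="videos-list">
    <ul class="list-items">
      <li>
        <a href="/videos/1"><img src="/thumbs/1.jpg" /></a>
        <h3><a href="/videos/1">First video</a></h3>
        <p class="date">04/27/2010</p>
        <p>Summary of the first video.</p>
      </li>
      <li>
        <a href="/videos/2"><img src="/thumbs/2.jpg" /></a>
        <h3><a href="/videos/2">Second video</a></h3>
        <p class="date">04/26/2010</p>
        <p>Summary of the second video.</p>
      </li>
    </ul>
  </div>
</body></html>
"""

def parse_videos(markup):
    root = ET.fromstring(markup)
    videos = []
    # One pass per <li>, so every video is collected, not just the first.
    for item in root.findall('.//div[@id="videos-list"]/ul[@class="list-items"]/li'):
        link = item.find('.//h3/a')  # the title text lives inside the <a>
        videos.append({
            'title': link.text,
            'url': link.get('href'),
            'date': item.find('.//p[@class="date"]').text,
            'summary': item.find('.//p[last()]').text,
            'thumb': item.find('.//a/img').get('src'),
        })
    return videos

videos = parse_videos(SAMPLE)
for v in videos:
    print(v['title'], v['url'])
```

The key point is that the outer expression selects the repeating element, and everything inside the loop is looked up relative to it.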




To figure out which XPath you need, view the source of the web page, find the element you want, and then follow the DOM tree upwards, noting each parent element and any id or class attribute that identifies it.
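
The "follow the DOM tree upwards" step can be sketched in code as well. This is only an illustration under assumptions: `SAMPLE` is invented markup, `describe` and `path_to` are hypothetical helper names, and it uses the standard-library `xml.etree.ElementTree`, which has no parent pointers, so a child-to-parent map is built first.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup; only the nesting matters here.
SAMPLE = """
<html><body>
  <div id="header"><h3>Site title</h3></div>
  <div id="videos-list">
    <ul class="list-items">
      <li><h3><a href="/videos/1">First video</a></h3></li>
    </ul>
  </div>
</body></html>
"""

def describe(elem):
    """Render one path step, using id or class as a predicate if present."""
    for attr in ('id', 'class'):
        if elem.get(attr):
            return '%s[@%s="%s"]' % (elem.tag, attr, elem.get(attr))
    return elem.tag

def path_to(root, target):
    # Map every child to its parent so we can climb from the target.
    parents = {child: parent for parent in root.iter() for child in parent}
    steps = []
    node = target
    while node is not None:
        steps.append(describe(node))
        node = parents.get(node)  # None once we reach the root
    return '/' + '/'.join(reversed(steps))

root = ET.fromstring(SAMPLE)
# "Find the element you need": here, the link whose text is the title.
link = next(a for a in root.iter('a') if a.text == 'First video')
print(path_to(root, link))
```

In practice you rarely need the whole path. You can usually start at the nearest ancestor with a distinctive id, which is how you end up with something like `//div[@id="videos-list"]/ul[@class="list-items"]/li`.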


Thanks, sander1. This didn't quite work for me as written, but with a bit of tinkering I have it working. Many thanks for pointing me in the right direction. My biggest problem before I posted yesterday was getting it to loop and bring in each video, rather than just the first result.
