I hope someone can help me… I’m confused about why this isn’t working. I am trying to use XPath and then loop through the matched elements to scrape info.
This works, but it leaves me at a lower-level node than I want:

    content = XML.ElementFromURL(pageUrl, True)
    for item in content.xpath('//div[@class="column span-16 item"]/h1'):
        title = item.text

This doesn't work but I think it should:

    content = XML.ElementFromURL(pageUrl, True)
    for item in content.xpath('//div[@class="column span-16 item"]'):
        title = item.xpath('./h1').text

It blows up on the ".text" for some reason. Here is the page I'm trying to parse:
[http://tekpub.com/productions?tag=free](http://tekpub.com/productions?tag=free)
For that example, the broken code should actually be finding a title of "Concepts" for the first match. There's more I need to scrape, but I get the same problem for the rest as well. Any ideas why that wouldn't work?
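
For reference, here is a minimal sketch that reproduces the same kind of failure outside Plex. It is not the plugin code: it uses the standard library's `xml.etree.ElementTree` as a stand-in, with `findall` playing the role of `xpath`, and the `SAMPLE` markup (including the second title) is made up for illustration. As far as I can tell, lxml's `xpath()` also returns a list for node queries, which would explain the error on `.text`, but that Plex's `XML.ElementFromURL` returns an lxml element is an assumption on my part.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in markup; the real page has much more around each item.
SAMPLE = """
<html><body>
  <div class="column span-16 item"><h1>Concepts</h1></div>
  <div class="column span-16 item"><h1>Second Title</h1></div>
</body></html>
""".strip()

root = ET.fromstring(SAMPLE)
items = root.findall('.//div[@class="column span-16 item"]')

# A node query returns a list, even when exactly one node matches.
first_match = items[0].findall('./h1')
print(type(first_match).__name__)

# Calling .text on that list fails, because lists have no such attribute.
try:
    first_match.text
    error = None
except AttributeError as exc:
    error = str(exc)
print(error)

# Indexing into the list first gives the element, and .text then works.
titles = [item.findall('./h1')[0].text for item in items]
print(titles)
```

If the same holds for the Plex objects, something like `item.xpath('./h1')[0].text` might be what the loop needs, though I have not confirmed that against the real page.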
You can see the [whole thing here](http://github.com/jcoffman/PlexTekPub) if you want/need.
Thanks in advance!