OK, here is an XPath "challenge" for someone (I need to go do some other things, so if someone solves it for me in the meantime I shall be happy!)... Let's say you have this HTML:
Code:
<strong>Alex Cross</strong>
<br>
1.
<a href="/p/james-patterson/along-came-spider.htm">Along Came a Spider</a>
<span class="year">
(
<a href="/years/1992.htm">1992</a>
)
</span>
<br>
2.
<a href="/p/james-patterson/kiss-girls.htm">Kiss the Girls</a>
<span class="year">
(
<a href="/years/1994.htm">1994</a>
)
</span>
<br>
Now let's say that you are iterating through each book on that page, using the <a> tag for each title above as your "root" (the context node). Then you can extract the following with XPath:
title:
text()
pubdate:
following-sibling::span[@class="year"]/a/text()
series name:
../strong/text()
series #:
???
For the series number, I thought I could do something like:
preceding-sibling::text()
but that doesn't give me any results. Any other suggestions?
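For anyone who wants to poke at this, here is a minimal sketch of where the "1." and "2." sit in the tree. It is not an XPath answer: it uses Python's standard-library ElementTree, which only supports a small XPath subset, so it walks the siblings by hand instead. The assumptions are mine: the snippet is wrapped in a <div> and each <br> is written as <br/> so an XML parser accepts it, and `extract_books` is a made-up helper name. In this tree model, the text sitting just before each title <a> is stored as the `tail` of the preceding <br/>. In a full XPath 1.0 engine, the nearest preceding text node would be `preceding-sibling::text()[1]`, though whether that behaves in any particular tool is something to check.

```python
import xml.etree.ElementTree as ET

# The snippet from the post. Assumptions: a wrapper <div> is added and each
# <br> is written as <br/> so the stdlib XML parser can read it.
SNIPPET = """<div>
<strong>Alex Cross</strong>
<br/>
1.
<a href="/p/james-patterson/along-came-spider.htm">Along Came a Spider</a>
<span class="year">
(
<a href="/years/1992.htm">1992</a>
)
</span>
<br/>
2.
<a href="/p/james-patterson/kiss-girls.htm">Kiss the Girls</a>
<span class="year">
(
<a href="/years/1994.htm">1994</a>
)
</span>
<br/>
</div>"""


def extract_books(xml_text):
    """Hypothetical helper: return one dict per top-level title <a>."""
    parent = ET.fromstring(xml_text)
    children = list(parent)
    series_name = parent.findtext("strong")
    books = []
    for i, node in enumerate(children):
        if node.tag != "a":
            continue
        # Text between the previous sibling and this <a> is kept as the
        # previous sibling's .tail (or the parent's .text for a first child).
        prev_text = children[i - 1].tail if i > 0 else parent.text
        number = (prev_text or "").strip().rstrip(".")
        # Publication year: the next <span class="year"> before the next title.
        year = None
        for sibling in children[i + 1:]:
            if sibling.tag == "a":
                break
            if sibling.tag == "span" and sibling.get("class") == "year":
                year = sibling.findtext("a")
                break
        books.append({
            "title": node.text,
            "pubdate": year,
            "series": series_name,
            "series_index": number,
        })
    return books


if __name__ == "__main__":
    for book in extract_books(SNIPPET):
        print(book)
```

The point of the sketch is only that the number is a bare text node between the <br> and the <a>, so whatever expression is used has to select a text node on the preceding-sibling side and then trim the whitespace and the trailing dot.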