07-15-2012, 06:25 AM   #22
kiwidude
calibre/Sigil Developer
OK, here is an XPath "challenge" for someone (I need to go do some other things, so if someone solves it for me in the meantime I shall be happy!). Let's say you have this HTML:
Code:
          <strong>Alex Cross</strong>
          <br>
          1.
          <a href="/p/james-patterson/along-came-spider.htm">Along Came a Spider</a>
          <span class="year">
            (
            <a href="/years/1992.htm">1992</a>
            )
          </span>
          <br>
          2.
          <a href="/p/james-patterson/kiss-girls.htm">Kiss the Girls</a>
          <span class="year">
            (
            <a href="/years/1994.htm">1994</a>
            )
          </span>
          <br>
Now let's say that you are iterating through each book on that page, using the <a> tag for the title above as your "root". Then you can extract the following with XPath:

title: text()
pubdate: following-sibling::span[@class="year"]/a/text()
series name: ../strong/text()
series #: ???

For series number I thought I could do something like:
preceding-sibling::text()

but that doesn't give me any results. Any other suggestions?
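
For anyone wanting to experiment, here is a rough sketch of the setup above using lxml in Python. That choice is an assumption, since the XPath engine in use here may behave differently. The wrapping <div>, the SAMPLE_HTML constant and the first() and extract_books() helpers are all made up for illustration. The sketch adds a [1] predicate to the sibling expressions so that each book only picks up its own nearest neighbour: without it, the pubdate expression would also match the year spans of later books. On a reverse axis such as preceding-sibling, [1] means the closest node before the <a>, which should be the text holding the series number.

```python
from lxml import html  # third-party package: pip install lxml

# Sample markup from the post, wrapped in a <div> (an assumption) so that
# "../strong" has a parent element to climb to.
SAMPLE_HTML = """<div>
<strong>Alex Cross</strong>
<br>
1.
<a href="/p/james-patterson/along-came-spider.htm">Along Came a Spider</a>
<span class="year">(<a href="/years/1992.htm">1992</a>)</span>
<br>
2.
<a href="/p/james-patterson/kiss-girls.htm">Kiss the Girls</a>
<span class="year">(<a href="/years/1994.htm">1994</a>)</span>
<br>
</div>"""


def first(nodes):
    """Return the first XPath result with whitespace stripped, or None."""
    return nodes[0].strip() if nodes else None


def extract_books(markup):
    """Collect title, pubdate, series name and series number per book."""
    root = html.fromstring(markup)
    books = []
    # Title links are siblings of <strong>; the year links sit inside <span>,
    # so they are not matched here.
    for a in root.xpath("//strong/following-sibling::a"):
        # Nearest text node before the <a>, e.g. "\n1.\n".
        number = first(a.xpath("preceding-sibling::text()[1]"))
        books.append({
            "title": first(a.xpath("text()")),
            "pubdate": first(
                a.xpath('following-sibling::span[@class="year"][1]/a/text()')
            ),
            "series": first(a.xpath("../strong/text()")),
            # Drop the trailing dot from "1." to leave just the number.
            "series_index": number.rstrip(".") if number else None,
        })
    return books


books = extract_books(SAMPLE_HTML)
for book in books:
    print(book)
```

If preceding-sibling::text() still comes back empty in a different engine, it may be worth checking whether that engine exposes the text after <br> as a sibling text node at all; that part is a guess rather than a diagnosis.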

Last edited by kiwidude; 07-15-2012 at 06:30 AM.