07-15-2012, 09:22 AM   #24
kiwidude
calibre/Sigil Developer
Hi Charles,

Yeah, I had considered a fallback that auto-numbers the series index whenever a book has a series name. But that has a few problems. For example, when FF (Fantastic Fiction) lists a series written with other authors, the author's page only shows the books written by that author, so auto-numbering would always make the first book shown "number 1" when it isn't. So it really needs the associated number off the page.
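
To make the auto-numbering problem concrete, here is a minimal sketch. The series, the titles and the `auto_number` helper are all invented for illustration; none of this is taken from the plugin.

```python
def auto_number(titles_on_page):
    """Naive fallback: number the listed titles 1..n in page order."""
    return {title: float(i) for i, title in enumerate(titles_on_page, start=1)}


# Hypothetical series of three books; this author co-wrote only the third,
# so only the third appears on the author's page.
true_index = {"Book A": 1.0, "Book B": 2.0, "Book C": 3.0}
listed_on_author_page = ["Book C"]

guessed = auto_number(listed_on_author_page)

# The fallback guess disagrees with the real position in the series.
print("guessed:", guessed["Book C"], "actual:", true_index["Book C"])
```

The guess is wrong for any book whose earlier series entries are missing from the page, which is why the number has to be read from the page itself.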

Here is the URL being parsed in the example above:
http://www.fantasticfiction.co.uk/p/james-patterson/

The parent expression I am using to identify only the titles on the page that are of interest is:
//div[@class="sectionleft"]/a[contains(@href,".htm")]

You will see that, unfortunately, there is no true "parent" for each "row". There are just a number of div sections, one for each series or grouping of titles, with each title contained within an a href. That is why I am using the <a> tag as my row identifier and then grabbing data relative to it.
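
As a rough illustration of using the <a> tag as the row and reading data relative to it, here is a small sketch with Python's standard-library ElementTree. The HTML snippet is made up (the real page markup may well differ), and `find_rows` is a hypothetical helper. ElementTree's XPath subset has no contains(), so the ".htm" check on href is done in Python instead; with a full XPath engine the expression above could be used as-is.

```python
import xml.etree.ElementTree as ET

# Invented, well-formed stand-in for one section of an author page.
SAMPLE = """
<html><body>
  <div class="sectionleft">
    <b>Hypothetical Series</b><br/>
    <a href="/p/some-author/first-book.htm">First Book</a> (2001)<br/>
    <a href="/p/some-author/second-book.htm">Second Book</a> (2003)<br/>
    <a href="/series/hypothetical-series/">more</a>
  </div>
  <div class="other">
    <a href="/ignored.htm">Not in a sectionleft div</a>
  </div>
</body></html>
"""


def find_rows(root):
    """Return <a> children of div.sectionleft whose href contains '.htm'."""
    anchors = root.findall(".//div[@class='sectionleft']/a")
    # Stand-in for the XPath predicate contains(@href, ".htm").
    return [a for a in anchors if ".htm" in a.get("href", "")]


root = ET.fromstring(SAMPLE.strip())

# Each <a> is the "row"; other fields are read relative to it.
# Here the text that follows the link (its tail) stands in for such a field.
rows = [(a.text, a.get("href"), (a.tail or "").strip()) for a in find_rows(root)]

for title, href, trailing in rows:
    print(title, href, trailing)
```

The link to the series page is skipped because its href has no ".htm", and the link in the other div is skipped because its parent is not a sectionleft div.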

I've attached a new version 0.2 below - this adds the Pubdate implementation and fixes a couple of bugs.

Last edited by kiwidude; 07-15-2012 at 04:41 PM. Reason: Removing attachment as later version in this thread