MobileRead Forums - View Single Post

kiwidude · 07-15-2012, 09:22 AM

Hi Charles,

Yeah, I had considered a fallback that auto-numbers the series index for books that have a series name. But that has a few problems - for example, when FF lists a book in a series that is co-written with other authors, it only shows the books written by the author whose page you are on, so the fallback would always make that book "number 1" when it isn't. So it really needs the associated number off the page.
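To make the problem concrete, here is a rough sketch of what a naive auto-number fallback would do. The function name, the titles and the series numbers are all made up for illustration; this is not code from the plugin.

```python
def auto_number(listed_titles):
    """Naive fallback: number titles 1..n in the order the page lists them."""
    return {title: i for i, title in enumerate(listed_titles, start=1)}


# Hypothetical three-book series shared between several authors.
true_index = {"Book A": 1, "Book B": 2, "Book C": 3}

# Assumption for this example: only the third book was written by the author
# whose page is being parsed, so it is the only one listed there.
listed_on_author_page = ["Book C"]

guessed = auto_number(listed_on_author_page)
print(guessed)                # {'Book C': 1}
print(true_index["Book C"])   # 3
```

The guess is 1 where the real position is 3, which is why the number has to be read off the page instead.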

Here is the URL being parsed in the example above:
http://www.fantasticfiction.co.uk/p/james-patterson/

The parent expression I am using to identify only the titles on the page that are of interest is:
//div[@class="sectionleft"]/a[contains(@href,".htm")]

You will see that unfortunately there is no true "parent" for each "row". There are just a number of div sections, one for each series or grouping of titles, with each title contained within an <a href> link. That is why I am using the <a> tag as my row identifier and then grabbing data relative to it.
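For anyone who wants to try the "<a> tag as row" idea outside the plugin, here is a minimal sketch using only the Python standard library. A few assumptions to flag: the HTML fragment is a made-up, well-formed stand-in and not the real Fantastic Fiction markup; the function names are hypothetical; and because xml.etree.ElementTree does not support the XPath contains() function, the ".htm" check is done in Python rather than inside the expression.

```python
import xml.etree.ElementTree as ET

# Made-up, well-formed stand-in for the page. The real markup may differ.
SAMPLE = """
<html><body>
  <div class="sectionleft">
    <b>Some Series</b><br/>
    <a href="/x/first-title.htm">First Title</a> (2001)<br/>
    <a href="/x/second-title.htm">Second Title</a> (2003)<br/>
    <a href="/x/more-about-author/">More</a>
  </div>
  <div class="sectionright">
    <a href="/x/ignored.htm">Ignored</a>
  </div>
</body></html>
"""


def find_title_rows(root):
    """Return the <a> elements that act as rows.

    Rough equivalent of //div[@class="sectionleft"]/a[contains(@href,".htm")],
    with the contains() part applied in Python.
    """
    links = root.findall(".//div[@class='sectionleft']/a")
    return [a for a in links if ".htm" in a.get("href", "")]


def row_data(a):
    """Collect data relative to the row's <a> element."""
    return {
        "title": (a.text or "").strip(),
        "href": a.get("href"),
        # Text sitting directly after the closing </a>, up to the next tag.
        "trailing": (a.tail or "").strip(),
    }


root = ET.fromstring(SAMPLE)
rows = [row_data(a) for a in find_title_rows(root)]
for row in rows:
    print(row)
```

With no wrapping element per row, everything else has to be found by walking outwards from the <a> itself, here via its tail text. Real pages are rarely well-formed XML, so an HTML-tolerant parser would be needed in practice.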

I've attached a new version 0.2 below - this adds the Pubdate implementation and fixes a couple of bugs.

Last edited by kiwidude; 07-15-2012 at 04:41 PM. Reason: Removing attachment as later version in this thread