Old 06-02-2011, 08:45 PM   #27
kiwidude
calibre/Sigil Developer
If you have looked at the APNX calculation, as I have, you will understand when it goes badly wrong - and that is, as I said, when the book uses <div> tags instead of <p> tags to define paragraphs.

As an experiment, convert your mobi to epub, make sure the plugin is set up to look at epub first (or create a new book record with just the epub on it), and then get the page count. I would be interested to hear the number you get. And as I said above, you should look at the markup of a page to find out what your paragraph delimiters are.
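If you would rather not eyeball the markup, here is a rough sketch of how you could tally the candidate paragraph delimiters in one of the book's HTML files. To be clear, this is not code from the plugin or from calibre; the function names are made up for illustration, and it assumes you have already read the file contents into a string.

```python
import re

# Hypothetical helpers for illustration only; not part of calibre or the plugin.
# "<p" followed by whitespace or ">" so that <pre> is not counted as a paragraph.
P_TAG = re.compile(r"<p[\s>]", re.IGNORECASE)
DIV_TAG = re.compile(r"<div[\s>]", re.IGNORECASE)
# Two or more consecutive <br>, <br/> or <br /> tags, allowing whitespace between.
DOUBLE_BR = re.compile(r"(?:<br\s*/?>\s*){2,}", re.IGNORECASE)


def count_paragraph_delimiters(html):
    """Count each candidate paragraph delimiter in an HTML string."""
    return {
        "p": len(P_TAG.findall(html)),
        "div": len(DIV_TAG.findall(html)),
        "double_br": len(DOUBLE_BR.findall(html)),
    }


def likely_delimiter(html):
    """Return the most frequent delimiter name, or None if none were found."""
    counts = count_paragraph_delimiters(html)
    best = max(counts, key=counts.get)
    return best if counts[best] > 0 else None
```

If "div" or "double_br" comes out on top for most of the files in the book, that is a strong hint the page count will be off.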

The ePub calculation is based on the APNX one, but I added some extra tweaks to attempt to detect the scenario I described above. Perhaps if you get a number that is more reasonable, user_none might be convinced to change his APNX calculation to do something similar - or something even better, in which case I can steal it.

There is still one other situation which I know both will fail on - and that is books which use <br/><br/> tags to mark the end of a paragraph, i.e. there is no known opening tag wrapping the paragraph. In that situation my epub calculation will go horribly wrong too. There is only so much we can do - if people/tools do non-standard things when formatting books, then downstream approximation hacks like this will get tripped up. If you fix the book, in Sigil for instance, by properly enclosing the paragraphs using some regex find/replace, then you can run the page count afterwards.
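For what it's worth, here is a minimal sketch of that kind of fix done in Python rather than in Sigil's find/replace dialog. It is only an illustration under some assumptions: the function name is made up, and it expects just the text fragment from inside the body (no <html>, <head> or <body> tags), where paragraphs are separated by two or more <br> tags.

```python
import re

# Hypothetical helper for illustration only; not part of Sigil, calibre or the plugin.
# Two or more consecutive <br>, <br/> or <br /> tags, allowing whitespace between.
DOUBLE_BR = re.compile(r"(?:<br\s*/?>\s*){2,}", re.IGNORECASE)


def wrap_br_paragraphs(body_fragment):
    """Split a body fragment on runs of <br> tags and wrap each chunk in <p>."""
    chunks = DOUBLE_BR.split(body_fragment)
    # Skip empty chunks, e.g. the one left after a trailing <br/><br/>.
    return "\n".join("<p>%s</p>" % chunk.strip() for chunk in chunks if chunk.strip())
```

A single <br/> inside a paragraph is left alone, so deliberate line breaks survive. Work on a copy of the book and check the result before trusting it.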

Last edited by kiwidude; 06-02-2011 at 08:47 PM.