Quote:
Originally Posted by user_none
My parser only looks at the amount of visible text. Each page comprises 32 lines, and each line can hold up to 70 characters. A paragraph always starts a new line. Accuracy can be increased by adding support for handling <div class="mbp_pagebreak" /> and <br> tags.
One of my test books mapped 1 to 1 to the print version at page 105; I did not test the other pages extensively. Other books mapped very closely to their print editions.
Overall, the more accurate APNX generator gives much closer results. With handling for the two additional elements, OCRed texts and books that use the same typesetting as their print counterparts should give nearly 1-to-1 page mappings.
|
Outstanding.
Just getting to the "only looks at the amount of visible text" is awesome, all by itself.
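For anyone curious, the counting scheme described in the quote (32 lines per page, up to 70 characters per line, each paragraph starting a new line, only visible text counted) could be sketched roughly like this. The names and structure here are hypothetical, not calibre's actual APNX code:

```python
import math
from html.parser import HTMLParser

# Assumed layout constants from the description above.
LINES_PER_PAGE = 32
CHARS_PER_LINE = 70

class VisibleTextCounter(HTMLParser):
    """Count layout lines using only the visible text of a document."""

    def __init__(self):
        super().__init__()
        self.lines = 0
        self._skip = 0  # depth inside non-visible elements

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style", "head"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style", "head") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if self._skip:
            return
        text = data.strip()
        if text:
            # Each text run starts on a new line (a paragraph always
            # begins a line), wrapping at 70 characters per line.
            self.lines += math.ceil(len(text) / CHARS_PER_LINE)

def estimate_pages(html):
    """Estimate the print page count of an HTML document."""
    counter = VisibleTextCounter()
    counter.feed(html)
    return max(1, math.ceil(counter.lines / LINES_PER_PAGE))
```

For example, 64 paragraphs of 70 characters each come out to 64 lines, i.e. two 32-line pages.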
Quote:
Originally Posted by user_none
Now for the big question: is doubling the time it takes to transfer the book to the device worth a more accurate mapping? The mapping will be thrown off if the print book's physical dimensions differ from the average paperback size I'm using, so a hardcover, or a larger or smaller paperback, will cause the mapping to be off.
Also, if the accurate parser is worth the extra time, would it be worth increasing that time even more by extending the parser to accommodate the two previously mentioned elements, making it more accurate still?
|
It's only a matter of seconds, and just one time, so for me, it's easily worth it. But having the option means you can't lose. No one can!
You're giving the best of all worlds to everyone.
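For the record, the two extra elements mentioned above could plausibly be folded into such a counter like this; again, a hedged sketch with illustrative names, not the actual implementation. The idea would be that a <br> forces a new line, and an mbp_pagebreak div rounds the running line count up to the next page boundary:

```python
import math
from html.parser import HTMLParser

# Assumed layout constants, as in the quoted description.
LINES_PER_PAGE = 32
CHARS_PER_LINE = 70

class PageAwareCounter(HTMLParser):
    """Line counter that also honours <br> and mbp_pagebreak divs."""

    def __init__(self):
        super().__init__()
        self.lines = 0

    def handle_starttag(self, tag, attrs):
        # Self-closing tags like <div ... /> also route through here
        # via HTMLParser's default handle_startendtag.
        if tag == "br":
            self.lines += 1  # explicit line break
        elif tag == "div" and ("class", "mbp_pagebreak") in attrs:
            # A page break pads out the current page: round the line
            # count up to the next multiple of LINES_PER_PAGE.
            pages = math.ceil(self.lines / LINES_PER_PAGE)
            self.lines = pages * LINES_PER_PAGE

    def handle_data(self, data):
        text = data.strip()
        if text:
            self.lines += math.ceil(len(text) / CHARS_PER_LINE)
```

So a short paragraph, a pagebreak div, then another short paragraph would count as 32 + 1 = 33 lines, i.e. two pages.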