06-30-2017, 03:35 PM   #1120
cryzed
Hey, I modified the code a bit. My plan was to speed up the APNX-accurate algorithm, but unfortunately my alternative version performs at about the same speed (+/- 1 second). One thing it might handle better is books with strange and/or broken markup (although Calibre should prevent that during the conversion step).

I also made slight changes in _read_epub_contents() and _extract_body_text() that avoid some unicode conversion steps and use the re module a bit more efficiently (regex_object.search() instead of regex_object.findall(), re.sub() instead of str.replace()).
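To illustrate the kind of change meant here, this is a minimal sketch and not the code from the patch. The function name extract_body_text_sketch() and all three patterns are made up for the example, and the plugin's real regexes may well look different. The idea is that search() stops at the first match instead of collecting every match the way findall() does, and that a single re.sub() can collapse whitespace runs that would otherwise take several chained str.replace() calls.

```python
import re

# Hypothetical patterns, for illustration only; the plugin's own may differ.
BODY_RE = re.compile(r"<body[^>]*>(.*?)</body>", re.DOTALL | re.IGNORECASE)
TAG_RE = re.compile(r"<[^>]+>")
WHITESPACE_RE = re.compile(r"\s+")


def extract_body_text_sketch(html):
    """Return the text inside the first <body> element, or '' if there is none."""
    # search() returns as soon as the first match is found,
    # whereas findall() would scan the whole string and build a list.
    match = BODY_RE.search(html)
    if match is None:
        return ""
    # Replace every tag with a space so adjacent words do not run together.
    text = TAG_RE.sub(" ", match.group(1))
    # One sub() call collapses all whitespace runs (spaces, tabs, newlines).
    return WHITESPACE_RE.sub(" ", text).strip()


if __name__ == "__main__":
    sample = "<html><body class='x'><p>Hello</p>\n<p>world</p></body></html>"
    print(extract_body_text_sketch(sample))
```

The patterns are compiled once at module level so the per-file cost is only the matching itself.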

I didn't want to just throw it away; maybe you'll find it interesting or useful in some way: count-pages.patch.

Thanks for your plugin!