01-06-2016, 10:54 PM   #2
kovidgoyal
creator of calibre
It's not a bug in calibre. The get_wordcount function is not designed for accurate word counts; it is used only in heuristics that try to auto-detect chapter boundaries based on approximate word counts. I have no idea why the Count Pages plugin uses that function. It should be using the ICU word iterator functions instead. For examples of their use, see break_iterator.py
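To illustrate why the two approaches give different numbers, here is a rough sketch in plain Python. It is not calibre's code and does not call ICU: `count_words` and `WORD_RE` are hypothetical names, and the regex is only a loose stand-in for what a real word-boundary iterator does (it will not handle CJK text, for example). It just shows the idea of counting word-like segments rather than whitespace-separated tokens.

```python
import re

# Hypothetical stand-in for a word-boundary iterator: a "word" is a run of
# letters/digits, optionally continued across an internal apostrophe.
# A real ICU word iterator covers far more cases than this regex does.
WORD_RE = re.compile(r"\w+(?:['\u2019]\w+)*")


def count_words(text):
    """Count word-like segments, skipping punctuation-only tokens."""
    return sum(1 for _ in WORD_RE.finditer(text))


def naive_count(text):
    """Approximate count: every whitespace-separated token is a 'word'."""
    return len(text.split())


sample = "Don't panic - it's only a word count."
print(naive_count(sample), count_words(sample))
```

On the sample sentence the naive count includes the stand-alone hyphen as a word, while the segment-based count does not. Small differences like that add up over a whole book.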