Quote:
Originally Posted by davidfor
That probably explains the difference for Windows machines. But, the original report explicitly mentioned using the ICU option for the word count. And my testing at work on a Linux box was only for the word count. I can't do much for this.
|
Ahh, true. We (I!!) have gone off on a tangent. I've confirmed that normal vs ICU doesn't really affect run time for me.
Quote:
I'll have a look at the other stats when I have time. But that isn't likely to happen soon.
|
Almost all the time is spent counting syllables. I added a bit of timing code to the plugin (yay, my first actual working change to anything Calibre related!) and this is what I see in my log for Oscar Wilde:
Code:
count syllables in all words
.... count syllables done --- 1539.17500019 seconds ---
and total run time was just over 25 minutes again.
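For anyone who wants to try the same thing, here is a minimal sketch of that kind of timing. It is not the actual change made to the plugin: the names timed and count_syllables_stub are made up for illustration, and the stub just counts vowel groups as a stand-in for whatever nltk_lite/textanalyzer.py really does.

```python
import time


def timed(label, func, *args):
    """Run func(*args), print the elapsed wall-clock time, return (result, elapsed)."""
    print(label)
    start = time.time()
    result = func(*args)
    elapsed = time.time() - start
    print(".... %s done --- %s seconds ---" % (label, elapsed))
    return result, elapsed


def count_syllables_stub(words):
    # Hypothetical stand-in: counts runs of vowels in each word.
    # This is NOT the plugin's syllable algorithm, only something to time.
    total = 0
    for word in words:
        previous_was_vowel = False
        for ch in word.lower():
            is_vowel = ch in "aeiouy"
            if is_vowel and not previous_was_vowel:
                total += 1
            previous_was_vowel = is_vowel
    return total


if __name__ == "__main__":
    words = "the importance of being earnest".split()
    total, elapsed = timed("count syllables in all words", count_syllables_stub, words)
```

Wrapping each stage of the stats run in a call like this should show which stage dominates, which is how the syllable counting stood out here.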
If I insert a return 1607495 right before the for word in words: loop in nltk_lite/textanalyzer.py, then it only takes 29 seconds instead of nearly half an hour.
Counting syllables is difficult?!