Quote:
Originally Posted by davidfor
I have to go with "wow". I happened to have a 1.4 million word book and it took 57 minutes on my work machine with ICU selected and 9 seconds without. I would expect the ICU method to be slower, but, I wasn't expecting that. I think it is something new as I did run the count on a 6 million word book a month or so ago, and I'm sure I was using the ICU count. Both the counts are done using methods built into calibre. I'll have to check an older version to see if it is different.
As to the difference in the result, that is to do with how the two algorithms define a word. This has been "discussed" in this thread a few times. No one won.
I tried this at home tonight, and the results were completely different. For the same book, the ICU count took less than 20 seconds and was a little faster than the non-ICU word count.
I don't think it is the hardware. Home is a Windows 10 laptop with an i7 and 16GB of memory. At work it was a desktop running Ubuntu, which I think has 16GB of memory. I don't know the CPU, but it isn't slow otherwise.
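For anyone curious how much the definition of a "word" alone can change the result, here is a minimal sketch in plain Python. It does not use calibre's code or ICU. The two counting rules (split on whitespace, and a `\w+` regex) are stand-ins I picked for illustration, and the function names are hypothetical. The `timed` helper just shows one way to compare run times on the same text.

```python
import re
import time

# Assumption: these two rules are illustrative stand-ins,
# not the algorithms calibre or ICU actually use.
WORD_RE = re.compile(r"\w+")


def count_whitespace(text):
    """Count runs of non-whitespace characters as words."""
    return len(text.split())


def count_regex(text):
    """Count runs of letters, digits, and underscores as words."""
    return len(WORD_RE.findall(text))


def timed(func, text):
    """Return (word count, elapsed seconds) for one counting function."""
    start = time.perf_counter()
    result = func(text)
    return result, time.perf_counter() - start


if __name__ == "__main__":
    sample = "Well-known e-books don't count themselves."
    for func in (count_whitespace, count_regex):
        words, seconds = timed(func, sample)
        print(f"{func.__name__}: {words} words in {seconds:.6f}s")
```

On the sample sentence the whitespace rule treats "Well-known" and "don't" as one word each, while the regex rule splits them at the hyphen and apostrophe, so the two counts disagree even on a single line of text.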