Quote:
Originally Posted by davidfor
So, now we have three different counts from three different algorithms or applications. Is there any reason why you think the Notepad++ TextFX plugin is the correct one? Can you point to where it defines what a word is? I can't find anything and I have no reason to think that it is more correct than the two built into calibre.
If you want an example, then look at the post YOU made last year that started all this. Your complaint was about the following being counted wrong:
You stated they should be six words, not three. I have just tested with all three algorithms: ICU, the older count pages, and TextFX. Only ICU counts them as separate words.
So, are you going back on your original claim and you think these should be considered as one word?
And I can't work with "a book I have has a problem". I need to see the book and know what the problem is. Or a chunk of text and a description of how it is being counted wrong.
|
Ignore the count from Notepad++. Its word count doesn't work properly. It takes word—word as one word and it takes word — word as three words. Since the file I used had no spaces around the em dashes, the Notepad++ count is off on the low side. If this isn't a problem, just Let It Be.
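To make the em dash point concrete, here is a rough Python sketch. It is not the TextFX code (I don't know how that is implemented) and it is not calibre's ICU counter either. It only assumes a counter that splits on whitespace alone, which reproduces the behaviour described above, and compares it with a hypothetical counter that also treats the em dash as a word break. Both function names are made up for illustration.

```python
import re

EM_DASH = "\u2014"  # the em dash character


def count_whitespace_tokens(text):
    """Count runs of non-whitespace characters.

    Assumed stand-in for a naive counter: an em dash with no spaces
    around it glues its neighbours into a single token.
    """
    return len(text.split())


def count_words_dash_as_break(text):
    """Count runs of characters that are neither whitespace nor an em dash.

    Hypothetical alternative: the em dash separates words and is not
    itself counted as a word.
    """
    return len(re.findall(r"[^\s\u2014]+", text))


unspaced = "word" + EM_DASH + "word"
spaced = "word " + EM_DASH + " word"

print(count_whitespace_tokens(unspaced))    # 1
print(count_whitespace_tokens(spaced))      # 3
print(count_words_dash_as_break(unspaced))  # 2
print(count_words_dash_as_break(spaced))    # 2
```

With whitespace-only splitting, every unspaced em dash costs one word, so a file full of them comes out low. Treating the dash as a break gives the same count whether or not there are spaces around it.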