Hey guys,
What is the best way to count words and unique words in EPUB/AZW3 files?
To demonstrate what I mean, I will use the following book:
http://www.feedbooks.com/book/673/david-copperfield
The report function in Calibre shows 357 436 words and 16 519 unique words.
feedbooks.com shows 358,632 words (it doesn't provide the number of unique words).
wordcounttools.com shows 359 260 words (the small difference is probably caused by the inclusion of "About the Author", etc.) and 29 406 unique words.
easycalculation.com shows 359 408 words and 20 466 unique words.
planetcalc.com shows 364 872 words and 17 858 unique words.
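My question is: which source provides the most accurate number of unique words? The total word counts agree to within about 2%, but the unique word counts range from 16 519 to 29 406, so I suspect the tools use different methodologies to decide what counts as a unique word.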
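
To illustrate what I suspect, here is a minimal Python sketch showing how the same text can give different unique word counts depending on whether case is folded and punctuation is stripped. The function name `count_words` and its two flags are my own invention for illustration; I don't know which rules Calibre or any of the sites above actually apply.

```python
import re

def count_words(text, lowercase=True, strip_punct=True):
    """Return (total_words, unique_words) under the chosen rules.

    Hypothetical helper: the flags only model two possible differences
    in methodology, not the behaviour of any specific tool.
    """
    if strip_punct:
        # Runs of letters, optionally joined by an apostrophe ("didn't").
        tokens = re.findall(r"[^\W\d_]+(?:['’][^\W\d_]+)*", text)
    else:
        # Naive whitespace split: "sat." and "sat," stay distinct tokens.
        tokens = text.split()
    if lowercase:
        tokens = [t.lower() for t in tokens]
    return len(tokens), len(set(tokens))

sample = "The cat sat. The CAT sat, didn't it? the cat!"
print(count_words(sample))                                       # (10, 5)
print(count_words(sample, lowercase=False))                      # (10, 7)
print(count_words(sample, lowercase=False, strip_punct=False))   # (10, 9)
```

The total stays at 10 in all three cases while the unique count goes from 5 to 9, which matches the pattern in the numbers above: similar totals, very different unique counts.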