Old 01-07-2016, 09:10 PM   #822
davidfor
Quote:
Originally Posted by Divingduck
Is it so simple?

What will you do with words like "3D printer", which in German is written "3-D-Drucker" as a single word? Counting it as three words is definitely wrong for that language, and this kind of exception comes up a lot. I guess it does in other languages too. To cover this you would need a dictionary for each language, and I am quite sure you would not cover all the exceptions; in German, for example, there is no rule preventing constructions with a "-" between words. This is often used to make long compound words easier to read.
It isn't obvious from JSWolf's post, but the last one is actually an "en-dash", not a hyphen. I have no idea whether that should be considered a word delimiter or word joiner.
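To make the hyphen versus en-dash distinction concrete, here is a rough Python sketch. It is not the method Kovid mentioned and it is not locale-aware. The `count_words` helper and its rule (a plain hyphen joins words, an en-dash separates them) are assumptions made only for illustration, not a claim about what the right answer is.

```python
import re
import unicodedata

HYPHEN = "\u002d"   # the character typed with the "-" key
EN_DASH = "\u2013"  # looks similar, but is a different code point

# Assumed rule for this sketch: a word is a run of letters/digits,
# optionally chained together by single plain hyphens.
# The en-dash is not in the pattern, so it acts as a delimiter.
WORD_RE = re.compile(r"\w+(?:-\w+)*")


def count_words(text: str) -> int:
    """Count words using the assumed hyphen-joins rule above."""
    return len(WORD_RE.findall(text))


# Show that the two characters really are distinct.
for ch in (HYPHEN, EN_DASH):
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}")

print(count_words("ein 3-D-Drucker"))       # hyphenated compound
print(count_words("Seiten 10\u201320"))     # en-dash between numbers
```

With this rule "3-D-Drucker" counts as one word, while the two parts around an en-dash are counted separately. Whether that is correct for a given language is exactly the open question.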

The method Kovid has mentioned for word counting accepts a locale. That should sort the differences out between the languages.