Thread: check spelling
04-21-2014, 09:55 AM   #63
BetterRed
null operator (he/him)
Quote:
Originally Posted by kovidgoyal
@jackie_w: calibre uses the ICU word break iteration algorithm, which, as far as I recall, splits most hyphenated words into two words (the details are language dependent). So, for example, abc-def will show up in the words list as two words, abc and def.

See http://userguide.icu-project.org/boundaryanalysis for details
That's a valuable link. It explains some things I've been puzzling over wrt leading & trailing apostrophes.
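
To make the quoted behaviour concrete, here is a rough Python sketch. It does not call ICU and it is not calibre's code. It only approximates the default rules with a regular expression of my own, and the names WORD_RE and split_words are made up for illustration. The assumptions are: a hyphen ends a word, an apostrophe between two letters stays inside the word, and a leading or trailing apostrophe is left out.

```python
import re

# Simplified stand-in for default word breaking (NOT the ICU algorithm):
#   [^\W\d_]+            one or more letters
#   (?:['\u2019]...)*    optionally continued by an apostrophe that is
#                        immediately followed by more letters
# A hyphen is not part of the pattern, so it always ends the current word.
WORD_RE = re.compile(r"[^\W\d_]+(?:['\u2019][^\W\d_]+)*")

def split_words(text):
    """Return the words found in text under the simplified rules above."""
    return WORD_RE.findall(text)

print(split_words("abc-def"))
print(split_words("don't"))
print(split_words("'tis the dogs'"))
```

Under these assumptions abc-def comes out as two entries, don't stays whole, and the apostrophes at the start of 'tis and the end of dogs' are dropped. Real ICU behaviour is language dependent, so treat this only as an illustration.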

At http://www.unicode.org/reports/tr29/#WB14 there is this on word boundaries and hyphens:

Quote:
The correct interpretation of hyphens in the context of word boundaries is challenging ... it is better overall to keep the hyphen out of the default definition
Is that to be interpreted as saying a hyphen should, or should not, constitute a word boundary? I'm inclined to read it as "should not".

BR