Edit: if none of the dashes were removed, this sounds like a bug. Open a bug report with your book attached and I can check it out.
The heuristics decide whether a hyphen should be removed by using the book itself as a dictionary. This approach works pretty well, but it doesn't guarantee that every hyphen that needs to be removed will be. It does pretty much guarantee that any hyphens that should be kept will be kept, which is the more important thing IMHO.
It does do some rudimentary stemming of the words, so in your example 'im-mersed' would be shortened to 'immers', and the book would then be checked to see if the text 'immers' exists anywhere in it. I think what you'll find if you double-check your book is that 'immers' can't be found anywhere.
The only way to improve on this further is to allow the user to specify an external dictionary/wordlist, but there hasn't been all that much interest in further improving the feature.
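To make the idea concrete, here is a minimal Python sketch of the book-as-dictionary check described above. It is not the actual heuristics code: the suffix list, the names crude_stem and dehyphenate, and the extra_words parameter (standing in for a user-supplied wordlist) are all hypothetical and only there for illustration.

```python
import re

# Hypothetical suffix list; the real heuristics may stem differently.
SUFFIXES = ("ing", "ed", "es", "ly", "s")

# A word, a hyphen, and another word, e.g. "im-mersed" or "well-known".
HYPHENATED = re.compile(r"\b([A-Za-z]+)-([A-Za-z]+)\b")


def crude_stem(word):
    """Lowercase the word and strip one common suffix, keeping at least 4 letters."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 4:
            return word[: -len(suffix)]
    return word


def dehyphenate(text, extra_words=()):
    """Join a hyphenated word only if its stem occurs elsewhere in the text.

    extra_words is an assumed, optional external wordlist.
    """
    # Blank out the hyphenated candidates so the search only sees other text.
    body = HYPHENATED.sub(" ", text).lower()
    wordlist = " ".join(extra_words).lower()

    def fix(match):
        joined = match.group(1) + match.group(2)
        stem = crude_stem(joined)
        if stem in body or stem in wordlist:
            return joined          # stem found: treat the hyphen as a line-break artifact
        return match.group(0)      # stem not found: keep the hyphen

    return HYPHENATED.sub(fix, text)


print(dehyphenate("She was im-mersed in it. Later he immersed himself. A well-known fact."))
# She was immersed in it. Later he immersed himself. A well-known fact.

print(dehyphenate("She was im-mersed in it."))
# She was im-mersed in it.
```

The second call shows the failure mode from the post: when 'immers' appears nowhere else, the hyphen is kept. Passing extra_words=["immerse"] in this sketch would let the word be joined anyway, which is roughly what an external wordlist would buy.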