06-17-2015, 05:24 PM   #26
GeoffR
Wizard
Posts: 3,821
Karma: 19162882
Join Date: Nov 2012
Location: Te Riu-a-Māui
Device: Kobo Glo
I found that the Dutch hyphenation dictionary included in the Kobo firmware is just a copy of the OpenOffice one available here.

There is a hyphenation program available here that can be used to test the hyphenation of words using the hyphenation dictionary.

Using the above program with the Dutch hyphenation dictionary (with no HYPHENMIN values, exactly as it comes with the Kobo firmware) gives the following hyphenations:

onderwerp on=der=werp
onderwerp, on=der=wer=p,

So the hyphenation onderwer-p, is valid according to that dictionary.

But adding LEFTHYPHENMIN 3 and RIGHTHYPHENMIN 3 to the dictionary gives the hyphenations:

onderwerp onder=werp
onderwerp, onder=werp,

so I hoped that the HYPHENMIN additions would fix the problem, but there seems to be more to it.


Edit: There are rules such as kam1p and wer1p in the Dutch hyphenation dictionary which seem to be responsible for these strange hyphenations. Removing them fixes these particular cases, but I don't know Dutch, so I don't know what other problems removing them would cause.
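
To illustrate how a rule like wer1p can produce these results, here is a rough Python sketch of pattern-based hyphenation. It is only a toy, not the code used by the test program or the Kobo firmware. The rule wer1p is taken from the dictionary as mentioned above; the rules n1d and r1w are made-up stand-ins so the example has other break points. The sketch also assumes, without confirmation, that the minimum margins default to 2 and that trailing punctuation is counted as part of the word when the margins are checked. Under those assumptions it reproduces the four outputs shown earlier.

```python
# Toy rule set: "wer1p" is from the dictionary; "n1d" and "r1w" are
# hypothetical stand-ins, not rules taken from the real Dutch dictionary.
PATTERNS = ["n1d", "r1w", "wer1p"]


def parse_pattern(pattern):
    """Split a rule such as 'wer1p' into ('werp', [0, 0, 0, 1, 0]).

    weights[i] is the score of the gap just before letters[i];
    the final entry is the gap after the last letter.
    """
    letters = []
    weights = [0]
    for ch in pattern:
        if ch.isdigit():
            weights[-1] = int(ch)
        else:
            letters.append(ch)
            weights.append(0)
    return "".join(letters), weights


def hyphenate(word, patterns, left_min=2, right_min=2):
    """Mark break points with '='. Odd gap scores allow a break.

    Assumption: left_min/right_min count every character of `word`,
    including any trailing punctuation passed in with it.
    """
    padded = "." + word.lower() + "."
    scores = [0] * (len(padded) + 1)
    for pattern in patterns:
        letters, weights = parse_pattern(pattern)
        start = padded.find(letters)
        while start != -1:
            for offset, weight in enumerate(weights):
                pos = start + offset
                scores[pos] = max(scores[pos], weight)
            start = padded.find(letters, start + 1)

    out = []
    for k, ch in enumerate(word):
        # The gap before word[k] is scores[k + 1] because of the leading "."
        allowed = k >= left_min and len(word) - k >= right_min
        if allowed and scores[k + 1] % 2 == 1:
            out.append("=")
        out.append(ch)
    return "".join(out)


if __name__ == "__main__":
    for token in ("onderwerp", "onderwerp,"):
        print(token, hyphenate(token, PATTERNS))
        print(token, hyphenate(token, PATTERNS, left_min=3, right_min=3))
```

In this toy the break before the final p is always proposed by wer1p, and it is only the right-hand margin that hides it for the bare word. The comma makes the tail one character longer, which is enough to let the break through when the margin is 2.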

But it does seem that stripping the trailing punctuation from the words before hyphenating them would solve all these problems. The dictionaries have some rules to handle punctuation, but they only seem to be for apostrophes and such that are a part of the word, not other leading and trailing punctuation. It doesn't make sense to me that trailing punctuation should affect the result of the hyphenation algorithm.
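
A minimal Python sketch of the stripping idea, to show what I mean. The function names are made up for illustration, and the `hyphenate` argument is a placeholder for whatever routine actually applies the dictionary. Apostrophes are left attached to the word here, on the assumption that the dictionary rules are meant to handle them.

```python
import unicodedata

# Characters treated as part of the word (assumption: the dictionary
# has its own rules for apostrophes, so they are not stripped).
WORD_PUNCT = {"'", "\u2019"}


def is_strippable(ch):
    """True for punctuation that should not take part in hyphenation."""
    return unicodedata.category(ch).startswith("P") and ch not in WORD_PUNCT


def split_punctuation(token):
    """Split a token into (leading punctuation, core word, trailing punctuation)."""
    start = 0
    end = len(token)
    while start < end and is_strippable(token[start]):
        start += 1
    while end > start and is_strippable(token[end - 1]):
        end -= 1
    return token[:start], token[start:end], token[end:]


def hyphenate_token(token, hyphenate):
    """Hyphenate only the core word, then put the punctuation back.

    `hyphenate` is any callable that takes a bare word and returns it
    with break points marked (a placeholder for the real hyphenator).
    """
    lead, core, trail = split_punctuation(token)
    if not core:
        return token
    return lead + hyphenate(core) + trail
```

With this wrapper the hyphenator sees "onderwerp" whether the text has "onderwerp" or "onderwerp,", so both get the same break points and the comma is simply reattached afterwards.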

Last edited by GeoffR; 06-17-2015 at 05:59 PM. Reason: There are rules ...