Old 01-21-2017, 12:47 PM   #39
KevinH
Sigil Developer
@Diotsu,

The modern de_DE frami dictionary has no prefixes or suffixes assigned in the .aff file that use the single quote (dumb or smart). Nor does it list a single quote as a possible char to even try inserting.

So after converting to UTF-8, I added the dumb single quote to WORDCHARS, TRY, and added the ICONV and OCONV sections to the aff file.
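For anyone wanting to repeat the conversion step, here is a minimal Python sketch. It assumes the original .aff is encoded in ISO-8859-1 and declares its encoding on a SET line; check your copy first. The function name convert_aff_to_utf8 is only mine for illustration. The .dic file would need the same re-encoding, just without the SET rewrite.

```python
def convert_aff_to_utf8(raw: bytes, source_encoding: str = "iso-8859-1") -> bytes:
    """Re-encode the bytes of an .aff file as UTF-8 and update its SET line.

    source_encoding is an assumption; check the SET line of your own file.
    """
    text = raw.decode(source_encoding)
    out = []
    for line in text.splitlines(keepends=True):
        if line.startswith("SET "):
            # Preserve whatever line ending the file already uses.
            ending = line[len(line.rstrip("\r\n")):]
            line = "SET UTF-8" + ending
        out.append(line)
    return "".join(out).encode("utf-8")


if __name__ == "__main__":
    sample = b"SET ISO8859-1\nTRY esij\xe4\n"
    print(convert_aff_to_utf8(sample).decode("utf-8"))
```

After that, the WORDCHARS, TRY, ICONV and OCONV edits can be made by hand in any UTF-8 aware editor.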

So a grep on my modified de_DE.aff file for dumb single quotes shows the following:

Code:
grep \' de_DE.aff
TRY esijanrtolcdugmphbyfvkwqxzäüößáéêàâñESIJANRTOLCDUGMPHBYFVKWQXZÄÜÖÉ-.'
WORDCHARS ß-.'
ICONV ’ '
OCONV ' ’
But still no luck with any contractions in German.

So if no SFX rules in the .aff file contain a single quote, the only way contracted words can be considered valid is for them to be included in the dictionary wordlist itself.

Unfortunately, grepping for a dumb single quote in de_DE.dic shows only the following (and there were no words with smart single quotes in it at all):

Code:
grep \' de_DE.dic
Horsd'oeuvre/Sm
Ku'damm/ST
Xi'an/S
d'hondtsch/A
horsd'oeuvre/Sozm
So words like "halt's", with dumb or smart single quotes, will always be marked as misspelled.

The next test was to add either "halt's" or "halt’s" to my user wordlist.

Given that Sigil always handles the role of ICONV itself (in the sense that it maps smart single quotes to their dumb versions), and given the OCONV lines I added, I expected the suggestion for a misspelt word like "chalt's" or "chalt’s" to come back as "halt’s" (the smart single quote version).

That is exactly what I observed.
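To make the round trip concrete, here is a toy Python model of the behaviour described above. It is only a sketch of the idea, not Hunspell's or Sigil's actual code: the names ICONV_MAP, OCONV_MAP, is_known and present_suggestion are made up for illustration, and the "dictionary" is a plain set holding one user-wordlist entry.

```python
# Toy model: input conversion maps smart quotes to dumb ones before lookup,
# output conversion maps dumb quotes back to smart ones in suggestions.
ICONV_MAP = {"\u2019": "'"}   # mirrors the line: ICONV ’ '
OCONV_MAP = {"'": "\u2019"}   # mirrors the line: OCONV ' ’


def apply_conv(word: str, table: dict) -> str:
    """Apply every replacement in the conversion table to the word."""
    for src, dst in table.items():
        word = word.replace(src, dst)
    return word


def is_known(word: str, wordlist: set) -> bool:
    """Look the word up after input conversion (smart quote -> dumb quote)."""
    return apply_conv(word, ICONV_MAP) in wordlist


def present_suggestion(suggestion: str) -> str:
    """Convert a suggestion for display (dumb quote -> smart quote)."""
    return apply_conv(suggestion, OCONV_MAP)


if __name__ == "__main__":
    user_words = {"halt's"}                  # stored with the dumb quote
    print(is_known("halt\u2019s", user_words))   # smart-quote spelling
    print(is_known("chalt's", user_words))       # misspelt word
    print(present_suggestion("halt's"))
```

The point of the sketch is that only the dumb-quote form needs to exist in the wordlist; both spellings are matched on input, and the smart-quote form is what comes back out.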

So the current de_DE_frami German dictionary really needs a lot of work:

- both the .aff and .dic files need to be converted to UTF-8
- the .aff file needs the proper ICONV and OCONV sections added
- a dumb single quote needs to be added to the TRY line
- an SFX rule to handle common contractions with a dumb single quote needs to be added to the .aff file
- a list of the most commonly used contractions in modern German should be created and added to an "unmunched" wordlist, and the entire new wordlist re-munched to create a new .dic file that uses the new SFX flag for contractions.
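As a rough illustration of what the last two items mean, here is a small Python sketch of how a single .dic entry carrying a contraction flag would expand. Everything specific in it is an assumption: the flag "X" and its rule (append 's, strip nothing) are hypothetical and do not exist in the current de_DE.aff, and the code assumes single-character flags and ignores rule conditions.

```python
# Hypothetical suffix table: flag -> list of (strip, append) pairs.
# In .aff syntax the assumed rule would look roughly like:
#   SFX X Y 1
#   SFX X 0 's .
SUFFIX_RULES = {"X": [("", "'s")]}


def unmunch_entry(entry: str, rules: dict = SUFFIX_RULES) -> list:
    """Expand one 'stem/FLAGS' entry into the word forms it stands for.

    Flags without a rule in the table are skipped, so this is far from
    a complete unmunch; it only shows the mechanism.
    """
    stem, _, flags = entry.partition("/")
    words = [stem]
    for flag in flags:
        for strip, append in rules.get(flag, []):
            if strip and not stem.endswith(strip):
                continue  # rule does not apply to this stem
            base = stem[: len(stem) - len(strip)] if strip else stem
            words.append(base + append)
    return words


if __name__ == "__main__":
    print(unmunch_entry("halt/X"))
    print(unmunch_entry("Xi'an/S"))
```

Re-munching goes the other way: given the full list of word forms, it finds stems and flags so that "halt" and "halt's" collapse back into one flagged entry.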

This is not a bug anywhere in Sigil as far as I can tell. It is just that the de_DE dictionary for modern German leaves a lot to be desired.

Hope this helps,

KevinH

Last edited by KevinH; 01-21-2017 at 01:40 PM.