01-20-2017, 01:46 PM | #31 | |
Grand Sorcerer
Posts: 5,584
Karma: 22735033
Join Date: Dec 2010
Device: Kindle PW2
|
Quote:
Try the following:
1. Unpack the latest UTF-8 encoded version of the de_DE_frami dictionaries to
Code:
C:\Program Files\Sigil\hunspell_dictionaries
3. Select de_DE_frami as the default dictionary and straight as your user dictionary, then open your test case file. (Make sure that only one user dictionary is selected.)
The words previously flagged as typos should no longer be flagged. Then uncheck straight and check curly; this should flag them as typos again. I.e., you'll have to decide whether you want to save contracted words in your user dictionary with straight or curly quotes. If you'd rather save contracted words in your user dictionary with curly apostrophes, comment out the following lines in the affix file.
Code:
#ICONV 1
#ICONV ’ '
#OCONV 1
#OCONV ' ’
|
|
01-20-2017, 03:19 PM | #32 | |
Zealot
Posts: 105
Karma: 10
Join Date: Oct 2013
Device: none
|
Quote:
I don't think the new spelling rules forbid either straight or curly apostrophes in German: Duden | Apostroph http://www.duden.de/sprachwissen/rec...geln/apostroph |
|
|
01-20-2017, 03:33 PM | #33 | |
Zealot
Posts: 105
Karma: 10
Join Date: Oct 2013
Device: none
|
Quote:
This is the same solution that was mentioned in this thread before I asked my question. I misunderstood it: I thought I had to change my book text to straight apostrophes, but I only have to change my user dictionary. So, thank you very much to all who answered my question. |
|
01-21-2017, 06:48 AM | #34 |
mostly an observer
Posts: 1,515
Karma: 987654
Join Date: Dec 2012
Device: Kindle
|
FWIW, apostrophes should be curly. It is only a small sub-set of computer boffins who hold otherwise. The straight mark is properly reserved for mathematical notations (feet, minutes, etc.).
|
01-21-2017, 08:01 AM | #35 |
null operator (he/him)
Posts: 20,568
Karma: 26954694
Join Date: Mar 2012
Location: Sydney Australia
Device: none
|
@Notjohn - The mathematical and musical notation symbols you're referring to are properly called 'primes'.
In typography they are distinct from the so-called straight/typewriter/programmer single and double quotes ( ' " ), U+0027 and U+0022, although those are what most people use. There are single, double and triple primes, right leaning ( ′ ″ ‴ ), U+2032-4, and left leaning ( ‵ ‶ ‷ ), U+2035-7. I think 'primes' are sometimes referred to as 'ticks', or maybe those are something else. BR |
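For anyone who wants to check which character actually ended up in their text, here is a small Python sketch that prints the code point and official Unicode name of the characters mentioned above. It only uses the standard `unicodedata` module; the helper name `describe` is made up for this example.

```python
import unicodedata

# Straight quotes, the curly apostrophe, and the six primes mentioned above.
CHARS = "'\"\u2019\u2032\u2033\u2034\u2035\u2036\u2037"


def describe(ch: str) -> str:
    """Return the code point and Unicode name of a single character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


for ch in CHARS:
    print(describe(ch))
```

Pasting a suspicious character from a book into `describe()` shows whether it is an apostrophe, a curly quote, or a prime.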
|
01-21-2017, 10:17 AM | #36 |
Sigil Developer
Posts: 7,644
Karma: 5433388
Join Date: Nov 2009
Device: many
|
FWIW, if the dictionary .dic/.aff uses utf-8 and has the proper iconv and oconv lines in the .aff, then even your personal/default wordlist can have and use curly single quotes.
KevinH |
01-21-2017, 10:31 AM | #37 | |
Grand Sorcerer
Posts: 5,584
Karma: 22735033
Join Date: Dec 2010
Device: Kindle PW2
|
Quote:
I know that you're 100% convinced that you're right, but could you nevertheless follow the steps in this post with my files and my iconv/oconv lines and let me know whether you got different results and how I'd need to change the iconv/oconv lines so that they work with user lists with and without curly quotes? |
|
01-21-2017, 11:03 AM | #38 |
Sigil Developer
Posts: 7,644
Karma: 5433388
Join Date: Nov 2009
Device: many
|
I am only going by what the code says. Sigil copies the word to be checked and converts all smart single quotes to dumb single quotes when spellchecking a word. The words from user lists are then added to the Hunspell dictionary using this same approach, but the user lists should retain the original form of the word.
Those Hunspell iconv aff lines should convert all smart single quotes to dumb ones (just like Sigil does) upon input, and the oconv lines should convert suggestions from dumb single quotes to smart ones. All of this will only work if the actual encoding supports both characters. ISO8859-1 does not support smart single quotes. The oconv lines will not allow mixing of suggestions; they should simply change any dumb single quote to a smart single quote. I will run some tests with my newly created de_DE to validate this. KevinH Last edited by KevinH; 01-21-2017 at 11:06 AM. |
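To make the two conversion directions concrete, here is a rough Python sketch of the behaviour described above. It is not Sigil's or Hunspell's actual code: the function names are invented for illustration, and it assumes that only U+2019 (’) needs mapping to U+0027 (').

```python
# Illustrative only: mimics the behaviour described in the post above.
RIGHT_SINGLE_QUOTE = "\u2019"  # curly / "smart" single quote
APOSTROPHE = "\u0027"          # straight / "dumb" single quote


def to_dumb(word: str) -> str:
    """Input side (what ICONV is meant to do): curly -> straight."""
    return word.replace(RIGHT_SINGLE_QUOTE, APOSTROPHE)


def to_smart(word: str) -> str:
    """Output side (what OCONV is meant to do): straight -> curly."""
    return word.replace(APOSTROPHE, RIGHT_SINGLE_QUOTE)


def build_lookup(user_words):
    """Store user words in normalized form so either quote style matches."""
    return {to_dumb(w) for w in user_words}


def is_known(word: str, lookup) -> bool:
    """Check a word after applying the same normalization."""
    return to_dumb(word) in lookup


lookup = build_lookup(["halt\u2019s"])     # user list saved with a curly quote
print(is_known("halt's", lookup))          # straight form in the book text
print(is_known("halt\u2019s", lookup))     # curly form in the book text
print(to_smart("halt's"))                  # a suggestion on its way out
```

Because both the user words and the checked word go through the same input mapping, it does not matter which quote style the user list was saved with, as long as the encoding can represent both characters.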
01-21-2017, 12:47 PM | #39 |
Sigil Developer
Posts: 7,644
Karma: 5433388
Join Date: Nov 2009
Device: many
|
@Doitsu,
The modern de_DE frami dictionary has no prefixes or suffixes assigned in the .aff file that use the single quote (dumb or smart). Nor does it list a single quote as a possible char to even try inserting. So after converting to UTF-8, I added the dumb single quote to WORDCHARS and TRY, and added the ICONV and OCONV sections to the .aff file. A grep on my modified de_DE.aff file for dumb single quotes shows the following:
Code:
grep \' de_DE.aff
TRY esijanrtolcdugmphbyfvkwqxzäüößáéêàâñESIJANRTOLCDUGMPHBYFVKWQXZÄÜÖÉ-.'
WORDCHARS ß-.'
ICONV ’ '
OCONV ' ’
So if there are no SFX flags with a single quote in them, then the only way words with contractions can be considered valid words is if they are included in the dictionary wordlist itself. Unfortunately, grepping for a single quote in de_DE.dic shows only the following (and there were no words with smart single quotes in them at all):
Code:
grep \' de_DE.dic
Horsd'oeuvre/Sm
Ku'damm/ST
Xi'an/S
d'hondtsch/A
horsd'oeuvre/Sozm
The next test was to add either "halt's" or "halt’s" to my user wordlist. Given that Sigil always handles the role of iconv (in the sense that it maps smart single quotes to their dumb version), and given the oconv lines I added, I expected the suggestion for a misspelt word like "chalt's" or "chalt’s" to come back as "halt’s" (the smart single quote version). That is exactly what I observed.
So the current de_DE_frami German dictionary really needs a lot of work:
- both the .aff and the .dic need to be converted to utf-8
- the .aff file needs the proper iconv and oconv sections added
- a dumb single quote needs to be added to the TRY line
- an SFX to handle common contractions with a dumb single quote needs to be added to the .aff file
- a list of the most commonly used contractions in modern German should be created and added to an "unmunched" wordlist, and the entire new wordlist re-munched to create a new .dic file that uses the new SFX flag for contractions.
Not a bug anywhere in Sigil as far as I can tell. It is just that the de_DE dictionary for modern German leaves a lot to be desired.
Hope this helps,
KevinH Last edited by KevinH; 01-21-2017 at 01:40 PM. |
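The first two items on that list (re-encoding to UTF-8 and adding the iconv/oconv sections) could be scripted roughly as in the Python sketch below. It assumes the source files are ISO8859-1 and that the .aff declares its encoding with a `SET` line; the function names are invented for this example. It does not touch TRY or WORDCHARS, and it does not create any SFX rules.

```python
import re

# The ICONV/OCONV sections quoted earlier in this thread.
CONV_BLOCK = "ICONV 1\nICONV \u2019 '\nOCONV 1\nOCONV ' \u2019\n"


def upgrade_aff_text(text: str) -> str:
    """Point the SET line at UTF-8 and append ICONV/OCONV if none exist."""
    text = re.sub(r"(?m)^SET[ \t]+\S+", "SET UTF-8", text, count=1)
    if not re.search(r"(?m)^ICONV\b", text):
        if not text.endswith("\n"):
            text += "\n"
        text += CONV_BLOCK
    return text


def convert_file(src, dst, src_encoding="iso-8859-1"):
    """Re-encode a .dic or .aff file to UTF-8 (source encoding is assumed)."""
    with open(src, encoding=src_encoding, newline="") as f:
        text = f.read()
    if str(src).endswith(".aff"):
        text = upgrade_aff_text(text)
    with open(dst, "w", encoding="utf-8", newline="") as f:
        f.write(text)
```

Calling `convert_file("de_DE.aff", "de_DE_utf8.aff")` and the same for the .dic would write converted copies next to the originals (the output file names here are only placeholders), so nothing is overwritten until the result has been checked.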
01-21-2017, 12:58 PM | #40 |
Sigil Developer
Posts: 7,644
Karma: 5433388
Join Date: Nov 2009
Device: many
|
BTW, if a German speaker would create a wordlist of properly spelled German words with contractions (each with the proper root word associated with it), we can try to create a proper SFX and then add those words manually (very carefully) to the dictionary file, without having to use the old munch and unmunch, since those programs do not support most of the bells and whistles of more modern Hunspell.
Otherwise the only other response is to simply tell users to create their own user wordlist of these contractions to help improve the de_DE dictionary. KevinH Last edited by KevinH; 01-21-2017 at 01:02 PM. |
01-21-2017, 01:42 PM | #41 | ||
Grand Sorcerer
Posts: 5,584
Karma: 22735033
Join Date: Dec 2010
Device: Kindle PW2
|
Quote:
Quote:
Anybody who wants to spellcheck German epubs with lots of contractions can simply use a custom word list. |
||
01-21-2017, 03:33 PM | #42 |
Sigil Developer
Posts: 7,644
Karma: 5433388
Join Date: Nov 2009
Device: many
|
@Doitsu,
Understood. KevinH |
01-21-2017, 05:25 PM | #43 | |
null operator (he/him)
Posts: 20,568
Karma: 26954694
Join Date: Mar 2012
Location: Sydney Australia
Device: none
|
Quote:
@Doitsu - do you know of any utilities to add words to an existing hunspell dictionary? I've looked around a couple of times; all I've ever found were instructions on how to do it manually, which isn't exactly suited to occasional use. I'm thinking of something that would work through a word list, prompting whether to add each word (discarded words could be put in a file for the tool's subsequent use), and for any additional properties needed for the .dic, the hyph_?.dic, and the .aff file entries - and optionally the .dat and the .idx files (curious - what uses them?). A standalone utility, not something built into Sigil itself, or a plugin, or anything similar. Assuming it exists, such a utility would help address the 'word lists not used for suggestions' issue. To use the revised dictionary in Sigil one would need to add it to "...\sigil-ebook\sigil\hunspell_dictionaries" as a custom dictionary. @KevinH - it would be nice if Sigil could process the OXTs that one wants to use as custom dictionaries and extract what it needs, rather than having to do it manually. I screwed up several times when I first wanted to add some. BR |
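As a very rough idea of what the core of such a utility might look like, here is a Python sketch that appends plain words (no affix flags) to the text of a .dic file. It relies on the Hunspell .dic layout of a word count on the first line followed by one `word` or `word/FLAGS` entry per line. The function name is invented, and the prompting, the discard file, and the hyphenation/thesaurus files are not covered.

```python
def add_words(dic_text: str, new_words: list[str]) -> str:
    """Return .dic text with new_words appended and the count line updated.

    Words are added without affix flags; entries already present
    (compared on the part before '/') are skipped.
    """
    lines = dic_text.splitlines()
    # lines[0] is the word count; everything after it is an entry.
    entries = [line for line in lines[1:] if line.strip()]
    known = {entry.split("/", 1)[0] for entry in entries}
    for word in new_words:
        if word and word not in known:
            entries.append(word)
            known.add(word)
    return "\n".join([str(len(entries))] + entries) + "\n"


print(add_words("2\nfoo/S\nbar\n", ["halt's", "bar"]))
```

A prompt-per-word loop could be wrapped around `add_words`, but choosing the right affix flags for each word would still need someone who knows the .aff file.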
|
01-21-2017, 06:37 PM | #44 |
Sigil Developer
Posts: 7,644
Karma: 5433388
Join Date: Nov 2009
Device: many
|
Sigil adds the words in user wordlists to the current Hunspell dictionary when it is first opened, so suggestions in Sigil are also generated from user wordlist words.
Oxt files are just zip archives, so any zip utility can unpack them. All you need are the .dic and .aff files. The hyphenation and thesaurus dictionaries are not currently used or installed. So just unzip the .oxt file and install those two files. You could use a plugin to automate this easily as well. Perhaps you can convince someone to throw one together. KevinH |
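Since an .oxt is just a zip archive, the unpacking step can be sketched in a few lines of Python with the standard `zipfile` module. This is only an illustration, not a Sigil plugin: the function names are invented, it assumes hyphenation files are named `hyph_*.dic` as mentioned earlier in the thread, and the destination folder is left for the caller to supply.

```python
import zipfile
from pathlib import Path, PurePosixPath


def is_spelling_file(member: str) -> bool:
    """True for .dic/.aff members, skipping hyph_*.dic hyphenation files."""
    name = PurePosixPath(member).name.lower()
    if name.startswith("hyph_"):
        return False
    return name.endswith((".dic", ".aff"))


def extract_dictionaries(oxt_path, dest_dir):
    """Copy the spelling .dic/.aff files from an .oxt into dest_dir."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    written = []
    with zipfile.ZipFile(oxt_path) as archive:
        for member in archive.namelist():
            if is_spelling_file(member):
                # Drop any sub-folders inside the archive.
                target = dest / PurePosixPath(member).name
                target.write_bytes(archive.read(member))
                written.append(target)
    return written
```

`extract_dictionaries` returns the list of files it wrote, so it is easy to check that exactly one .dic and one .aff came out before restarting Sigil.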
01-21-2017, 07:51 PM | #45 | |||
null operator (he/him)
Posts: 20,568
Karma: 26954694
Join Date: Mar 2012
Location: Sydney Australia
Device: none
|
Quote:
Quote:
Quote:
@Doitsu - my enquiry re a utility still stands. BR |
|||