Old 01-20-2017, 01:46 PM   #31
Doitsu
Grand Sorcerer
Posts: 5,582
Karma: 22735033
Join Date: Dec 2010
Device: Kindle PW2
Quote:
Originally Posted by AnselmD View Post
If I add the words to the user-defined dictionary in LibreOffice (using the OLD_SPELL dictionary), they are still flagged as misspelled. I will check this with the default German dictionary later. Has anyone managed to add them so that they are no longer flagged?
@AnselmD:

Try the following:

1. Unpack the utf-8 encoded latest version of the de_DE_frami dictionaries to

Code:
C:\Program Files\Sigil\hunspell_dictionaries
2. Extract the straight and curly files to the user_dictionaries folder. (Edit > Preferences > Open Preferences Location > user_dictionaries)

3. Select de_DE_frami as the default dictionary and straight as your user dictionary and open your test case file. (Make sure that only one user dictionary is selected.)
The words previously flagged as typos should no longer be flagged. Then uncheck straight and check curly; this should flag them again as typos.

I.e., you'll have to decide whether you want to save contracted words in your user dictionary with straight or curly quotes.

If you'd rather save contracted words in your user dictionary with curly apostrophes, either comment out the following lines in the affix file:

Code:
#ICONV 1
#ICONV ’ '
#OCONV 1
#OCONV ' ’
or, if you already have a large custom dictionary with contractions, simply replace all curly apostrophes in the user dictionary with straight ones.
Attached Files
File Type: zip de_frami_utf8.zip (1.10 MB, 194 views)
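
The second option above (replacing every curly apostrophe in the user dictionary with a straight one) could be scripted roughly as follows. This is only a sketch: it assumes the user dictionary is a plain UTF-8 text file with one word per line, and the function names are made up for illustration.

```python
# Hypothetical helper names; assumes a UTF-8 word list with one word per line.
CURLY_APOSTROPHE = "\u2019"     # ’ RIGHT SINGLE QUOTATION MARK
STRAIGHT_APOSTROPHE = "\u0027"  # ' APOSTROPHE


def straighten_apostrophes(words):
    """Return the words with every curly apostrophe replaced by a straight one."""
    return [w.replace(CURLY_APOSTROPHE, STRAIGHT_APOSTROPHE) for w in words]


def straighten_file(path):
    """Rewrite a word-list file in place, one word per line."""
    with open(path, encoding="utf-8") as fh:
        words = fh.read().splitlines()
    with open(path, "w", encoding="utf-8") as fh:
        fh.writelines(w + "\n" for w in straighten_apostrophes(words))


print(straighten_apostrophes(["halt\u2019s", "geht's", "Haus"]))
```

Back up the word list before running something like `straighten_file` on it, since the file is overwritten.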
Old 01-20-2017, 03:19 PM   #32
AnselmD
Zealot
AnselmD began at the beginning.
 
Posts: 105
Karma: 10
Join Date: Oct 2013
Device: none
Quote:
Originally Posted by Doitsu View Post
because the 1996 spelling reform officially sanctioned the use of English style possessive constructions with straight and curly apostrophes in German. (The pre-1996 spelling rules only allowed straight and curly apostrophes in contracted words.)

I don't think the new spelling rules forbid straight and curly apostrophes in German:
Duden | Apostroph
http://www.duden.de/sprachwissen/rec...geln/apostroph
Old 01-20-2017, 03:33 PM   #33
AnselmD
Zealot
AnselmD began at the beginning.
 
Posts: 105
Karma: 10
Join Date: Oct 2013
Device: none
Quote:
Originally Posted by Doitsu View Post
@AnselmD:
or, if you already have a large custom dictionary with contractions, simply replace all curly apostrophes in the user dictionary with straight ones.
OK, thank you, this will work for me.
This is the same solution that was mentioned in this thread before I asked my question.
I misunderstood it: I thought I had to change my book text to straight apostrophes, but I only have to change my user dictionary.

So, thank you very much to all who answered my question.
Old 01-21-2017, 06:48 AM   #34
Notjohn
mostly an observer
 
Posts: 1,515
Karma: 987654
Join Date: Dec 2012
Device: Kindle
FWIW, apostrophes should be curly. It is only a small sub-set of computer boffins who hold otherwise. The straight mark is properly reserved for mathematical notations (feet, minutes, etc.).
Old 01-21-2017, 08:01 AM   #35
BetterRed
null operator (he/him)
 
Posts: 20,457
Karma: 26645808
Join Date: Mar 2012
Location: Sydney Australia
Device: none
@Notjohn - The mathematical and musical notation symbols you're referring to are properly called 'primes'.

In typography they are distinct from the so-called straight/typewriter/programmer single and double quotes ( ' " ), U+0027 and U+0022, although those are what most people use.

There are single, double and triple primes, right leaning ( ′ ″ ‴ ), U+2032 to U+2034, and left leaning ( ‵ ‶ ‷ ), U+2035 to U+2037.

I think 'primes' are sometimes referred to as 'ticks', or maybe those are something else.
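
For anyone who wants to check which character a text actually contains, a few lines of Python with the standard `unicodedata` module will print the official Unicode names. The `describe` helper is just an illustrative name; the curly apostrophe U+2019 discussed earlier in the thread is included for comparison.

```python
import unicodedata

# Straight quotes, the curly apostrophe, then the right- and left-leaning primes.
CHARS = "\u0027\u0022\u2019\u2032\u2033\u2034\u2035\u2036\u2037"


def describe(ch):
    """Return the code point and official Unicode name of one character."""
    return "U+%04X %s" % (ord(ch), unicodedata.name(ch))


for ch in CHARS:
    print(describe(ch))
```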

BR
Old 01-21-2017, 10:17 AM   #36
KevinH
Sigil Developer
 
Posts: 7,506
Karma: 5433350
Join Date: Nov 2009
Device: many
FWIW, if the dictionary .dic/.aff uses utf-8 and has the proper iconv and oconv lines in the .aff, then even your personal/default wordlist can have and use curly single quotes.

KevinH
Old 01-21-2017, 10:31 AM   #37
Doitsu
Grand Sorcerer
Posts: 5,582
Karma: 22735033
Join Date: Dec 2010
Device: Kindle PW2
Quote:
Originally Posted by KevinH View Post
FWIW, if the dictionary .dic/.aff uses utf-8 and has the proper iconv and oconv lines in the .aff, then even your personal/default wordlist can have and use curly single quotes.
In that case my iconv/oconv syntax must be incorrect, because I get different results with user dictionaries with and without curly quotes and with and without iconv/oconv lines on my Windows machine.

I know that you're 100% convinced that you're right, but could you nevertheless follow the steps in this post with my files and my iconv/oconv lines and let me know whether you got different results and how I'd need to change the iconv/oconv lines so that they work with user lists with and without curly quotes?
Old 01-21-2017, 11:03 AM   #38
KevinH
Sigil Developer
 
Posts: 7,506
Karma: 5433350
Join Date: Nov 2009
Device: many
I am only going by what the code says. Sigil copies the word to be checked and converts all smart single quotes to dumb single quotes when spellchecking a word. The words from user lists are then added to the Hunspell dictionary using this same approach, but the user lists should retain the original form of the word.

Those Hunspell iconv aff lines should convert all smart single quotes to dumb ones (just like Sigil does) upon input, and the oconv lines should convert suggestions from dumb single quotes to smart ones.

All of this will only work if the actual encoding supports both characters. ISO8859-1 does not support smart single quotes. The oconv lines will not allow mixing of suggestions; they should simply change any dumb single quote to a smart single quote. I will run some tests with my newly created de_DE to validate this.
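
The behaviour described above can be modelled in a few lines of Python. This is a toy model only, not Sigil's or Hunspell's actual code; all function names are invented for illustration, and the "dictionary" is just a set of strings.

```python
SMART = "\u2019"  # ’ smart single quote
DUMB = "'"        # ' dumb single quote


def to_dumb(word):
    """Input side: what an 'ICONV ’ \'' rule (and Sigil itself) is described as doing."""
    return word.replace(SMART, DUMB)


def to_smart(word):
    """Output side: what an 'OCONV \' ’' rule is described as doing to suggestions."""
    return word.replace(DUMB, SMART)


def is_known(word, user_words):
    """Compare in the dumb-quote form, so either quote style matches."""
    known = {to_dumb(w) for w in user_words}
    return to_dumb(word) in known


def can_encode(text, encoding):
    """True if every character of text exists in the given encoding."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


print(is_known("halt\u2019s", ["halt's"]))  # smart-quoted text, dumb-quoted list
print(to_smart("halt's"))                   # a suggestion after output conversion
print(can_encode(SMART, "iso-8859-1"), can_encode(SMART, "utf-8"))
```

The last line illustrates the encoding point: the smart single quote cannot be represented in ISO8859-1 at all, so the conversion lines only make sense in a UTF-8 dictionary.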

KevinH

Last edited by KevinH; 01-21-2017 at 11:06 AM.
Old 01-21-2017, 12:47 PM   #39
KevinH
Sigil Developer
 
Posts: 7,506
Karma: 5433350
Join Date: Nov 2009
Device: many
@Doitsu,

The modern de_DE frami dictionary has no prefixes or suffixes assigned in the .aff file that use the single quote (dumb or smart). Nor does it list a single quote as a possible char to even try inserting.

So after converting to UTF-8, I added the dumb single quote to WORDCHARS, TRY, and added the ICONV and OCONV sections to the aff file.

So a grep on my modified de_DE.aff file for dumb single quotes shows the following:

Code:
grep \' de_DE.aff
TRY esijanrtolcdugmphbyfvkwqxzäüößáéêàâñESIJANRTOLCDUGMPHBYFVKWQXZÄÜÖÉ-.'
WORDCHARS ß-.'
ICONV ’ '
OCONV ' ’
But still no luck with any contractions in German.

So if there are no SFX flags with a single quote in them, then the only way words with contractions can be considered valid words is if they are included in the dictionary wordlist itself.

Unfortunately grepping for a single quote in the de_DE.dic shows the following: (and there were no words with smart single quotes in them at all)

Code:
grep \' de_DE.dic
Horsd'oeuvre/Sm
Ku'damm/ST
Xi'an/S
d'hondtsch/A
horsd'oeuvre/Sozm
so words like "halt's" with dumb or smart single quotes will always be marked as not spelled correctly.
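
For readers without grep (e.g. on Windows), the same check on the .dic file can be sketched in Python. The parsing is simplified and the function name is made up: it assumes each entry is `word/FLAGS` or just `word`, and that a purely numeric line is the word count. The sample lines are the five entries from the grep output above plus one hypothetical filler entry (`Beispiel/S`) to show the filtering.

```python
def entries_with_apostrophe(dic_lines):
    """Yield (word, flags) for entries whose word contains ' or ’."""
    for line in dic_lines:
        entry = line.strip()
        if not entry or entry.isdigit():  # skip blank lines and the word count
            continue
        word, _, flags = entry.partition("/")
        if "'" in word or "\u2019" in word:
            yield word, flags


SAMPLE_DIC = [
    "6",
    "Beispiel/S",  # hypothetical entry, not taken from the real dictionary
    "Horsd'oeuvre/Sm",
    "Ku'damm/ST",
    "Xi'an/S",
    "d'hondtsch/A",
    "horsd'oeuvre/Sozm",
]

for word, flags in entries_with_apostrophe(SAMPLE_DIC):
    print(word, flags)
```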

The next test was to add either "halt's" or "halt’s" to my user wordlist.

Given that Sigil always handles the role of iconv (in the sense that it maps smart single quotes to their dumb version), and given the oconv lines I added, I expected the suggestion for a misspelt word like "chalt's" or "chalt’s" to come back as "halt’s" (the smart single quote version).

That is exactly what I observed.

So the current de_DE_frami German dictionary really needs a lot of work:

- both .aff and .dic need to be converted to utf-8
- the .aff file needs the proper iconv and oconv sections added
- a dumb single quote needs to be added to the TRY line
- an SFX to handle common contractions with a dumb single quote needs to be added to the .aff file
- a list of the most commonly used contractions in modern German should be created, added to an "unmunched" wordlist and the entire new wordlist re-munched to create a new .dic file that uses the new SFX flag for contractions.

Not a bug anywhere in Sigil as far as I can tell. It is just the de_DE dictionary for modern German leaves a lot to be desired.
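
The .aff-related items in the list above can be checked mechanically. The following is a rough sketch with a made-up function name; it only looks for the lines mentioned in this thread plus a `SET UTF-8` line, which is an assumption about how the encoding is declared based on the Hunspell affix format, and it does not test for a contraction SFX rule. The sample text reuses the TRY, WORDCHARS, ICONV and OCONV lines quoted earlier.

```python
def aff_checklist(aff_text):
    """Report which apostrophe-related settings appear in an .aff file's text."""
    lines = [ln.strip() for ln in aff_text.splitlines()]

    def has(prefix, needle):
        return any(ln.startswith(prefix) and needle in ln for ln in lines)

    return {
        "utf8": has("SET ", "UTF-8"),
        "iconv": has("ICONV ", "\u2019 '"),   # smart quote in, dumb quote used
        "oconv": has("OCONV ", "' \u2019"),   # dumb quote out as smart quote
        "try_has_apostrophe": has("TRY ", "'"),
        "wordchars_has_apostrophe": has("WORDCHARS ", "'"),
    }


SAMPLE_AFF = """SET UTF-8
TRY esijanrtolcdugmphbyfvkwqxzäüößáéêàâñESIJANRTOLCDUGMPHBYFVKWQXZÄÜÖÉ-.'
WORDCHARS ß-.'
ICONV 1
ICONV \u2019 '
OCONV 1
OCONV ' \u2019
"""

for item, ok in aff_checklist(SAMPLE_AFF).items():
    print(item, ok)
```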

Hope this helps,

KevinH

Last edited by KevinH; 01-21-2017 at 01:40 PM.
Old 01-21-2017, 12:58 PM   #40
KevinH
Sigil Developer
 
Posts: 7,506
Karma: 5433350
Join Date: Nov 2009
Device: many
BTW, if a German speaker would create a wordlist of properly spelled German words with contractions (each with its proper root word), we can try to create a proper SFX and then add those words manually (very carefully) to the dictionary file, without having to use the old munch and unmunch, since those programs do not support most of the bells and whistles of more modern hunspell.

Otherwise the only other response is to simply tell users to create their own user wordlist of these contractions to help improve the de_DE dictionary.

KevinH

Last edited by KevinH; 01-21-2017 at 01:02 PM.
Old 01-21-2017, 01:42 PM   #41
Doitsu
Grand Sorcerer
Posts: 5,582
Karma: 22735033
Join Date: Dec 2010
Device: Kindle PW2
Quote:
Originally Posted by KevinH View Post
So the current de_DE_frami German dictionary really needs a lot of work:

- both .aff and .dic need to be converted to utf-8
- the .aff file needs the proper iconv and oconv sections added
- a dumb single quote needs to be added to the TRY line
- an SFX to handle common contractions with a dumb single quote needs to be added to the .aff file
- a list of the most commonly used contractions in modern German should be created, added to an "unmunched" wordlist and the entire new wordlist re-munched to create a new .dic file that uses the new SFX flag for contractions.

Not a bug anywhere in Sigil as far as I can tell. It is just the de_DE dictionary for modern German leaves a lot to be desired.
Thanks for looking into this! Since fixing this dictionary would take too much time, I'd recommend that you don't include the updated de_DE_frami dictionary in future builds. The current de_DE dictionary is probably not much better, but so far there haven't been any complaints from German Sigil users.

Quote:
Originally Posted by KevinH View Post
BTW, if a German speaker would create a wordlist of properly spelled german words with contractions (with the proper root word associated with it), we can try to create a proper SFX and then add those words manually (very carefully) to the dictionary file (without having to use the old munch and unmunch since those programs do not support most of the bells and whistles of more modern hunspell).
IMHO, that would be overkill, since contractions are less frequently used in standard German than in English and this is most likely also the reason why the current dictionary maintainers didn't bother to create a proper SFX.

Anybody who wants to spellcheck German epubs with lots of contractions can simply use a custom word list.
Old 01-21-2017, 03:33 PM   #42
KevinH
Sigil Developer
 
Posts: 7,506
Karma: 5433350
Join Date: Nov 2009
Device: many
@Doitsu,
Understood.

KevinH
Old 01-21-2017, 05:25 PM   #43
BetterRed
null operator (he/him)
 
Posts: 20,457
Karma: 26645808
Join Date: Mar 2012
Location: Sydney Australia
Device: none
Quote:
Originally Posted by Doitsu View Post
Anybody who wants to spellcheck German epubs with lots of contractions can simply use a custom word list.
Will Sigil use a custom word list as a source for suggestions? I have a feeling it doesn't. But I may be thinking of other software that uses the hunspell dictionaries and has spell checking dialogues similar to those in Sigil. I don't have Sigil on this device so I can't check.

@Doitsu - do you know of any utilities to add words to an existing hunspell dictionary? I've looked around a couple of times; all I've ever found were instructions on how to do it manually, which isn't exactly suited to occasional use.

I'm thinking of something that would work through a word list, prompting whether to add each word (discarded words could be put in a file for the tool's subsequent use), and for any additional properties needed for the .dic, the hyph_?.dic, and the .aff file entries - and optionally the .dat and the .idx files (curious - what uses them). A standalone utility, not something built into Sigil itself, or a plugin, or anything similar.

Assuming it exists, such a utility would help address 'word lists not used for suggestions' issue. To use the revised dictionary in Sigil one would need to add it to "...\sigil-ebook\sigil\hunspell_dictionaries" as a custom dictionary.
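
As a very rough idea of the core of such a utility, here is a Python sketch that merges new words into the lines of a .dic file and updates the word count on the first line. It is not an existing tool: the function name is invented, it adds words without any affix flags, and it leaves out the interactive prompting and the hyphenation, .dat and .idx files mentioned above. The sample entries are hypothetical apart from `Ku'damm/ST`, which appears earlier in the thread.

```python
def add_words(dic_lines, new_words):
    """Return new .dic lines with new_words appended and the count line updated.

    dic_lines[0] is expected to be the word count; entries are 'word' or 'word/FLAGS'.
    """
    entries = [ln for ln in dic_lines[1:] if ln.strip()]
    existing = {ln.split("/", 1)[0] for ln in entries}
    for word in new_words:
        if word not in existing:  # do not add a word that is already listed
            entries.append(word)
            existing.add(word)
    return [str(len(entries))] + entries


print(add_words(["2", "Haus/S", "Ku'damm/ST"], ["halt's", "Haus"]))
```

Words added this way behave like plain entries; getting inflected forms would still require assigning the right flags by hand.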

@KevinH - it would be nice if Sigil could process the OXTs that one wants to use as custom dictionaries and extract what it needs, rather than having to do it manually. I screwed up several times when I first wanted to add some.

BR
Old 01-21-2017, 06:37 PM   #44
KevinH
Sigil Developer
 
Posts: 7,506
Karma: 5433350
Join Date: Nov 2009
Device: many
Sigil adds the words in user wordlists to the current hunspell dictionary when it is first opened, so suggestions in Sigil are also generated from user wordlist words.

Oxt files are just zip archives. So any zip utility can unpack them. All you need are the .dic and .aff files. The hyphenation and thesaurus dictionaries are not used or currently installed. So just unzip the .oxt file and install them.

You could use a plugin to automate this easily as well. Perhaps you can convince someone to throw one together.
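
Since an .oxt is a zip archive, the unpacking step can be done with Python's standard `zipfile` module. This is a standalone sketch, not a Sigil plugin, and the function name and the paths in the usage comment are placeholders. Note that it copies every file ending in .dic, so a hyphenation file such as hyph_*.dic would be copied as well.

```python
import os
import zipfile


def extract_dictionary_files(oxt_path, dest_dir):
    """Copy every .dic/.aff member of an .oxt archive into dest_dir (flattened)."""
    extracted = []
    os.makedirs(dest_dir, exist_ok=True)
    with zipfile.ZipFile(oxt_path) as archive:
        for member in archive.namelist():
            name = os.path.basename(member)  # drop any folder inside the archive
            if name.lower().endswith((".dic", ".aff")):
                target = os.path.join(dest_dir, name)
                with archive.open(member) as src, open(target, "wb") as dst:
                    dst.write(src.read())
                extracted.append(name)
    return sorted(extracted)


# Example call (placeholder paths):
# extract_dictionary_files("some_dictionary.oxt", "hunspell_dictionaries")
```

Restart Sigil after copying the files so that the new dictionary shows up.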

KevinH
Old 01-21-2017, 07:51 PM   #45
BetterRed
null operator (he/him)
 
Posts: 20,457
Karma: 26645808
Join Date: Mar 2012
Location: Sydney Australia
Device: none
Quote:
Originally Posted by KevinH View Post
Sigil adds the words in user wordlists to the current hunspell dictionary when it is first opened, so suggestions in Sigil are also generated from user wordlist words.
Had a gut feeling that Sigil might not have the issue, that's why I qualified with a "But...". Maybe it's calibre that has the issue - but its spellchecker is multi-lingual Ψ²

Quote:
Originally Posted by KevinH View Post
Oxt files are just zip archives. So any zip utility can unpack them. All you need are the .dic and .aff files. The hyphenation and thesaurus dictionaries are not used or currently installed. So just unzip the .oxt file and install them.
The hyphenation dictionaries are installed. And the manual says they must be installed for additional standard dictionaries.

Quote:
To add other standard dictionaries, such as ones found at the OpenOffice Dictionaries site, download, extract, and copy the files, e.g. en_GB.aff, en_GB.dic, hyph_en_GB.dic to the "hunspell_dictionaries" directory location and restart Sigil.
If something were built in as suggested, new users wouldn't be tempted to put the files in the install directories, as a couple of my non-Anglo colleagues did. They looked for something in Preferences->Spellcheck Dictionaries. They're - "RTFM! - what's that?, tl;dr? - absolutely" - millennials.

@Doitsu - my enquiry re a utility still stands.

BR
Attached Thumbnails
Name: 3.JPG (92.9 KB)
Tags
bug report, feature request, punctuation, sigil, unicode
