Old 01-23-2017, 05:42 AM   #61
AnselmD
Zealot
AnselmD began at the beginning.
 
Posts: 105
Karma: 10
Join Date: Oct 2013
Device: none
Quote:
Originally Posted by BetterRed View Post
My 'problem' is that standard dictionaries (including paper ones) tend to be sparse when it comes to knowledge domain specific words.

BR

For German, there is a supplementary dictionary:
81,000 technical terms from mathematics, physics, chemistry, life sciences, IT, technology, industry, geosciences

Maybe one exists for your language too...

Deutsches Fachspezifisches Ergänzungswörterbuch German Specialist Supplemental Dictionary | Apache OpenOffice Extensions
http://extensions.openoffice.org/en/...tal-dictionary
Old 01-23-2017, 06:14 AM   #62
Notjohn
mostly an observer
 
Posts: 1,515
Karma: 987654
Join Date: Dec 2012
Device: Kindle
Quote:
Originally Posted by AnselmD View Post
But "geht's" is not misspelled.
Amazing. I lived in Frankfurt the better part of a year; I must have said Wie gehts?! a thousand times, and never dreamed it was a contraction.
Old 01-23-2017, 08:09 AM   #63
AnselmD
The following is a typical German Keyboard (T1):
You can hover with the mouse over the keys.
https://de.wikipedia.org/wiki/Tastat...C3.96sterreich



To the left of the backspace key you can see
(`) grave accent (Gravis)
(´) acute accent (Akut)

(’) The typographically correct apostrophe is not accessible with a single key; you have to type ALT+0146 (see https://de.wikipedia.org/wiki/Apostr...afisch_korrekt)

To the left of the return key (sharing a key with #):
(') apostrophe (typographically incorrect replacement character) (https://de.wikipedia.org/wiki/Apostr...rafisch_falsch)

So a typical German (like me) uses this typographically incorrect apostrophe (')
or
the acute accent (´), which is wrong.

The German Duden (dictionary of the German language) uses the typographically incorrect apostrophe (') in its article about the apostrophe:
http://www.duden.de/sprachwissen/rec...geln/apostroph

So I checked this with Duden's online spellchecker:
Duden | Rechtschreibprüfung Online Betaversion
http://www.duden.de/rechtschreibpruefung-online

(´) acute accent U+00B4:
Halt´s Maul! Macht´s gut! Da gab´s keinen! Ich hab´s! Wie geht´s!

(’) apostrophe (typographically correct) U+2019 [Alt+0146]:
Halt’s Maul! Macht’s gut! Da gab’s keinen! Ich hab’s! Wie geht’s!

(') replacement character for the apostrophe U+0027 (&apos; since HTML 5):
Halt's Maul! Macht's gut! Da gab's keinen! Ich hab's! Wie geht's!

As you can see, the text with the acute accent is accepted as correct, while the text with a real apostrophe is flagged as misspelled.

So they do it wrong!

I checked this again with Sigil and the German Hunspell dictionary. It makes the same mistake. Additionally, I used the HTML entities for ' and ’, but they are converted to the real characters when saving.



This is the new version of the testcase epub:
https://www.mobileread.com/forums/at...1&d=1485176688

What can I do to make the differences between the acute accent and the two versions of the apostrophe visible in the book views of Sigil?

[Attached screenshots: Duden online spellchecker (Rechtschreibprüfung Online, beta version); testcase.epub in Sigil; testcase.epub in Sigil's book view]
[Attached file: testcase.epub]

Last edited by AnselmD; 01-23-2017 at 08:17 AM.
Old 01-23-2017, 08:14 AM   #64
AnselmD
Quote:
Originally Posted by Notjohn View Post
Amazing. I lived in Frankfurt the better part of a year; I must have said Wie gehts?! a thousand times, and never dreamed it was a contraction.
You learned by doing and not by studying. Ask a German child and it will not know either.

Therefore, in written language I prefer using an apostrophe (as an omission mark) to make it visible that something is missing here.
Wie geht’s?!
Wie geht's?!
Old 01-23-2017, 08:55 AM   #65
Doitsu
Grand Sorcerer
Posts: 5,584
Karma: 22735033
Join Date: Dec 2010
Device: Kindle PW2
Quote:
Originally Posted by AnselmD View Post
What can I do to make the differences between the acute accent and the two versions of the apostrophe visible in the book views of Sigil?
Simply select a serif font (e.g. Charis SIL or Times New Roman) as the standard font in Sigil (Edit > Preferences > Appearance > Standard Font).
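
Independently of the font, the look-alike characters can also be told apart by their code points. The following is only a minimal Python sketch, not a Sigil feature; the function name `describe_lookalikes` and the set of characters it checks are assumptions made for illustration:

```python
import unicodedata

# Characters that are easily confused with an apostrophe in sans-serif fonts
# (illustrative selection, not an exhaustive list).
LOOKALIKES = {"\u0027", "\u2019", "\u00b4", "\u0060", "\u2018"}


def describe_lookalikes(text):
    """Return (index, code point, Unicode name) for each apostrophe look-alike."""
    found = []
    for index, char in enumerate(text):
        if char in LOOKALIKES:
            found.append((index, f"U+{ord(char):04X}", unicodedata.name(char)))
    return found


if __name__ == "__main__":
    # The three variants from the test case, written with escapes for clarity.
    for sample in ("Wie geht\u00b4s!", "Wie geht\u2019s!", "Wie geht's!"):
        print(sample, describe_lookalikes(sample))
```

Run on the three variants, this reports ACUTE ACCENT, RIGHT SINGLE QUOTATION MARK, and APOSTROPHE respectively, which is the distinction a sans-serif font tends to hide.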

Last edited by Doitsu; 01-23-2017 at 08:59 AM.
Old 01-23-2017, 09:41 AM   #66
AnselmD
Quote:
Originally Posted by Doitsu View Post
Simply select a serif font (e.g. Charis SIL or Times New Roman) as the standard font in Sigil (Edit > Preferences > Appearance > Standard Font).
Thank you, Doitsu!

Times New Roman:


Charis SIL:


Now I can see that I made a mistake when adding Wie geht’s? to the test book, so here it is again: testcase.epub:
https://www.mobileread.com/forums/at...1&d=1485182080
[Attached screenshots: testcase.epub in Sigil with Times New Roman; testcase.epub in Sigil with Charis SIL]
[Attached file: testcase.epub]
Old 01-23-2017, 10:22 AM   #67
AnselmD
Just for interest:

This German article explains whether an apostrophe has to be used under the new German spelling rules.
It distinguishes four cases:
1) may be used (kann benutzt/gesetzt werden),
2) should not be used any more (sollte nicht mehr benutzt werden),
3) must not be used (nicht benutzt werden darf),
4) must be used (muss benutzt werden)

Zwiebelfisch-Abc: Der Gebrauch des Apostrophs im Überblick - SPIEGEL ONLINE
http://www.spiegel.de/kultur/zwiebel...-a-283781.html
Old 01-23-2017, 10:32 AM   #68
Doitsu
@AnselmD: BTW, my LanguageTool Sigil plugin will flag straight apostrophes and acute accents. For example, you'll get the following message for the first error in line 11:

Quote:
Line: 11 Col: 4. Context: (´) akute U+00B4: >>Halt´s<< Maul! Macht´s gut! Da gab´s keinen! Ich hab´... AKZENT_STATT_APOSTROPH:TYPOGRAPHY: Akzent statt Apostroph: Wollten Sie einen Apostroph verwenden? Suggestion(s): Halt's, Halt’s
I realize that the plugin implementation isn't exactly very user-friendly, but it works and it's relatively easy to write your own rules.
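
The idea behind such a check can be sketched in a few lines of Python. This is not how the plugin or LanguageTool implements the AKZENT_STATT_APOSTROPH rule; it is only an illustration under the assumption that an accent or straight quote between two word characters was meant as an apostrophe. The names `flag_apostrophe_substitutes` and `SUBSTITUTES` are hypothetical:

```python
import re

# Characters often typed instead of the typographic apostrophe:
# acute accent, grave accent, straight apostrophe.
SUBSTITUTES = "\u00b4\u0060\u0027"
TYPOGRAPHIC_APOSTROPHE = "\u2019"

# A word that contains one of the substitutes between word characters.
WORD_WITH_SUBSTITUTE = re.compile(r"\w+[" + SUBSTITUTES + r"]\w+")


def flag_apostrophe_substitutes(line):
    """Return (1-based column, found word, suggested word) for each hit."""
    hits = []
    for match in WORD_WITH_SUBSTITUTE.finditer(line):
        word = match.group(0)
        suggestion = re.sub("[" + SUBSTITUTES + "]", TYPOGRAPHIC_APOSTROPHE, word)
        hits.append((match.start() + 1, word, suggestion))
    return hits


if __name__ == "__main__":
    for hit in flag_apostrophe_substitutes("Halt\u00b4s Maul! Macht\u00b4s gut!"):
        print(hit)
```

Unlike the real rule, this sketch has no notion of grammar or exceptions, so it would also flag legitimate uses of a straight apostrophe if those are acceptable in a given book.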
Old 01-23-2017, 10:41 AM   #69
AnselmD
Quote:
Originally Posted by BetterRed View Post
Will Sigil use a custom word list as a source for suggestions? I have a feeling it doesn't. But I may be thinking of other software that uses the hunspell dictionaries and has spell checking dialogues similar to those in Sigil. I don't have Sigil on this device so I can't check.

@Doitsu - do you know of any utilities to add words to an existing hunspell dictionary? I've looked around a couple of times; all I've ever found were instructions on how to do it manually, which isn't exactly suited to occasional use.

BR
@BetterRed: I have not tried it, but maybe it is of interest to you?

Proofing Tool GUI
http://marcoagpinto.cidadevirtual.pt...ngtoolgui.html
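
For occasional use, the manual procedure can also be scripted. The following is a minimal Python sketch, assuming a plain-text .dic file encoded in UTF-8 whose first line holds the entry count (the actual encoding is declared by the SET line of the matching .aff file). `add_words` is a hypothetical helper, not part of Hunspell or Sigil, and it adds bare words without affix flags:

```python
from pathlib import Path


def add_words(dic_path, new_words, encoding="utf-8"):
    """Append words that are not yet listed and update the count on line 1."""
    path = Path(dic_path)
    lines = path.read_text(encoding=encoding).splitlines()
    # The first line is the entry count; blank lines are dropped on rewrite.
    entries = [line for line in lines[1:] if line.strip()]
    # Compare on the bare word, ignoring any /FLAGS suffix.
    known = {entry.split("/", 1)[0] for entry in entries}
    # Keep only unknown words, without duplicates, in their original order.
    added = list(dict.fromkeys(w for w in new_words if w and w not in known))
    entries.extend(added)
    path.write_text("\n".join([str(len(entries))] + entries) + "\n",
                    encoding=encoding)
    return added
```

Working on a copy of the dictionary is advisable, since the file is rewritten in place.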
Old 01-23-2017, 10:54 AM   #70
KevinH
Sigil Developer
 
Posts: 7,650
Karma: 5433388
Join Date: Nov 2009
Device: many
FWIW,
I looked at the OLDSPELL dictionary and it appears to be better at handling at least some of the contractions:

Code:
grep \' *.aff
TRY esijanrtolcdugmphbyfvkwqxzäüößáéêàâñESIJANRTOLCDUGMPHBYFVKWQXZÄÜÖÉ-.'
WORDCHARS ß-.'’
ICONV ’ '
OCONV ' ’
So its affix file already has the single quote ' as part of TRY, along with the proper ICONV, OCONV, and WORDCHARS entries.
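
The effect of the ICONV and OCONV lines can be illustrated with a small Python sketch. It is a simplification under stated assumptions, not Hunspell's implementation: the two tables mirror the lines quoted above, and `convert`, `is_known`, and `present` are hypothetical names:

```python
# Conversion tables mirroring the ICONV/OCONV lines quoted above.
ICONV = {"\u2019": "'"}  # applied to the input word before the lookup
OCONV = {"'": "\u2019"}  # applied to suggestions before they are shown


def convert(text, table):
    """Replace every key of the table with its value."""
    for source, target in table.items():
        text = text.replace(source, target)
    return text


def is_known(word, dictionary):
    """Look the word up after input conversion (curly becomes straight)."""
    return convert(word, ICONV) in dictionary


def present(suggestion):
    """Convert a stored word for output (straight becomes curly)."""
    return convert(suggestion, OCONV)


if __name__ == "__main__":
    words = {"geht's", "gibt's"}  # stored with the straight apostrophe
    print(is_known("geht\u2019s", words), is_known("geht's", words))
    print(present("geht's"))
```

This is why a dictionary that stores only the straight form can accept both apostrophes and still offer the curly one in suggestions.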

There is still no suffix rule (SFX) assigned in the .aff file but there are actual words in the dictionary that include an apostrophe.

Code:
grep \' *.dic
d'hondtsch/A
geht's
gibt's
hat's
Horsd'oeuvre/Sm
horsd'oeuvre/Sozm
ist's
Ku'damm/ST
man's
wenn's
wird's
Xi'an/S
So at least some of them are supported. The only real issue is that, had a proper SFX suffix rule been created and used in the .aff file, these entries could have been reduced to single-character suffix flags attached to their root words, saving all of these extra entries.
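
The saving from such a suffix rule can be sketched in Python. This is a strong simplification (real SFX rules also support stripping and conditions), and the flag `Z` is hypothetical, not taken from the actual .aff file:

```python
def expand(entries, suffix_rules):
    """Expand 'root/FLAGS' entries into the word forms they stand for."""
    forms = []
    for entry in entries:
        root, _, flags = entry.partition("/")
        forms.append(root)
        for flag in flags:
            # Flags without a rule in this sketch are simply ignored.
            if flag in suffix_rules:
                forms.append(root + suffix_rules[flag])
    return forms


if __name__ == "__main__":
    # One hypothetical flag that appends 's to the root word.
    rules = {"Z": "'s"}
    print(expand(["geht/Z", "gibt/Z", "man"], rules))
```

With a rule like this, "geht's" and "gibt's" would no longer need their own lines; the flag on the root word would cover them.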

So, given that you have found an encoding error in your own testcase, and provided you use the OLDSPELL dictionary, you should see that "geht's" comes back as spelled correctly with both the smart and dumb versions of the apostrophe.

KevinH
Old 01-23-2017, 11:21 AM   #71
AnselmD
Quote:
Originally Posted by KevinH View Post

So, given that you have found an encoding error in your own testcase, and provided you use the OLDSPELL dictionary, you should see that "geht's" comes back as spelled correctly with both the smart and dumb versions of the apostrophe.

KevinH
Yes, I checked it. And in this case it offers the correct suggestion for geht’s (with curly apostrophe):


Btw, the "correctly" spelled words with the acute symbol are a bad example. If I use other special symbols instead of the acute symbol, the words are also accepted as correct.
[Attached screenshot: Spellcheck_gehts.png]
Old 01-23-2017, 11:29 AM   #72
KevinH
Yes, those words are not in the OLDSPELL dictionary. As I said, if someone can generate a list of the most commonly used words in German with contractions, I can at least add them to our current German dictionary and to the OLDSPELL one as well.
Old 01-23-2017, 12:37 PM   #73
AnselmD
Quote:
Originally Posted by Doitsu View Post
@AnselmD: BTW, my LanguageTool Sigil plugin will flag straight apostrophes and acute accents. For example, you'll get the following message for the first error in line 11:
This is interesting.
Can I copy the (complete) error message to the clipboard? If there are suggestions, I would like to copy them rather than retype them with a wrong apostrophe ;-)

I think the default workflow is:
1) correct spelling errors
2) test for grammar (with LanguageTool)

And if the spellchecker seems to do mysterious things, use LanguageTool.


Quote:
Originally Posted by Doitsu View Post
I realize that the plugin implementation isn't exactly very user-friendly, but it works and it's relatively easy to write your own rules.
Easy? I am afraid of side effects. If I change one rule, I may disable another rule.

The following one seems to be easy:
This is a false positive (mach’s gut, macht’s gut, mach’s besser; mach es gut, macht es gut, mach es besser): the rule wrongly treats the s as a plural ending (no apostrophe before a plural s), but here the apostrophe only marks an omission:


Code:
        <rulegroup id="PLURAL_APOSTROPH" name="AGB's (AGBs) etc.">
            <antipattern>
                <token regexp="yes">King|Queen</token>
                <token regexp="yes">'|’|`|´|‘</token>
                <token>s</token>
                <token>College</token>
            </antipattern>
            <antipattern>
                <token>Queen</token>
                <token regexp="yes">'|’|`|´|‘</token>
                <token>s</token>
                <token>University</token>
            </antipattern>
            <rule>
                <!-- detected by GIRLS_DAY with better message -->
                <antipattern>
                    <token regexp="yes">Girl|Boy</token>
                    <token regexp="yes">'|’|`|´|‘</token>
                    <token>s</token>
                    <token>Day</token>
                </antipattern>
                <pattern>
                    <or>
                        <token>DVD</token>
                        <token postag_regexp="yes" postag="SUB:.*"><exception>Halt</exception></token>
                    </or>
                    <token regexp="yes">'|’|`|´|‘</token>
                    <token>s</token>
                </pattern>
                <message>Meinten Sie <suggestion>\1\3</suggestion> oder <suggestion>\1</suggestion>? Normalerweise wird im Deutschen vor einem Plural-s kein Apostroph gesetzt</message>
                <short>Normalerweise wird im Deutschen vor einem Plural-s kein Apostroph gesetzt</short>
                <example correction="AGBs|AGB">Es gelten die <marker>AGB`s</marker>.</example>
                <example correction="AGBs|AGB">Es gelten die <marker>AGB’s</marker>.</example>
                <example correction="Hits|Hit">Die Gruppe hatte viele <marker>Hit’s</marker>.</example>
                <!-- vgl. http://web.archive.org/web/20110214064020/http://www.duden.de/deutsche_sprache/sprachberatung/newsletter/archiv.php?id=54 -->
                <example>Der <marker>Girls’</marker> Day findet jährlich statt.</example>
            </rule>
        </rulegroup>
I can add an antipattern:
Code:
            <antipattern>
                <token regexp="yes">Mach|Macht</token>
                <token regexp="yes">'|’|`|´|‘</token>
                <token>s</token>
            </antipattern>
Fine, now the false positive is gone.
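
To reduce the risk of side effects, the token expressions of a new antipattern can be tried out on sample sentences before editing the rule file. The following Python sketch only approximates the matching: the tokenizer is a hypothetical stand-in for LanguageTool's own, and case-insensitive matching is an assumption about its default behaviour. The names `load_token_matchers`, `tokenize`, and `antipattern_matches` are made up for this example:

```python
import re
import xml.etree.ElementTree as ET

ANTIPATTERN_XML = """
<antipattern>
    <token regexp="yes">Mach|Macht</token>
    <token regexp="yes">'|’|`|´|‘</token>
    <token>s</token>
</antipattern>
"""


def load_token_matchers(xml_text):
    """Turn each <token> into a regex that has to match a whole token."""
    matchers = []
    for token in ET.fromstring(xml_text.strip()).findall("token"):
        text = token.text or ""
        pattern = text if token.get("regexp") == "yes" else re.escape(text)
        # Assumption: tokens are compared without regard to case.
        matchers.append(re.compile(pattern, re.IGNORECASE))
    return matchers


def tokenize(text):
    """Hypothetical tokenizer: runs of word characters or single symbols."""
    return re.findall(r"\w+|[^\w\s]", text)


def antipattern_matches(text, matchers):
    """Return True if the token sequence occurs anywhere in the text."""
    tokens = tokenize(text)
    width = len(matchers)
    for start in range(len(tokens) - width + 1):
        window = tokens[start:start + width]
        if all(m.fullmatch(t) for m, t in zip(matchers, window)):
            return True
    return False


if __name__ == "__main__":
    matchers = load_token_matchers(ANTIPATTERN_XML)
    for sample in ("Macht’s gut!", "Mach's besser", "Es gelten die AGB’s."):
        print(sample, antipattern_matches(sample, matchers))
```

In this sketch the antipattern covers the Mach/Macht cases but not "AGB’s", so the plural rule would still apply to the latter. The examples inside the rule itself remain the authoritative check.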
[Attached screenshot: testcase.epub in Sigil, plural rule flagging Macht's]
Old 01-23-2017, 12:41 PM   #74
AnselmD
Quote:
Originally Posted by KevinH View Post
Yes, those words are not in the OLDSPELL dictionary. As I said, if someone can generate a list of the most commonly used words in German with contractions, I can at least add them to our current German dictionary and to the OLDSPELL one as well.
I have a book whose author uses such forms often, so I can deliver a subset.

But you are not the first link in the chain; I am thinking about contacting the contributor of the dictionaries. If you think that is the better option, I can do that.

On the other hand, maybe it is easier to give it to you...

Last edited by AnselmD; 01-23-2017 at 01:26 PM.
Old 01-23-2017, 01:05 PM   #75
Doitsu
Quote:
Originally Posted by AnselmD View Post
Can I copy the (complete) error message to the clipboard? If there are suggestions, I would like to copy them rather than retype them with a wrong apostrophe ;-)
You can copy the message from the Validation window:

1. Click the message once. (The text should be displayed white on blue).
2. Press CTRL+C.
3. Select a text editor and press CTRL+V to paste the message.

(You can't select individual words in the Validation window. This is a Qt limitation.)