01-21-2017, 08:02 PM | #46 |
Sigil Developer
Posts: 7,647
Karma: 5433388
Join Date: Nov 2009
Device: many
|
The manual is old. The hyphenation dictionaries have not been removed from the install (yet), but nothing in Sigil uses them. In fact, they were improperly being loaded into the hunspell spellchecker, but proper hyphenation dictionaries have hyphenation rule characters, including digits, embedded in them, so their entries are not valid words in and of themselves.
The only reason I haven't removed them yet is that I have considered adding a hyphenation library to Sigil, but I am unconvinced it is needed. I will fix that when the documentation github site opens. That said, the best way to handle dictionary installation is with a plugin to extract it, parse the .xcu xml to get the file names and copy the files. |
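A minimal sketch of the "parse the .xcu xml to get the file names" step, not an actual Sigil plugin. It assumes the usual layout of a LibreOffice/OpenOffice dictionaries.xcu, where each dictionary is a node with "Format" and "Locations" properties and each location is prefixed with %origin%/. The function name and the sample data are made up for illustration. Since an .oxt extension is a zip archive, the extraction and copy steps could be done with Python's zipfile module, which is left out here.

```python
import xml.etree.ElementTree as ET

# Namespace of the oor:name attributes in .xcu files.
OOR = "http://openoffice.org/2001/registry"

# Made-up sample in the shape of a typical dictionaries.xcu.
SAMPLE_XCU = """<oor:component-data xmlns:oor="http://openoffice.org/2001/registry"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    oor:name="Linguistic" oor:package="org.openoffice.Office">
  <node oor:name="ServiceManager">
    <node oor:name="Dictionaries">
      <node oor:name="HunSpellDic_de_DE" oor:op="fuse">
        <prop oor:name="Locations" oor:type="oor:string-list">
          <value>%origin%/de_DE.aff %origin%/de_DE.dic</value>
        </prop>
        <prop oor:name="Format" oor:type="xs:string">
          <value>DICT_SPELL</value>
        </prop>
      </node>
      <node oor:name="HyphDic_de_DE" oor:op="fuse">
        <prop oor:name="Locations" oor:type="oor:string-list">
          <value>%origin%/hyph_de_DE.dic</value>
        </prop>
        <prop oor:name="Format" oor:type="xs:string">
          <value>DICT_HYPH</value>
        </prop>
      </node>
    </node>
  </node>
</oor:component-data>"""


def spell_dictionary_files(xcu_text):
    """Return the file names listed for spelling (DICT_SPELL) dictionaries."""
    root = ET.fromstring(xcu_text)
    files = []
    for node in root.iter("node"):
        # Collect the direct <prop> children of this node by their oor:name.
        props = {}
        for prop in node.findall("prop"):
            name = prop.get("{%s}name" % OOR)
            props[name] = (prop.findtext("value") or "").strip()
        # Skip container nodes and hyphenation or thesaurus entries.
        if props.get("Format") != "DICT_SPELL":
            continue
        # Locations is a whitespace-separated list of %origin%/ paths.
        for location in props.get("Locations", "").split():
            files.append(location.replace("%origin%/", ""))
    return files


if __name__ == "__main__":
    print(spell_dictionary_files(SAMPLE_XCU))
```

Filtering on DICT_SPELL means the hyphenation files are never copied. The whitespace split would mishandle file names that contain spaces.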
01-22-2017, 04:41 AM | #47 | |
Grand Sorcerer
Posts: 5,584
Karma: 22735033
Join Date: Dec 2010
Device: Kindle PW2
|
Quote:
The best and easiest solution for end-users is to add words to a custom word list. IMHO, such a single-use utility would be overkill. The fact that there's no standalone GUI editor for OpenOffice/LibreOffice Hunspell dictionaries also seems to indicate that the majority of end-users are quite happy with the default dictionaries, even though some of them are actually somewhat buggy, as KevinH found out. While we're on the topic, there is one relatively safe AFF file hack for getting better suggestions for OCRed text, but I definitely wouldn't recommend any other changes to Hunspell dictionaries. |
|
01-22-2017, 09:32 AM | #48 |
Zealot
Posts: 105
Karma: 10
Join Date: Oct 2013
Device: none
|
If the user dictionary uses the .aff file from the chosen dictionary, it is not language independent.
Shouldn't there be different ones for German, English, etc., e.g. mydic_de, mydic_en?
iconv and oconv: these convert between the curly and straight apostrophes. Is this necessary for UTF-8, or is it old stuff from ISO8859-1? |
01-22-2017, 10:02 AM | #49 |
Sigil Developer
Posts: 7,647
Karma: 5433388
Join Date: Nov 2009
Device: many
|
needed for all dictionaries that do not want to duplicate contractions and possessives. The only way it can be used is if the characters exist in the current encoding.
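Assuming the paragraph above is about the iconv/oconv lines asked about in the previous post, here is a rough sketch of the idea: the wordlist stores contractions with one apostrophe form only, and the input is converted before lookup. This is only an illustration and not how Hunspell implements ICONV internally. The table, the wordlist and the function names are hypothetical.

```python
# Hypothetical input conversion table, in the spirit of an ICONV entry
# that maps the curly apostrophe (U+2019) to the straight one.
ICONV_TABLE = {"\u2019": "'"}

# Toy wordlist that stores contractions with the straight apostrophe only,
# so each contraction does not have to be listed twice.
WORDLIST = {"don't", "it's"}


def apply_iconv(word, table=ICONV_TABLE):
    """Replace every source sequence in the table with its target."""
    for source, target in table.items():
        word = word.replace(source, target)
    return word


def is_known(word):
    """Look the word up after input conversion."""
    return apply_iconv(word) in WORDLIST


if __name__ == "__main__":
    for candidate in ("don't", "don\u2019t", "dont"):
        print(candidate, is_known(candidate))
```

Both characters have to exist in the encoding of the dictionary for such a table to be written at all, which matches the point about the current encoding.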
As for a dictionary .dic editor, it is simply not easy to do, because most languages need prefix and suffix compression to make the working set size viable. Yes, you can create multiple user wordlists, and they should as a general rule match the main dictionary language being used. One good exception is to include foreign words commonly used in another language. For example, my user wordlists include some Latin terms and abbreviations, some French terms, etc. I also have a scientific word list that has a number of Latin terms as well. |
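To illustrate why suffix compression makes a .dic editor hard, here is a toy sketch of how one compressed entry such as work/DGS stands for several surface forms. The rules below are invented for the example (strip, add, condition on the stem ending) and are far simpler than the SFX rules in a real .aff file.

```python
import re

# Toy suffix rules keyed by flag. Each rule is
# (characters to strip, characters to add, regex the stem must match).
SUFFIX_RULES = {
    "S": [("", "s", r"[^sxy]$")],
    "D": [("", "ed", r"[^ey]$"), ("", "d", r"e$")],
    "G": [("", "ing", r"[^e]$"), ("e", "ing", r"e$")],
}


def expand(entry):
    """Expand a 'stem/FLAGS' entry into the stem plus its suffixed forms."""
    stem, _, flags = entry.partition("/")
    forms = [stem]
    for flag in flags:
        for strip, add, condition in SUFFIX_RULES.get(flag, []):
            if re.search(condition, stem):
                base = stem[: len(stem) - len(strip)] if strip else stem
                forms.append(base + add)
    return forms


if __name__ == "__main__":
    print(expand("work/DGS"))
    print(expand("bake/DG"))
```

An editor would have to run this in reverse: given a new word, decide which stem and which flags produce it without also producing invalid forms.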
01-22-2017, 11:56 AM | #50 |
Zealot
Posts: 105
Karma: 10
Join Date: Oct 2013
Device: none
|
Is the hunspell algorithm designed to deal with more than one dictionary?
With the command line tool it is possible to select several dictionaries (-d parameter), but I did not test whether this really works: hunspell(1) - Linux man page https://linux.die.net/man/1/hunspell |
01-22-2017, 12:12 PM | #51 | |
Sigil Developer
Posts: 7,647
Karma: 5433388
Join Date: Nov 2009
Device: many
|
FWIW, the modern German dictionary would mark "geht's" as incorrect since it is not in the wordlist as far as I can tell.
Technically the apostrophe (single quotes) is needed and correct in the following line, is it not? Quote:
|
|
01-22-2017, 12:19 PM | #52 |
Sigil Developer
Posts: 7,647
Karma: 5433388
Join Date: Nov 2009
Device: many
|
The hunspell commandline tool may support it but it is very different from how the hunspell library is used inside Sigil. Right now Sigil supports one main hunspell dictionary (you can select it and change it anytime you want) and multiple user based wordlists.
Calibre supports multiple language dictionaries open at once and smartly uses xhtml lang attributes to know what language to check each word in. There is also varlog's mlspell Sigil branch that adds that to Sigil but it has not been accepted/merged yet due to issues on how to do spellchecking on the fly during live editing with highlighting in multiple languages when only that line of context is provided and not the entire document. KevinH |
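A small sketch of the lang-attribute idea described above: walk the XHTML tree, inherit xml:lang (or lang) from the parent, and pair each word with the language it should be checked in. This is not Calibre's or mlspell's code. The function name, the word pattern and the "und" default are assumptions made for the example.

```python
import re
import xml.etree.ElementTree as ET

# ElementTree exposes xml:lang under the fixed XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Letters, optionally joined by a straight or curly apostrophe (geht's).
WORD_RE = re.compile(r"[^\W\d_]+(?:['\u2019][^\W\d_]+)*")

SAMPLE = '<p xml:lang="en">She said <span xml:lang="de">Wie geht\'s</span> today.</p>'


def words_with_language(element, inherited="und"):
    """Yield (language, word) pairs, inheriting the language from ancestors."""
    lang = element.get(XML_LANG) or element.get("lang") or inherited
    for word in WORD_RE.findall(element.text or ""):
        yield lang, word
    for child in element:
        yield from words_with_language(child, lang)
        # Text after a child element belongs to the parent's language.
        for word in WORD_RE.findall(child.tail or ""):
            yield lang, word


if __name__ == "__main__":
    for lang, word in words_with_language(ET.fromstring(SAMPLE)):
        print(lang, word)
```

The sketch needs the whole element tree to resolve inherited languages. That is exactly what is missing during live highlighting, where only one line of context is available.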
01-22-2017, 12:24 PM | #53 | |
Zealot
Posts: 105
Karma: 10
Join Date: Oct 2013
Device: none
|
Quote:
So, the Duden (an important dictionary of the German language) (Duden - Wikipedia https://en.wikipedia.org/wiki/Duden) says: Duden | Apostroph http://www.duden.de/sprachwissen/rec...geln/apostroph "Wie gehts (auch: geht's) dir?" ("auch" means "also"). This means both are correct! |
|
01-22-2017, 12:31 PM | #54 |
Zealot
Posts: 105
Karma: 10
Join Date: Oct 2013
Device: none
|
I took a look into the old Duden (old spelling):
Geht's gut? The apostrophe is a must. |
01-22-2017, 12:49 PM | #55 |
Sigil Developer
Posts: 7,647
Karma: 5433388
Join Date: Nov 2009
Device: many
|
So for the modern German dictionary the apostrophe is not needed for "gehts" vs "geht's" which is why it is left out of the dictionary, but for old German, you really must use "geht's" instead of "gehts".
So our current German dictionary is okay in that regard. Thanks, KevinH |
01-22-2017, 12:59 PM | #56 |
Zealot
Posts: 105
Karma: 10
Join Date: Oct 2013
Device: none
|
Apostroph – Wikipedia
https://de.wikipedia.org/wiki/Apostr...assungszeichen
If you put it into Google Translate, it is (more or less) understandable:
===
Omission mark
One function of the apostrophe is to mark omitted letters, predominantly when writing down spoken language, especially in words that would otherwise be hard to read or misleading:
Heute ist’s kalt. – Heute ist es kalt. (It's cold today. - Today it is cold.)
Hast du noch ’nen Euro? also: Hast du noch nen Euro? – Hast Du noch einen Euro?
Das ist so’ne Sache. also: Das ist sone Sache. – Das ist so eine Sache.
Was für ’n Blödsinn!/Kommen S’ nur herein! – Was für ein Blödsinn. Kommen Sie nur herein.
For omissions inside a word:
D’dorf for Düsseldorf
Lu’hafen for Ludwigshafen
M’gladbach for Mönchengladbach
Ku’damm for Kurfürstendamm
E’ler for Eschweiler
A’dam for Amsterdam
however: GMHütte for Georgsmarienhütte
Occasionally the apostrophe is also used, contrary to the rules, in contractions of preposition + definite article, for example in’s, an’s, um’s, zu’r. According to the current rules, however, an apostrophe may only be used if the contraction would be "opaque" without it (for example mit’m Fahrrad). [18] Also contrary to the rules is the apostrophe where the e of the ending -en is dropped for reasons of verse or sentence rhythm in the 1st and 3rd person plural of the present indicative active and of the subjunctive I.
=====
Auslassungszeichen
Eine Funktion des Apostrophs ist die Kennzeichnung ausgelassener Buchstaben; vorwiegend in der Verschriftlichung gesprochener Sprache, vor allem bei Wörtern, die sonst schwer lesbar oder missverständlich wären:
Heute ist’s kalt. – Heute ist es kalt.
Hast du noch ’nen Euro? auch: Hast du noch nen Euro? – Hast Du noch einen Euro?
Das ist so’ne Sache. auch: Das ist sone Sache. – Das ist so eine Sache.
Was für ’n Blödsinn!/Kommen S’ nur herein! – Was für ein Blödsinn. Kommen Sie nur herein.
Bei Auslassungen im Wortinnern:
D’dorf für Düsseldorf
Lu’hafen für Ludwigshafen
M’gladbach für Mönchengladbach
Ku’damm für Kurfürstendamm
E’ler für Eschweiler
A’dam für Amsterdam
jedoch: GMHütte für Georgsmarienhütte
Gelegentlich wird der Apostroph regelwidrig auch bei der Zusammensetzung Präposition + bestimmter Artikel benutzt, beispielsweise in’s, an’s, um’s, zu’r. Nach den gültigen Regeln darf ein Apostroph aber nur gesetzt werden, wenn die Zusammensetzung ohne Apostroph „undurchsichtig“ wäre (beispielsweise mit’m Fahrrad).[18] Ebenfalls regelwidrig ist der Apostroph beim vers- und satzrhythmischen Wegfall des e der Endung -en in der 1. und 3. Person Plural Indikativ des Präsens Aktiv sowie des Konjunktivs I. |
01-22-2017, 01:19 PM | #57 | |
Zealot
Posts: 105
Karma: 10
Join Date: Oct 2013
Device: none
|
Quote:
And even with an old spelling dictionary it is misspelled. And as you can see, in this German learning course for beginners they say:
Karin: Hallo Eva! Wie geht’s
Deutsch üben - Einstieg - Hallo, wie geht es dir? - Goethe-Institut http://www.goethe.de/lrn/prj/wnd/deu...wg/deindex.htm
I do not say it should be solved in Sigil, because I think it does not work in any program that uses Hunspell. Otherwise someone has to contact the maintainers of the dictionary; I think they are mentioned in the .aff file. |
|
01-22-2017, 02:10 PM | #58 |
Zealot
Posts: 105
Karma: 10
Join Date: Oct 2013
Device: none
|
For your interest, I took a quick look at some of these German ebooks:
ePub Books - MobileRead Forums https://www.mobileread.com/forums/fo...play.php?f=130 "This work is assumed to be in the Life+70 public domain OR the copyright holder has given specific permission for distribution." So most of them use the old spelling, and they use apostrophes. |
01-22-2017, 06:38 PM | #59 | |
null operator (he/him)
Posts: 20,575
Karma: 26954694
Join Date: Mar 2012
Location: Sydney Australia
Device: none
|
Quote:
"Add <word from list> ?" N (word gets written to discard list)
"Add <next word from list> ?" Y
"A series of questions to create the affix entries"
Back when pragmatics trumped perfection, PROFS/DISSOS (or something similar) had a dictionary creator along these lines. Algol springs to mind, so it might have been on MCP - salad days.
Not the whole enchilada, but a practical subset. BR |
|
01-23-2017, 05:23 AM | #60 | |
Zealot
Posts: 105
Karma: 10
Join Date: Oct 2013
Device: none
|
Quote:
hunspell -d de_DE_OLDSPELL /cygdrive/c/books/ApostropheTest.txt
(see also -H: "The input file is in SGML/HTML format.")
hunspell(1) - Linux man page https://linux.die.net/man/1/hunspell |
|
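For anyone who wants to script that test, here is a small sketch that only builds the command line. It uses the -d, -l (list misspelled words) and -H options from the hunspell man page linked above. The helper name is made up, and whether several comma-separated dictionaries really work together was not tested in this thread.

```python
def hunspell_command(path, dictionaries, html=False):
    """Build an argument list for the hunspell command line tool."""
    # -l lists the misspelled words, -d takes comma-separated base names.
    command = ["hunspell", "-l", "-d", ",".join(dictionaries)]
    if html:
        # -H tells hunspell that the input file is in SGML/HTML format.
        command.append("-H")
    command.append(path)
    return command


if __name__ == "__main__":
    print(" ".join(hunspell_command("ApostropheTest.txt", ["de_DE_OLDSPELL"])))
```

The resulting list can be passed to subprocess.run on a machine where hunspell and the named dictionaries are installed.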
Tags |
bug report, feature request, punctuation, sigil, unicode |
|