#1
Wizard
Posts: 1,050
Karma: 11391181
Join Date: Mar 2013
Location: Guben, Brandenburg, Germany
Device: Kobo Clara 2E, Tolino Shine 3

Find whole words (and not only syllables)
Sigil performs hyphenation in the editor, which is a nice feature.
But it seems to affect the "Find & Replace" functionality: recently I can only search for syllables, not for whole words. For example, if I search for "Ratte", the result is "expression not found", but if I enter "Rat", it finds the syllable, which is a little inconvenient. Is there a setting for this?

#2
Wizard
Posts: 2,306
Karma: 13057279
Join Date: Jul 2012
Device: Kobo Forma, Nook

Quote:
Are you sure you don't have Soft Hyphens hiding throughout your text?

A Soft Hyphen is an invisible character that only turns into a visible hyphen when it falls at the end of a line.

A telltale sign of Soft Hyphens is red squigglies on words that are spelled correctly... and/or a broken search.

See some of my posts on this (explaining why Soft Hyphens are awful + problems that may occur):
Quote:

What you want to do is Find/Replace for the Soft Hyphen character and remove them all. One easy way to do this in Sigil:

1. Go to Tools > Reports > Characters in HTML Files. If you scroll through the list, you might see:
Code:
Character: <----- (It looks like a hyphen, but it's actually an invisible character.)
Decimal: 173
Hexadecimal: AD
Entity Name: shy
Entity Description: soft hyphen
Then get the Soft Hyphen into the Search box.
2. Make sure the Replace: box is completely blank.
3. Change Mode: to "Regex".
4. Press Count All to see if there are any hits.
5. Press Replace All.

That should wipe all Soft Hyphens out of your book. Now you should have no problem with your normal searches.
Last edited by Tex2002ans; 11-24-2020 at 11:41 AM.
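For anyone who prefers to check files outside Sigil, the Count All / Replace All steps above can be sketched in Python. This is only an illustration, not part of Sigil: the function names are made up, and it assumes the soft hyphen may appear either as the literal character U+00AD or as one of the entities `&shy;`, `&#173;` or `&#xAD;`.

```python
import re

# Literal soft hyphen (U+00AD) plus its common entity spellings in markup.
# Assumption: these are the only forms present in the files being checked.
SOFT_HYPHEN_RE = re.compile("\u00ad|&shy;|&#0*173;|&#[xX]0*[aA][dD];")


def count_soft_hyphens(text):
    """Rough equivalent of pressing Count All."""
    return len(SOFT_HYPHEN_RE.findall(text))


def strip_soft_hyphens(text):
    """Rough equivalent of Replace All with an empty Replace box."""
    return SOFT_HYPHEN_RE.sub("", text)


sample = "Die Rat\u00adte und die Dampf&shy;schiff&#173;fahrt"
print(count_soft_hyphens(sample))            # 3
print("Ratte" in sample)                     # False: the search is "broken"
print("Ratte" in strip_soft_hyphens(sample)) # True once the soft hyphens are gone
```

As with Replace All in Sigil, it is safer to run something like this on a copy of the book.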

#3
Grand Sorcerer
Posts: 28,358
Karma: 203720150
Join Date: Jan 2010
Device: Nexus 7, Kindle Fire HD

Soft hyphens would be my guess. Hate those things.

#4
Sigil Developer
Posts: 8,478
Karma: 5703586
Join Date: Nov 2009
Device: many

Spellchecking should now handle soft hyphens without barfing. Search and replace will not unless you use regex to deal with them. Another way to just see the soft hyphens is to add the soft hyphen entity (named or numeric as appropriate to your epub version) to Sigil's PreserveEntities setting.
That said, I urge you to remove the soft hyphens for general work. You can add them back after the book is polished and in near-final form using calibre if you really want them.
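To illustrate what "use regex to deal with them" can mean, here is a minimal Python sketch of a search that tolerates soft hyphens instead of removing them. It is not Sigil code; `tolerant_pattern` is a hypothetical helper that allows an optional U+00AD between every pair of letters in the search word. In Sigil's Regex mode the same idea would presumably be written with `\x{00AD}?` between the letters.

```python
import re

SOFT_HYPHEN = "\u00ad"


def tolerant_pattern(word):
    """Build a regex that matches `word` even if soft hyphens sit between its letters."""
    letters = [re.escape(ch) for ch in word]
    # An optional soft hyphen is allowed between each pair of adjacent letters.
    return re.compile((SOFT_HYPHEN + "?").join(letters))


text = "Die Rat\u00adte lief davon."
print("Ratte" in text)                                # False: a plain search misses it
print(bool(tolerant_pattern("Ratte").search(text)))   # True: the regex search finds it
```

This keeps the hyphenation in place, at the cost of a much uglier search expression, which is one reason removing the soft hyphens during editing is the simpler route.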

#5
Grand Sorcerer
Posts: 5,680
Karma: 23983815
Join Date: Dec 2010
Device: Kindle PW2

Quote:
Code:
\x{00AD}
Last edited by Doitsu; 11-24-2020 at 01:21 PM.

#6
Wizard
Posts: 1,050
Karma: 11391181
Join Date: Mar 2013
Location: Guben, Brandenburg, Germany
Device: Kobo Clara 2E, Tolino Shine 3

Yes, you are all completely right! Thank you for your help!
I wouldn't have thought that the epub contained soft hyphens, for I had built it myself and, of course, without hyphens. But as the book has been lying in my file system for a considerable time, it might well be that at some point I re-saved it from Calibre's file location (with soft hyphens). I might just have forgotten.

Anyway, @DiapDealer: I understand why anglophone users "hate these things". But the German language is different: imagine a word like "Dampfschifffahrtsgesellschaft" (my fingernails curl just writing it) without hyphenation on an e-book reader! That's far too ugly. Thus I value the HyphenateThis! plugin very much, as its hyphenation results for German are about 85% correct. And your hints for detecting soft hyphens in Sigil will be really valuable to me in the future, as this issue occurs fairly often. Thank you again!

#7
Grand Sorcerer
Posts: 28,358
Karma: 203720150
Join Date: Jan 2010
Device: Nexus 7, Kindle Fire HD

Quote:
People should buy readers that natively support hyphenation if they read content that would suffer without it (and it matters greatly to them). Never been a big fan of content providers deciding for readers what should be important to them.
Last edited by DiapDealer; 11-24-2020 at 02:08 PM.

#8
Wizard
Posts: 1,050
Karma: 11391181
Join Date: Mar 2013
Location: Guben, Brandenburg, Germany
Device: Kobo Clara 2E, Tolino Shine 3

I own a Kobo, and in fact Kobo has built-in hyphenation, but for reasons I don't know, its hyphenation for German is rather awful: it is incorrect in perhaps 30% of cases. This really matters for "good" reading.

#9
Wizard
Posts: 2,306
Karma: 13057279
Join Date: Jul 2012
Device: Kobo Forma, Nook

Quote:
Side Note: Another potential "weird character" issue is the substitution of Latin characters with Cyrillic ones:
C (Latin)
С (Cyrillic letter)
It's mostly used in phishing attacks:
https://krebsonsecurity.com/2018/03/...ual-confusion/
and by unscrupulous people who try to sell you dirt-cheap "writing" (on sites like Fiverr) by copying already written works and swapping characters that look visually similar... trying to get around "plagiarism checks".
Again, red squigglies, "broken search", and/or Sigil's Character Reports would give it away.

Me too. Awful, awful things!
Quote:
And with devices like Kobo, you can insert your own hyphenation dictionary if needed, and then poof, you get properly hyphenated words without all the downsides!
Quote:
He recently included Kobo hyphenation dictionaries for the German (DE) language.
I believe some of the default languages use extremely high left/right numbers (sometimes as high as 5), which means words might not even get hyphenated unless they are 10+ characters long!

Hyphenation Note: Different languages require different Left/Right minimums for proper typography (a trusted list can be found at Hyphenation.org):
2/3 (English)
2/2 (German)
2/2 (Spanish)
1/2 (Armenian)
Depending on the language, they'll use 1-3. But 5??? Preposterous. Don't know what Kobo was thinking with those.
Quote:
Breaking highlighting and dictionary support are two of the biggest that have bothered me lately:
I believe on my Kobo Forma (?), when dragging the highlight, the cursor "gets stuck" on soft hyphens, so dragging stutters in the middle of a word instead of following my finger as expected.
And on many Android readers, when you highlight a soft-hyphenated word and try a dictionary lookup, it'll tell you "word is not found".
Note: I forget the exact details, and I haven't experienced this in a few years... because I make sure to purge all soft hyphens from all ebooks I load up. But the horrifying memories are still burned into my brain...
Last edited by Tex2002ans; 11-24-2020 at 03:39 PM.
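The Latin/Cyrillic look-alike problem from the side note in this post can also be checked with a few lines of Python. This is a rough sketch rather than a real plagiarism or phishing detector: `mixed_script_words` is a hypothetical helper that flags any word containing both Latin and Cyrillic letters, relying on the official Unicode character names.

```python
import unicodedata


def mixed_script_words(text):
    """Return the words in `text` that mix Latin and Cyrillic letters."""
    suspicious = []
    for word in text.split():
        scripts = set()
        for ch in word:
            if not ch.isalpha():
                continue
            # Unicode names start with the script, e.g. "LATIN CAPITAL LETTER C"
            # or "CYRILLIC CAPITAL LETTER ES".
            name = unicodedata.name(ch, "")
            if name.startswith("LATIN"):
                scripts.add("LATIN")
            elif name.startswith("CYRILLIC"):
                scripts.add("CYRILLIC")
        if len(scripts) > 1:
            suspicious.append(word)
    return suspicious


# "\u0421" is the Cyrillic capital letter Es, which looks like a Latin C.
print(mixed_script_words("\u0421at and Cat look the same"))  # ['\u0421at'], shown as 'Сat'
```

A word written entirely in Cyrillic is not flagged, so genuinely Russian text would pass; only the mixed words that break search and spellcheck are reported.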
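The left/right minimums discussed in post #9 can be made concrete with a little arithmetic. The sketch below is an illustration only: `allowed_break_positions` is a made-up helper that lists where a break is permitted by the minimums alone, ignoring the hyphenation patterns that decide where a word may actually be split.

```python
def allowed_break_positions(word, left_min, right_min):
    """Positions i where word[:i] has at least left_min letters
    and word[i:] has at least right_min letters."""
    return list(range(left_min, len(word) - right_min + 1))


print(allowed_break_positions("Ratte", 2, 2))       # [2, 3]  -> Ra|tte, Rat|te
print(allowed_break_positions("Hyphenate", 5, 5))   # []      -> 9 letters: no break possible
print(allowed_break_positions("hyphenated", 5, 5))  # [5]     -> only one candidate position
```

With minimums of 5/5, a word needs at least 10 letters before even a single break position exists, which matches the "10+ characters" remark above.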

#10
Wizard
Posts: 1,050
Karma: 11391181
Join Date: Mar 2013
Location: Guben, Brandenburg, Germany
Device: Kobo Clara 2E, Tolino Shine 3

Quote:
But besides that, is it possible to edit a) Kobo's hyphenation dictionary or b) JSWolf's hyphenation dictionary, for example, and how would I do it? With Notepad++?
Last edited by Leonatus; 11-25-2020 at 11:07 AM.

#11
Wizard
Posts: 2,306
Karma: 13057279
Join Date: Jul 2012
Device: Kobo Forma, Nook

Quote:
Quote:
I believe Kobo uses a slightly different hyphenation format (OpenOffice/LibreOffice?) than the normal patterns (TeX), plus you have to make some minor tweaks to get it to work on Kobo. I don't know the details though.
Last edited by Tex2002ans; 11-25-2020 at 12:40 PM.

#12
Wizard
Posts: 1,050
Karma: 11391181
Join Date: Mar 2013
Location: Guben, Brandenburg, Germany
Device: Kobo Clara 2E, Tolino Shine 3

Ok. Thank you! I'll see.

#13
Resident Curmudgeon
Posts: 79,145
Karma: 144284184
Join Date: Nov 2006
Location: Roslindale, Massachusetts
Device: Kobo Libra 2, Kobo Aura H2O, PRS-650, PRS-T1, nook STR, PW3

These hyphenation dictionaries are from OpenOffice/LibreOffice. And yes they have been edited but only slightly to add in the left/right hyphenation instructions.
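As a rough idea of what such "left/right hyphenation instructions" can look like: OpenOffice/LibreOffice hyphenation dictionaries are plain text files, and the libhyphen format they use supports `LEFTHYPHENMIN` and `RIGHTHYPHENMIN` lines near the top of the file. Whether the edited Kobo dictionaries use exactly these lines is an assumption here, and the sample content below is invented for illustration. A minimal Python sketch that reads the two values:

```python
def read_hyphen_minimums(dic_text):
    """Collect LEFTHYPHENMIN / RIGHTHYPHENMIN values from dictionary text.

    Assumption: the file follows the libhyphen layout, with each directive
    on its own line as "NAME <number>".
    """
    minimums = {}
    for line in dic_text.splitlines():
        parts = line.split()
        if (len(parts) == 2
                and parts[0] in ("LEFTHYPHENMIN", "RIGHTHYPHENMIN")
                and parts[1].isdigit()):
            minimums[parts[0]] = int(parts[1])
    return minimums


# Invented sample: an encoding line, the two directives, then one dummy pattern.
sample = "UTF-8\nLEFTHYPHENMIN 2\nRIGHTHYPHENMIN 2\n.ab1a\n"
print(read_hyphen_minimums(sample))  # {'LEFTHYPHENMIN': 2, 'RIGHTHYPHENMIN': 2}
```

Since these are ordinary text files, a plain-text editor such as Notepad++ should be able to open them, provided the encoding named on the first line is kept.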

#14
Wizard
Posts: 1,050
Karma: 11391181
Join Date: Mar 2013
Location: Guben, Brandenburg, Germany
Device: Kobo Clara 2E, Tolino Shine 3

That was quick! Thank you!