Old 11-24-2020, 10:56 AM   #1
Leonatus
Wizard
Posts: 1,023
Karma: 10963125
Join Date: Mar 2013
Location: Guben, Brandenburg, Germany
Device: Kobo Clara 2E, Tolino Shine 3
Find whole words (and not only syllables)

Sigil performs hyphenation in the editor, that's a pretty feature.
But it seems to affect the "Find & Replace" functionality, as recently I can only look for syllables, not for whole words.
For example: If I search for "Ratte", the result is: "expression not found", but if I enter: "Rat", it will find me the syllable, which is a little inconvenient.

Is there a setting for this?
Old 11-24-2020, 11:31 AM   #2
Tex2002ans
Wizard
Posts: 2,297
Karma: 12126329
Join Date: Jul 2012
Device: Kobo Forma, Nook
Quote:
Originally Posted by Leonatus View Post
Sigil performs hyphenation in the editor, that's a pretty feature.
???

Are you sure you don't have Soft Hyphens hiding throughout your text?

Soft Hyphens are an invisible character that only turns into a hyphen when it reaches the end of a line.

A telltale sign of Soft Hyphens is when you get red squigglies on words that are spelled correctly... and/or when your search gets broken.
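To illustrate why a search for the whole word fails while the syllable is still found, here is a minimal Python sketch. The sample sentence is made up for illustration; it only assumes that the book's text contains a soft hyphen (U+00AD) inside "Ratte".

```python
# The soft hyphen is an invisible character, U+00AD (decimal 173).
SOFT_HYPHEN = "\u00ad"

# Hypothetical sample text: "Ratte" with a soft hyphen after "Rat".
text = "Die Rat" + SOFT_HYPHEN + "te lief davon."

print("Ratte" in text)                           # False: the hidden character splits the word
print("Rat" in text)                             # True: the first syllable is intact
print("Ratte" in text.replace(SOFT_HYPHEN, ""))  # True once the soft hyphen is removed
```

On screen the two strings look the same, which is why the failed search is so confusing.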

See some of my posts on this (explaining why Soft Hyphens are awful + problems that may occur):

Quote:
Originally Posted by Leonatus View Post
For example: If I search for "Ratte", the result is: "expression not found", but if I enter: "Rat", it will find me the syllable, which is a little inconvenient.
Did you accidentally run Calibre's "Hyphenate This!" plugin?

What you want to do is Find/Replace for the Soft Hyphen character, and remove them all.

One easy way to do this is to go into Sigil:

1. Tools > Reports > Characters in HTML Files

If you scroll through the list, you might see:

Code:
Character: <----- (It looks like a hyphen, but it's actually an invisible character.)
Decimal: 173
Hexadecimal: AD
Entity Name: shy
Entity Description: soft hyphen
Double click on that row, and Sigil should insert a:

\ + Soft Hyphen

into the Search box.

2. Make sure the Replace: box is completely blank.

3. Change Mode: to "Regex".

4. Press Count All to see if there are any hits.

5. Press Replace All.

That should wipe all Soft Hyphens out of your book. Now you should have no problem with your normal searches.
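For anyone who prefers to do the same cleanup outside Sigil, the steps above can be sketched in Python. This is only a sketch, not what Sigil does internally: the function name is made up, and treating the `&shy;` / `&#173;` / `&#xAD;` entity spellings as equivalent to the raw character is my assumption about how the soft hyphens might appear in the markup.

```python
import re

# Matches the raw soft hyphen character (U+00AD) as well as the common
# entity spellings. Covering the entities is an assumption on my part.
SOFT_HYPHEN_RE = re.compile("\u00ad|&shy;|&#0*173;|&#[xX]0*[aA][dD];")


def strip_soft_hyphens(markup: str) -> tuple[str, int]:
    """Return the cleaned markup and the number of soft hyphens removed."""
    # re.subn returns (new_string, number_of_substitutions), which mirrors
    # "Count All" followed by "Replace All" with an empty replacement.
    return SOFT_HYPHEN_RE.subn("", markup)


cleaned, removed = strip_soft_hyphens("<p>Rat\u00adte und Dampf&shy;schiff</p>")
print(cleaned)  # <p>Ratte und Dampfschiff</p>
print(removed)  # 2
```

Work on a copy of the files, since the replacement cannot be undone once saved.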

Last edited by Tex2002ans; 11-24-2020 at 11:41 AM.
Old 11-24-2020, 11:46 AM   #3
DiapDealer
Grand Sorcerer
Posts: 27,546
Karma: 193191846
Join Date: Jan 2010
Device: Nexus 7, Kindle Fire HD
Soft hyphens would be my guess. Hate those things.
Old 11-24-2020, 11:59 AM   #4
KevinH
Sigil Developer
Posts: 7,636
Karma: 5433388
Join Date: Nov 2009
Device: many
Spellchecking should now handle soft hyphens without barfing. Search and replace will not unless you use regex to deal with them. Another way to just see the soft hyphens is to add the soft hyphen entity (named or numeric as appropriate to your epub version) to Sigil's PreserveEntities setting.

That said, I urge you to remove the soft hyphens for general work. You can add them back after the book is polished and in near final form using calibre if you really want them.
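If the soft hyphens have to stay in the text for now, "using regex to deal with them" could look roughly like the sketch below: allow an optional soft hyphen between every pair of letters of the search word. It is written in Python, and the helper name is made up. Sigil's regex flavour spells the character as \x{00AD}, so the pattern would need that spelling there.

```python
import re


def tolerant_pattern(word: str) -> re.Pattern:
    """Build a regex that matches `word` with or without embedded soft hyphens."""
    optional_shy = "\u00ad?"  # zero or one soft hyphen between letters
    return re.compile(optional_shy.join(re.escape(ch) for ch in word))


pattern = tolerant_pattern("Ratte")
print(bool(pattern.search("Die Rat\u00adte lief davon.")))  # True
print(bool(pattern.search("Die Ratte lief davon.")))        # True
print(bool(pattern.search("Die Rate lief davon.")))         # False
```

This gets unwieldy quickly for longer words, which is one more argument for removing the soft hyphens while editing.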
Old 11-24-2020, 01:12 PM   #5
Doitsu
Grand Sorcerer
Posts: 5,583
Karma: 22735033
Join Date: Dec 2010
Device: Kindle PW2
Quote:
Originally Posted by Leonatus View Post
Sigil performs hyphenation in the editor, that's a pretty feature.
As others have already pointed out, your source text most likely contained soft hyphens, which you can search for with the following regular expression:

Code:
\x{00AD}
Quote:
Originally Posted by KevinH View Post
Spellchecking should now handle soft hyphens without barfing.
Based on a quick test, Sigil spellcheck works fine with words that contain discretionary hyphens. (It works with Sigil 1.3.x and higher; it does not work with Sigil 0.9.x.)

Last edited by Doitsu; 11-24-2020 at 01:21 PM.
Old 11-24-2020, 01:43 PM   #6
Leonatus
Wizard
Posts: 1,023
Karma: 10963125
Join Date: Mar 2013
Location: Guben, Brandenburg, Germany
Device: Kobo Clara 2E, Tolino Shine 3
Yes, you are all completely right! Thank you for your help!
I wouldn't have thought that the epub contained soft hyphens, for I had built the epub myself, and, of course, without hyphens. But as the book has been lying in my file system for a considerable time, it might well be that once upon a time I re-saved it from Calibre's file location (with soft hyphens). I might just have forgotten.
Anyway: @DiapDealer: I understand why anglophone users "hate these things". But the German language is different: Imagine a word like "Dampfschifffahrtsgesellschaft" - my fingernails are warping at writing this - without hyphenation on an e-book reader! That's too ugly by far. Thus, I value the Hyphenate This! plugin very much, as its hyphenation results for the German language are about 85 % correct.
But your hints for detecting soft hyphens in Sigil will be really valuable to me in the future, as this issue is not so rare.
Thank you again!
Old 11-24-2020, 01:52 PM   #7
DiapDealer
Grand Sorcerer
Posts: 27,546
Karma: 193191846
Join Date: Jan 2010
Device: Nexus 7, Kindle Fire HD
Quote:
Originally Posted by Leonatus View Post
Yes, you are all completely right! Thank you for your help!
I wouldn't have thought that the epub contained soft hyphens, for I had built the epub myself, and, of course, without hyphens. But as the book has been lying in my file system for a considerable time, it might well be that once upon a time I re-saved it from Calibre's file location (with soft hyphens). I might just have forgotten.
Anyway: @DiapDealer: I understand why anglophone users "hate these things". But the German language is different: Imagine a word like "Dampfschifffahrtsgesellschaft" - my fingernails are warping at writing this - without hyphenation on an e-book reader! That's too ugly by far. Thus, I value the Hyphenate This! plugin very much, as its hyphenation results for the German language are about 85 % correct.
But your hints for detecting soft hyphens in Sigil will be really valuable to me in the future, as this issue is not so rare.
Thank you again!
Oh, don't get me wrong. I know there's a valid use for them. But way too many English speaking folk choose to litter their text with them in a hackish attempt to simulate hyphenation in rendering engines that don't natively support it. THAT'S what I hate. The pollution of markup with invisible hyphens in every single word over one syllable.

People should buy readers that natively support hyphenation if they read content that would suffer without it (and it matters greatly to them). Never been a big fan of content providers deciding for readers what should be important to them.

Last edited by DiapDealer; 11-24-2020 at 02:08 PM.
Old 11-24-2020, 02:41 PM   #8
Leonatus
Wizard
Posts: 1,023
Karma: 10963125
Join Date: Mar 2013
Location: Guben, Brandenburg, Germany
Device: Kobo Clara 2E, Tolino Shine 3
Quote:
Originally Posted by DiapDealer View Post
People should buy readers that natively support hyphenation if they read content that would suffer without it (and it matters greatly to them). Never been a big fan of content providers deciding for readers what should be important to them.
I own a Kobo, and in fact, Kobo has built-in hyphenation, but for reasons unknown to me, its hyphenation for the German language is rather awful: it is incorrect in perhaps 30 % of cases. This really matters for "good" reading.
Old 11-24-2020, 03:11 PM   #9
Tex2002ans
Wizard
Posts: 2,297
Karma: 12126329
Join Date: Jul 2012
Device: Kobo Forma, Nook
Quote:
Originally Posted by Leonatus View Post
Yes, you are all completely right! Thank you for your help!


Quote:
Originally Posted by Leonatus View Post
But your hints for detecting soft hyphens in Sigil will be really valuable to me in the future, as this issue is not so rare.
If a book ever pops up with "Find and Replace isn't working", my mind instantly jumps to soft hyphens, and that's usually the problem 100% of the time!

Side Note: Another potential "weird character" issue is substituting Latin characters with Cyrillic ones:

C (Latin)
С (Cyrillic letter)

It's mostly used in phishing attacks:

https://krebsonsecurity.com/2018/03/...ual-confusion/

and unscrupulous people who try to sell you dirt cheap "writing" (on sites like Fiverr) by copying already written works and swapping characters that visually look similar... trying to get around "plagiarism checks".

Again, red squigglies, "broken search", and/or Sigil's Character Reports would give it away.
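As a rough stand-in for what a character report tells you, here is a small Python sketch that lists every non-ASCII character in a piece of text with its code point and Unicode name. It is not Sigil's implementation; the function name and the sample string are made up for illustration.

```python
import unicodedata
from collections import Counter


def character_report(text: str) -> list[tuple[str, int, str, int]]:
    """List (character, code point, Unicode name, count) for non-ASCII characters."""
    counts = Counter(ch for ch in text if ord(ch) > 127)
    return [
        (ch, ord(ch), unicodedata.name(ch, "UNKNOWN"), n)
        for ch, n in sorted(counts.items())
    ]


# Hypothetical sample: a Cyrillic "С" posing as a Latin "C", plus one soft hyphen.
sample = "\u0421at and Rat\u00adte"
for ch, code, name, n in character_report(sample):
    print(f"U+{code:04X}  {name}  x{n}")
# U+00AD  SOFT HYPHEN  x1
# U+0421  CYRILLIC CAPITAL LETTER ES  x1
```

In an English or German book, any line mentioning CYRILLIC or SOFT HYPHEN is worth a closer look.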

Quote:
Originally Posted by DiapDealer View Post
Soft hyphens would be my guess. Hate those things.
Me too. Awful, awful things!

Quote:
Originally Posted by DiapDealer View Post
But way too many English speaking folk choose to litter their text with them in a hackish attempt to simulate hyphenation in rendering engines that don't natively support it. THAT'S what I hate. The pollution of markup with invisible hyphens in every single word over one syllable.

People should buy readers that natively support hyphenation if they read content that would suffer without it (and it matters greatly to them).


And with devices like Kobo, you can insert your own hyphenation dictionary if needed, and then poof, you get properly hyphenated words without all the downsides!

Quote:
Originally Posted by Leonatus View Post
I own a Kobo, and in fact, Kobo has built-in hyphenation, but for reasons unknown to me, its hyphenation for the German language is rather awful: it is incorrect in perhaps 30 % of cases. This really matters for "good" reading.
You may want to check out JSWolf's "Better Hyphenation" thread.

He recently included Kobo hyphenation dictionaries for the German (DE) language.

I believe some of the default languages use extremely high left/right numbers (sometimes as high as 5), which means words might not even get hyphenated unless 10+ characters long!

Hyphenation Note: Different languages require different Left/Right minimums for proper typography (a trusted list can be found at Hyphenation.org):

2/3 (English)
2/2 (German)
2/2 (Spanish)
1/2 (Armenian)

Depending on the language, they'll use 1-3.

But 5??? Preposterous. Don't know what Kobo was thinking with those.
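The effect of the Left/Right minimums can be worked through with a short Python sketch. It only computes which break positions the minimums permit. A real hyphenation dictionary narrows these down further using its patterns, and I'm not claiming this is how Kobo implements it; the function name is made up.

```python
def allowed_breaks(word: str, left_min: int, right_min: int) -> list[int]:
    """Positions i where a break leaves at least left_min characters before
    the hyphen and at least right_min characters after it."""
    return list(range(left_min, len(word) - right_min + 1))


print(allowed_breaks("Ratte", 2, 2))        # [2, 3]
print(allowed_breaks("Dampfboot", 5, 5))    # []  (9 letters: too short for 5/5)
print(allowed_breaks("Dampfschiff", 5, 5))  # [5, 6]
```

With a 5/5 setting a word needs at least 10 characters before even one break position is permitted, which matches the "10+ characters" observation above.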

Quote:
Originally Posted by DiapDealer View Post
Never been a big fan of content providers deciding for readers what should be important to them.
Exactly. Plus, as I've stated in those topics before, soft hyphens cause so much collateral damage across the board.

Breaking highlighting and dictionary support being two of the biggest that have bothered me lately:

I believe on my Kobo Forma (?), when dragging the highlight, the cursor "gets stuck" on soft hyphens, so dragging stutters in the middle of a word, not following my finger as expected.

And on many Android readers, when you highlight a soft-hyphenated word and try to dictionary lookup, it'll tell you "word is not found".

Note: I forget exact details, and I haven't experienced this in a few years... because I make sure to purge all soft hyphens from all ebooks I load up.

But the horrifying memories are still burned into my brain...

Last edited by Tex2002ans; 11-24-2020 at 03:39 PM.
Old 11-25-2020, 11:05 AM   #10
Leonatus
Wizard
Posts: 1,023
Karma: 10963125
Join Date: Mar 2013
Location: Guben, Brandenburg, Germany
Device: Kobo Clara 2E, Tolino Shine 3
Quote:
Originally Posted by Tex2002ans View Post
You may want to check out JSWolf's "Better Hyphenation" thread.

He recently included Kobo hyphenation dictionaries for the German (DE) language.
JSWolf added the German hyphenation dictionary at my request.

But, besides, is it possible to edit
a) Kobo's hyphenation dictionary,
b) JSWolf's hyphenation dictionary, for example,
and how can I do it? With Notepad++?

Last edited by Leonatus; 11-25-2020 at 11:07 AM.
Old 11-25-2020, 12:37 PM   #11
Tex2002ans
Wizard
Posts: 2,297
Karma: 12126329
Join Date: Jul 2012
Device: Kobo Forma, Nook
Quote:
Originally Posted by Leonatus View Post
JSWolf added the German hyphenation dictionary at my request.


Quote:
Originally Posted by Leonatus View Post
But, besides, is it possible to edit
a) Kobo's hyphenation dictionary,
b) JSWolf's hyphenation dictionary, for example,
and how can I do it? With Notepad++?
Ask JSWolf. He knows all the details (especially since he generates them!), or instructions might already be mentioned in his topic.

I believe Kobo uses a slightly different hyphenation format (OpenOffice/LibreOffice?) than normal patterns (TeX), plus you have to do some minor tweaks to get it to work on Kobo.

I don't know details though.

Last edited by Tex2002ans; 11-25-2020 at 12:40 PM.
Old 11-25-2020, 01:43 PM   #12
Leonatus
Wizard
Posts: 1,023
Karma: 10963125
Join Date: Mar 2013
Location: Guben, Brandenburg, Germany
Device: Kobo Clara 2E, Tolino Shine 3
Ok. Thank you! I'll see.
Old 11-25-2020, 05:00 PM   #13
JSWolf
Resident Curmudgeon
Posts: 73,897
Karma: 128597114
Join Date: Nov 2006
Location: Roslindale, Massachusetts
Device: Kobo Libra 2, Kobo Aura H2O, PRS-650, PRS-T1, nook STR, PW3
These hyphenation dictionaries are from OpenOffice/LibreOffice. And yes, they have been edited, but only slightly, to add the left/right hyphenation instructions.
Old 11-26-2020, 06:17 AM   #14
Leonatus
Wizard
Posts: 1,023
Karma: 10963125
Join Date: Mar 2013
Location: Guben, Brandenburg, Germany
Device: Kobo Clara 2E, Tolino Shine 3
That was quick! Thank you!