Old 07-25-2013, 10:34 PM   #1
phossler
Wizard
Posts: 1,090
Karma: 447222
Join Date: Jan 2009
Location: Valley Forge, PA, USA
Device: Kindle Paperwhite
Removing Soft hyphens

https://www.mobileread.com/forums/showthread.php?t=77992

I can see the bytes C2 AD (decimal 194 and 173) with my hex editor, but as others have pointed out, they're invisible in Sigil.
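For reference, those two bytes are the UTF-8 encoding of a single character, U+00AD SOFT HYPHEN. A minimal Python sketch, run outside Sigil purely to illustrate the byte values, shows the relationship:

```python
import unicodedata

SOFT_HYPHEN = "\u00ad"  # code point 173 in decimal, AD in hex

# Encode the single character as UTF-8 and inspect the resulting bytes.
encoded = SOFT_HYPHEN.encode("utf-8")

print(unicodedata.name(SOFT_HYPHEN))            # official Unicode name
print(" ".join(f"{b:02X}" for b in encoded))    # bytes as a hex editor shows them
print(list(encoded))                            # the same bytes in decimal
```

So the hex editor shows two bytes, but a search inside the editor has to target one character.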

I tried copying and pasting the characters, but nothing worked.

The post above is old and 7.2 is out, so maybe things have changed?

Is there any way to strip them out of Sigil? RegEx maybe?

Paul
Old 07-25-2013, 11:11 PM   #2
theducks
Well trained by Cats
Posts: 31,240
Karma: 61360164
Join Date: Aug 2009
Location: The Central Coast of California
Device: Kobo Libra2,Kobo Aura2v1, K4NT(Fixed: New Bat.), Galaxy Tab A
Quote:
Originally Posted by phossler
https://www.mobileread.com/forums/showthread.php?t=77992

I can see the bytes C2 AD (decimal 194 and 173) with my hex editor, but as others have pointed out, they're invisible in Sigil.

I tried copying and pasting the characters, but nothing worked.

The post above is old and 7.2 is out, so maybe things have changed?

Is there any way to strip them out of Sigil? RegEx maybe?

Paul
When you pasted, did you try escaping the character?
Old 07-25-2013, 11:36 PM   #3
phossler
Wizard
Posts: 1,090
Karma: 447222
Join Date: Jan 2009
Location: Valley Forge, PA, USA
Device: Kindle Paperwhite
Quote:
When you pasted, did you try escaping the character?
No. I just did and it works.

That's one for my saved searches.

Still, it would be nice (IMHO) if Sigil had a 'Reveal Hidden Codes' View option that S&R would work in. Poking around in hex, it looks like there's some more stuff to investigate.


Paul

Last edited by phossler; 07-25-2013 at 11:39 PM.
Old 07-26-2013, 02:04 AM   #4
Doitsu
Grand Sorcerer
Posts: 5,762
Karma: 24088559
Join Date: Dec 2010
Device: Kindle PW2
BTW, the Calibre Hyphenate This! plug-in can automatically remove all soft hyphens.
Old 07-26-2013, 07:07 AM   #5
mrmikel
Color me gone
Posts: 2,089
Karma: 1445295
Join Date: Apr 2008
Location: Central Oregon Coast
Device: PRS-300
Quote:
Originally Posted by phossler

Still, it would be nice (IMHO) if Sigil had a 'Reveal Hidden Codes' View option that S&R would work in. Poking around in hex, it looks like there's some more stuff to investigate.

Agreed. But it may not be available in WebKit. However, perhaps they could build a search function for these types of characters, which would allow us to do something with them. Hidden spaces that throw things out of line have been my issue, though mostly with imported HTML.
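As a rough illustration of what such a search function might look like, here is a minimal Python sketch that runs on text outside Sigil. The function name `find_hidden_chars` is hypothetical, and treating Unicode categories Cf and Zs as "hidden" is an assumption made for this example, not something Sigil does:

```python
import unicodedata

def find_hidden_chars(text):
    """Return (index, 'U+XXXX', name) for each format or unusual space character."""
    hits = []
    for index, char in enumerate(text):
        category = unicodedata.category(char)
        # Cf = format characters (soft hyphen, zero width space, joiners);
        # Zs = space separators, of which only the plain space is expected.
        if category == "Cf" or (category == "Zs" and char != " "):
            name = unicodedata.name(char, "UNNAMED")
            hits.append((index, f"U+{ord(char):04X}", name))
    return hits

# Sample text with a soft hyphen inside a word and a thin space between words.
sample = "walk\u00aded slowly\u2009home"
for hit in find_hidden_chars(sample):
    print(hit)
```

Reporting the code point alongside the position gives you the hex value needed to build a search for that character.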
Old 07-26-2013, 01:05 PM   #6
phossler
Wizard
Posts: 1,090
Karma: 447222
Join Date: Jan 2009
Location: Valley Forge, PA, USA
Device: Kindle Paperwhite
@Doitsu -- thanks, but a lot of the time Calibre adds a lot of CSS that I don't want. I will follow up. Maybe run the HTML into Calibre, convert to EPUB, and then use the plug-in?

@mrmikel -- The biggest problem is that F&R seems to want the character as text. This means that I have to locate and identify the troublesome character, use CharMap to copy it, paste it into a Sigil Find (remembering to escape it -- thanks, theducks), etc.

I knew it was 173, so I did try typing \ and then Alt+numpad 0173 in the Find field, but that didn't work. CharMap works, though, if I know what I'm looking for.

Now that I have it as a saved search it will be easier. I hope that as I find more things like this, I can just keep adding them to my 'Delete Bad Char' saved search.

I couldn't figure out why spell check had 100+ occurrences of just 'ed' and 'ing' flagged in things like 'walked' and 'walking'.
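One plausible explanation, assuming the spell checker splits text into words at any character that is not a letter or digit (Sigil's actual word splitting is not described in this thread): a soft hyphen inside 'walked' would cut it into 'walk' and 'ed'. A small Python sketch of that effect:

```python
import re

word = "walk\u00aded"  # "walked" with a soft hyphen before the suffix

# \w+ matches runs of letters, digits, and underscores;
# the soft hyphen is none of these, so it acts as a word boundary.
fragments = re.findall(r"\w+", word)
print(fragments)

# With the soft hyphen removed, the word is seen whole again.
cleaned = word.replace("\u00ad", "")
print(re.findall(r"\w+", cleaned))
```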


Paul
Attached thumbnail: Capture.JPG (69.8 KB)
Old 07-26-2013, 01:47 PM   #7
Toxaris
Wizard
Posts: 4,520
Karma: 121692313
Join Date: Oct 2009
Location: Heemskerk, NL
Device: PRS-T1, Kobo Touch, Kobo Aura
If you know a location where it is in Sigil, you can actually select and copy it. Click on the character next to it and press Shift+arrow in the direction you want to select. If you have the right character, the cursor will not move even though you pressed the arrow. Copy it and paste it into the S&R window.
Old 07-26-2013, 02:01 PM   #8
DiapDealer
Grand Sorcerer
Posts: 28,850
Karma: 207000000
Join Date: Jan 2010
Device: Nexus 7, Kindle Fire HD
Just use the \x{FFFF} method provided by PCRE regex to search for Unicode code points. Replace FFFF with the hexadecimal representation of the Unicode code point you want. In this case, 00ad (or just ad).

Using regex, search for \x{00AD} (or \x{ad}) and replace with nothing to remove soft hyphens.
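For anyone cleaning files outside Sigil, the same search-and-replace can be sketched in Python. Note that Python's `re` module does not accept the PCRE `\x{...}` form, so this example writes the code point as `\u00ad` instead; the function name `strip_soft_hyphens` is made up for illustration:

```python
import re

# Python's re module has no \x{...} form; "\u00ad" names the same code point.
SOFT_HYPHEN_RE = re.compile("\u00ad")

def strip_soft_hyphens(text):
    """Replace every soft hyphen with nothing, as in the Sigil search above."""
    return SOFT_HYPHEN_RE.sub("", text)

print(strip_soft_hyphens("hy\u00adphen\u00adation"))
```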

Last edited by DiapDealer; 07-26-2013 at 02:04 PM.
Old 07-26-2013, 03:52 PM   #9
phossler
Wizard
Posts: 1,090
Karma: 447222
Join Date: Jan 2009
Location: Valley Forge, PA, USA
Device: Kindle Paperwhite
@Toxaris and DiapDealer -- thanks !!

Both very useful tips

The \x{00AD} is MUCH easier to see.

So if I wanted to follow this theme, could I include even more characters in my saved search?
[\x{00AD}\x{2000}-\x{200D}]

where 2000 is En Quad and 200D is Zero Width Joiner (whatever that is)

That would include the Thin, Hair, and Zero Width spaces that I think mrmikel mentioned.
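To see exactly what that character class covers, here is a hedged Python sketch that lists each code point by its Unicode name. It uses Python's `\u` escapes rather than the PCRE `\x{...}` form, since Python's `re` module does not accept the latter:

```python
import re
import unicodedata

# Python spelling of [\x{00AD}\x{2000}-\x{200D}]
PATTERN = re.compile("[\u00ad\u2000-\u200d]")

# List everything the class covers, by code point and Unicode name.
covered = [chr(cp) for cp in [0x00AD, *range(0x2000, 0x200E)]]
for char in covered:
    print(f"U+{ord(char):04X}  {unicodedata.name(char)}")

# Deleting a space that has visible width joins the words on either side of it.
print(PATTERN.sub("", "thin\u2009space"))
```

One caution: U+2000 through U+200A are spaces with visible width, so replacing them with nothing can run words together. Replacing those with an ordinary space, and deleting only the soft hyphen and the zero-width characters, may be the safer saved search.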

Paul