|  07-25-2013, 10:34 PM | #1 | 
| Wizard            Posts: 1,090 Karma: 447222 Join Date: Jan 2009 Location: Valley Forge, PA, USA Device: Kindle Paperwhite | 
				
				Removing Soft hyphens
			 
			
			https://www.mobileread.com/forums/showthread.php?t=77992 I can see the C2 AD (194 and 173) with my hex editor, but as others have pointed out, they're invisible in Sigil. I tried copying and pasting the characters, but nothing worked. The post above is old and 7.2 is out, so maybe things have changed? Is there any way to strip them out in Sigil? RegEx maybe? Paul | 
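For anyone matching the hex editor view to the character itself: the two bytes C2 AD are the UTF-8 encoding of the soft hyphen, U+00AD (decimal 173). A minimal Python sketch to confirm this, assuming the file is saved as UTF-8 (the variable names are only illustrative):

```python
# The soft hyphen is Unicode code point U+00AD (decimal 173).
soft_hyphen = "\u00ad"

# Encoding it as UTF-8 yields the two bytes seen in the hex editor.
encoded = soft_hyphen.encode("utf-8")

print(encoded.hex(" ").upper())  # hex view of the bytes
print(list(encoded))             # the same bytes as decimal values
print(ord(soft_hyphen))          # the code point as a decimal number
```

So a hex editor shows two bytes, while a search inside an editor needs to target the single character U+00AD.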
|   |   | 
|  07-25-2013, 11:11 PM | #2 | |
| Well trained by Cats            Posts: 31,241 Karma: 61360164 Join Date: Aug 2009 Location: The Central Coast of California Device: Kobo Libra2,Kobo Aura2v1, K4NT(Fixed: New Bat.), Galaxy Tab A | Quote: 
 | |
|   |   | 
|  07-25-2013, 11:36 PM | #3 | |
| Wizard            Posts: 1,090 Karma: 447222 Join Date: Jan 2009 Location: Valley Forge, PA, USA Device: Kindle Paperwhite | Quote: 
    That's one for my saved searches. Still, it would be nice (IMHO) if Sigil had a 'Reveal Hidden Codes' View option that S&R would work in. Poking around in hex, it looks like there's some more stuff to investigate. Paul Last edited by phossler; 07-25-2013 at 11:39 PM. | |
|   |   | 
|  07-26-2013, 02:04 AM | #4 | 
| Grand Sorcerer            Posts: 5,763 Karma: 24088559 Join Date: Dec 2010 Device: Kindle PW2 | 
			
			BTW, the Calibre Hyphenate This! plug-in can automatically remove all soft hyphens.
		 | 
|   |   | 
|  07-26-2013, 07:07 AM | #5 | 
| Color me gone            Posts: 2,089 Karma: 1445295 Join Date: Apr 2008 Location: Central Oregon Coast Device: PRS-300 | 
			
			Agreed.  But it may not be available in webkit.  However, perhaps they could build a search function for these types of characters which would allow us to do something with them.  Hidden spaces which throw things out of line have been my issue, though mostly with imported html.
		 | 
|   |   | 
|  07-26-2013, 01:05 PM | #6 | 
| Wizard            Posts: 1,090 Karma: 447222 Join Date: Jan 2009 Location: Valley Forge, PA, USA Device: Kindle Paperwhite | 
			
			@Doitsu -- thanks, but a lot of the time Calibre will add a lot of CSS that I don't want. I will follow up. Maybe run the html into Calibre, convert to epub, and then use the plug-in? @mrmike -- The biggest problem is that F&R seems to want the character as text. This means that I have to locate and identify the troublesome character, use CharMap to copy it, paste it into a Sigil Find (remembering to escape it -- thanks 'theducks'), etc. I knew it was 173, so I did try the \ and then alt+numpad 0173 in the Find, but that didn't work. But CharMap works if I know what I'm looking for. Now that I have it as a saved search it will be easier. I hope that as I find more things like this, I can just keep adding them to my 'Delete Bad Char' saved search. I couldn't figure out why spell check had 100+ occurrences of just 'ed' and 'ing' flagged in things like 'walked' and 'walking'. Paul | 
|   |   | 
|  07-26-2013, 01:47 PM | #7 | 
| Wizard            Posts: 4,520 Karma: 121692313 Join Date: Oct 2009 Location: Heemskerk, NL Device: PRS-T1, Kobo Touch, Kobo Aura | 
			
			If you know a location where it is in Sigil, you can actually select and copy it. Click on the character next to it and press shift+arrow in the direction you want to select. If you have the right character, the cursor will not move although you pressed the arrow. Copy and paste in the S&R window.
		 | 
|   |   | 
|  07-26-2013, 02:01 PM | #8 | 
| Grand Sorcerer            Posts: 28,874 Karma: 207000000 Join Date: Jan 2010 Device: Nexus 7, Kindle Fire HD | 
			
			Just use the \x{FFFF} method provided by PCRE regex to search for Unicode code points. Replace FFFF with the hexadecimal representation of the Unicode code point you want. In this case, 00AD (or just AD). Using regex, search for \x{00AD} (or \x{ad}) and replace with nothing to remove soft hyphens. Last edited by DiapDealer; 07-26-2013 at 02:04 PM. | 
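The same search-and-replace can be sketched outside Sigil, for example to batch-clean HTML before import. This is only an illustration in Python, not Sigil's own behavior. Note the syntax difference: Python's `re` module does not accept the PCRE form `\x{00AD}`, so the pattern uses `\u00ad` instead. The function name `strip_soft_hyphens` and the sample text are made up for the example.

```python
import re

# Python's re module spells the code point as \u00ad;
# the PCRE form \x{00AD} used in Sigil is not valid here.
SOFT_HYPHEN = re.compile(r"\u00ad")

def strip_soft_hyphens(text: str) -> str:
    """Return text with every soft hyphen (U+00AD) removed."""
    return SOFT_HYPHEN.sub("", text)

# Hypothetical sample: soft hyphens hidden inside two words.
sample = "walk\u00aded and walk\u00ading"
cleaned = strip_soft_hyphens(sample)
print(cleaned)
```

The sample also suggests why a spell checker might flag fragments such as 'ed' and 'ing': with an invisible character in the middle, each word can be treated as two pieces.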
|   |   | 
|  07-26-2013, 03:52 PM | #9 | 
| Wizard            Posts: 1,090 Karma: 447222 Join Date: Jan 2009 Location: Valley Forge, PA, USA Device: Kindle Paperwhite | 
			
			@Toxaris and DiapDealer -- thanks!! Both very useful tips. The \x{00AD} is MUCH easier to see. So if I wanted to follow this theme, could I include even more characters in my saved search? [\x{00AD}\x{2000}-\x{200D}] where 2000 is En Quad and 200D is Zero Width Joiner (whatever that is). That would include the Thin, Hair, and Zero Width spaces that I think mrmike mentioned. Paul | 
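Before saving a character class like this, it may help to list exactly what the range covers. The sketch below is Python rather than Sigil's PCRE (so `\u2000` stands in for `\x{2000}`), and the names `BAD_CHARS` and `sample` are invented for the example. It prints the Unicode name of each code point in the class and then applies the class to a made-up string.

```python
import re
import unicodedata

# Python equivalent of the PCRE class [\x{00AD}\x{2000}-\x{200D}].
BAD_CHARS = re.compile(r"[\u00ad\u2000-\u200d]")

# List every code point the class matches, with its Unicode name.
for cp in [0x00AD, *range(0x2000, 0x200E)]:
    print(f"U+{cp:04X} {unicodedata.name(chr(cp))}")

# Hypothetical sample: a soft hyphen, a thin space, and a zero width space.
sample = "walk\u00aded\u2009slow\u200bly"
result = BAD_CHARS.sub("", sample)
print(result)
```

One thing the listing makes visible: U+2000 through U+200A are spaces that have width (En Quad through Hair Space), so replacing them with nothing can run two words together, as happens in the sample. Replacing those with an ordinary space, and deleting only U+00AD and the zero-width characters U+200B to U+200D, might be the safer split.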
|   |   | 