12-12-2018, 11:06 AM | #1 |
Wizard
Posts: 3,305
Karma: 10259306
Join Date: May 2016
Device: kobo forma, Kobo Libra, Huawei media Tab, fire HD10, PW3 HDX8.9,
|
strange invisible? character - how to find it?
I recently posted about an issue with the calibre content server not giving me the most recent tweak of a specific book.
What led to this is that in this book, after any scene break, I am seeing a strange artifact in Moon reader: the letters OBJ surrounded by a square of dashes forming a border. I see nothing at all in the calibre or Sigil editors to account for this, but when I view the book in Adobe Digital Editions, I see an X inside a square as the strange character. Messing with the scene breaks was a try at removing stuff that might cause it, but no joy.
So I have an "invisible" character at those points in the epub. Now how can I edit it out? It is invisible in Sigil, invisible in the calibre editor, visible in ADE, visible in Notepad++, and visible in Moon reader. NB: if I paste it into MS Word, I see the same artifact as in Moon reader, i.e. OBJ inside a border.
In the calibre editor I see e.g. <p class="x03-co-body-text">The ... with nothing untoward about that class, but when I paste a snippet into Notepad++ I see the "invisible" character as a thick horizontal dash with dotted lines above and below. The strange character is after the > and before the T in the above example.
So there is definitely something there, but how can I edit it away if I can't see it in the editor? I could probably write some regex to work on the 1st character after that specific x03 class, as that's always where it happens, but I am curious as to what it is and why it is there. |
12-12-2018, 11:40 AM | #2 |
Grand Sorcerer
Posts: 27,549
Karma: 193191846
Join Date: Jan 2010
Device: Nexus 7, Kindle Fire HD
|
I was under the impression that calibre's editor had some sort of visual indication that invisible characters were present. A yellow color or something?
It's probably a soft hyphen or a zero-width-joiner, or the like. |
12-12-2018, 11:51 AM | #3 | |
Wizard
Posts: 3,305
Karma: 10259306
Join Date: May 2016
Device: kobo forma, Kobo Libra, Huawei media Tab, fire HD10, PW3 HDX8.9,
|
Quote:
Note also, this is the first character of the 1st sentence after each scene break - not somewhere you'd put a hyphen of any kind. The actual structure, before I ripped it all out, was an <HR class = "transition" > adding a thin horizontal line, then a scene break ornament picture thingy in <img> tags, then onto the next scene.
I can live with it as is, but I obstinately still want to identify it, then zap it via regex, if possible.
In Sigil, I think find <p class="x03-co-body-text">[^A-Z] replace <p class="x03-co-body-text"> will zap it, but then I will never know what it was. Last edited by stumped; 12-12-2018 at 11:57 AM. |
|
12-12-2018, 11:53 AM | #4 |
creator of calibre
Posts: 43,858
Karma: 22666666
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
simply move your cursor around it, you will see it in the bottom right corner of the editor screen as you do
|
12-12-2018, 01:44 PM | #5 |
Well trained by Cats
Posts: 29,800
Karma: 54830978
Join Date: Aug 2009
Location: The Central Coast of California
Device: Kobo Libra2,Kobo Aura2v1, K4NT(Fixed: New Bat.), Galaxy Tab A
|
|
12-12-2018, 02:00 PM | #6 |
Wizard
Posts: 3,305
Karma: 10259306
Join Date: May 2016
Device: kobo forma, Kobo Libra, Huawei media Tab, fire HD10, PW3 HDX8.9,
|
Thanks. Library pc is off for the night so I will have another look in the editor in the morning.
Meanwhile, idly googling the issue from my tablet, I wonder if it is some format control character used in Word and somehow left in. Something that might be desirable for starting a new paragraph after a scene break.
I found a website that said "insert your invisible character here", and I got it out of Moon reader and onto the clipboard, but with no visible result. Another link suggested it might be a 144 control code, as they can appear as an X in a box.
Staying with the calibre editor, as a learning exercise: let's say I do spot it in the bottom right corner. Can I then get it into a regex find command in the editor, to do a find all and replace with nothing? Last edited by stumped; 12-12-2018 at 02:02 PM. |
12-12-2018, 02:22 PM | #7 |
Grand Sorcerer
Posts: 27,549
Karma: 193191846
Join Date: Jan 2010
Device: Nexus 7, Kindle Fire HD
|
I knew it showed the char name next to the cursor somewhere, but I could have sworn there was some sort of yellow(?) background that was a visual indicator of where an invisible character was lurking. But maybe that was just talked about RE someone's wishlist of features, or something. Who knows?
|
12-12-2018, 02:26 PM | #8 |
Wizard
Posts: 3,305
Karma: 10259306
Join Date: May 2016
Device: kobo forma, Kobo Libra, Huawei media Tab, fire HD10, PW3 HDX8.9,
|
In Sigil the only clue was that I got no regex matches when I ran find on the opening style tag followed by a capital letter, which is what was visible. So something between the close bracket of the tag and the 1st visible letter was affecting the regex find.
|
12-12-2018, 03:39 PM | #9 |
Grand Sorcerer
Posts: 27,549
Karma: 193191846
Join Date: Jan 2010
Device: Nexus 7, Kindle Fire HD
|
Yes. Without the actual character present to copy, you'd need to use some form of hex character code with regex to find it. In Sigil, searching for \x{200D} would find a zero-width joiner character (hex code 0x200D).
\x{00AD} would find soft hyphens
\x{202F} - narrow non-breaking space
\x{FEFF} - zero-width non-breaking space
Can't remember if calibre's regex engine uses the same notation or not. Could be something like \u{00AD} or \u00AD |
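Outside the editors, the same hunt by code point can be sketched in Python. This is only a rough illustration, assuming the chapter text has already been read into a string; the SUSPECTS set and the find_suspects helper are made-up names for this sketch, and the set holds just the four code points listed in this post, so it is not exhaustive.

```python
import unicodedata

# Code points named in the post above; illustrative, not an exhaustive list.
SUSPECTS = {0x00AD, 0x200D, 0x202F, 0xFEFF}

def find_suspects(text):
    """Return (offset, 'U+XXXX', Unicode name) for each suspect character."""
    hits = []
    for offset, ch in enumerate(text):
        if ord(ch) in SUSPECTS:
            name = unicodedata.name(ch, "UNKNOWN")
            hits.append((offset, f"U+{ord(ch):04X}", name))
    return hits

# Hypothetical sample text with a soft hyphen and a zero-width joiner hidden in it.
sample = "co\u00adoperate\u200d now"
for hit in find_suspects(sample):
    print(hit)
```

A character that is not in the set is silently skipped, so a scan like this only confirms or rules out the guesses made so far.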
12-12-2018, 04:01 PM | #10 | |
null operator (he/him)
Posts: 20,568
Karma: 26954694
Join Date: Mar 2012
Location: Sydney Australia
Device: none
|
Quote:
Characters such as nbsp, em-dash etc are highlighted with the "Special Character" colour - which is 'yellowish' by default. The Tools->Reports->Characters tool can be useful in situations like this. BR Last edited by BetterRed; 12-12-2018 at 04:07 PM. |
|
12-12-2018, 06:29 PM | #11 |
Grand Sorcerer
Posts: 27,549
Karma: 193191846
Join Date: Jan 2010
Device: Nexus 7, Kindle Fire HD
|
That was it! Thanks.
It makes sense that the feature doesn't extend to special invisible characters now that I think about it. Pretty hard to highlight something you can't see. |
12-12-2018, 07:52 PM | #12 |
null operator (he/him)
Posts: 20,568
Karma: 26954694
Join Date: Mar 2012
Location: Sydney Australia
Device: none
|
Especially when the something you can't see gets trumped by the something you can see ==>> https://www.mobileread.com/forums/sh...d.php?t=313380
BR |
12-13-2018, 01:58 AM | #13 |
Wizard
Posts: 3,305
Karma: 10259306
Join Date: May 2016
Device: kobo forma, Kobo Libra, Huawei media Tab, fire HD10, PW3 HDX8.9,
|
ok - after moving the cursor back and forth I got the invisible character named in the editor.
I had to put the cursor on the 1st visible letter, then back-arrow, to get it to show at the bottom right. The description then says Object Replacement Character.
So I am none the wiser as to how/why it got to be there, but I reckon I can zap it with my previously planned regex, using not A-Z to track it down. I probably need not A-Z or " to cover all start-of-sentence possibilities.
Here is a wiki entry for the object replacement character: https://en.wiktionary.org/wiki/%EF%BF%BC
(computing) The object replacement character, sometimes used to represent an embedded object in a document when it is converted to plain text. |
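The cursor trick can also be approximated with a script that names whatever character directly follows the paragraph's opening tag. This is a hedged sketch, not how the calibre editor does it: first_char_after_tag is a hypothetical helper, the snippet is invented sample markup using the class name from this thread, and the regex assumes the tag is written exactly as <p class="...">.

```python
import re
import unicodedata

def first_char_after_tag(html, css_class):
    """Name the first character following each <p class="css_class"> tag."""
    pattern = re.compile(
        r'<p class="' + re.escape(css_class) + r'">(.)', re.DOTALL
    )
    return [
        (f"U+{ord(m.group(1)):04X}", unicodedata.name(m.group(1), "UNKNOWN"))
        for m in pattern.finditer(html)
    ]

# Invented sample: an object replacement character hides before the "T".
snippet = '<p class="x03-co-body-text">\ufffcThe rest of the sentence.</p>'
print(first_char_after_tag(snippet, "x03-co-body-text"))
```

For the sample above the helper reports U+FFFC, OBJECT REPLACEMENT CHARACTER, which matches the name the editor showed.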
12-13-2018, 06:17 AM | #14 |
Grand Sorcerer
Posts: 27,549
Karma: 193191846
Join Date: Jan 2010
Device: Nexus 7, Kindle Fire HD
|
Find: \uFFFC
Replace with: "" |
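For comparison, here is a small Python sketch of the two approaches discussed in this thread: deleting U+FFFC directly, and the "not A-Z" workaround. The function names and sample strings are made up for illustration, and the second function only mimics the Sigil find/replace quoted earlier; it is not Sigil's own engine.

```python
import re

ORC = "\ufffc"  # U+FFFC OBJECT REPLACEMENT CHARACTER

def strip_orc(html):
    """Delete every U+FFFC and leave everything else untouched."""
    return html.replace(ORC, "")

def strip_non_capital(html, css_class="x03-co-body-text"):
    """Mimic the workaround: drop one non-A-Z character right after the tag."""
    tag = f'<p class="{css_class}">'
    # A function is used as the replacement so the tag text is inserted literally.
    return re.sub(re.escape(tag) + r"[^A-Z]", lambda m: tag, html)

broken = '<p class="x03-co-body-text">\ufffcThe scene opens.</p>'
dialogue = '<p class="x03-co-body-text">"Hello," she said.</p>'

print(strip_orc(broken))
print(strip_non_capital(dialogue))
```

Both functions clean the broken paragraph, but the negated class also removes the opening quote from a paragraph that starts with dialogue, which is why the targeted search is the safer of the two.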
12-13-2018, 06:42 AM | #15 |
Wizard
Posts: 3,305
Karma: 10259306
Join Date: May 2016
Device: kobo forma, Kobo Libra, Huawei media Tab, fire HD10, PW3 HDX8.9,
|
thanks, but I have done it now, using a not A-Z workaround. looks OK
|