#1 |
Connoisseur
Posts: 62
Karma: 10
Join Date: Jun 2012
Device: Kindle, iPad
|
Non-displayable characters
I recently downloaded a book with non-displayable characters in it. The Kindle Paperwhite shows the Unicode code point in a box. Unfortunately, I couldn't see these codes in the calibre editor and had to use Sigil to fix the text.
The calibre editor seems to ignore the non-displayable characters in both panes. Is there an option I should set so that I can see these characters and replace them? They are typically the left and right quotation marks, the apostrophe and the em dash. Sigil displays them as a question mark in a box. Many thanks |
#2 |
US Navy, Retired
Posts: 9,895
Karma: 13806776
Join Date: Feb 2009
Location: North Carolina
Device: Icarus Illumina XL HD, Kindle PaperWhite SE 11th Gen
|
The calibre editor marks them by highlighting them in yellow. If you put your cursor to the right of the character calibre will display what the character is in the lower right hand corner. Although why you cannot see the characters you cited is beyond me.
Last edited by DoctorOhh; 04-09-2014 at 06:58 AM. |
#3 |
Well trained by Cats
Posts: 30,942
Karma: 60358908
Join Date: Aug 2009
Location: The Central Coast of California
Device: Kobo Libra2,Kobo Aura2v1, K4NT(Fixed: New Bat.), Galaxy Tab A
|
There are two font sets involved: 1) the font used for Code view, and 2) the embedded font for book view (I think this one only applies if there is an @font-face rule). Otherwise, the font-family declaration falls back to whatever is available. |
|
#4 |
Color me gone
Posts: 2,089
Karma: 1445295
Join Date: Apr 2008
Location: Central Oregon Coast
Device: PRS-300
|
Both editor and preview fonts are set in Preferences. These particular characters should show. Change the fonts shown there to something else and see how it looks.
|
#5 |
Connoisseur
Posts: 62
Karma: 10
Join Date: Jun 2012
Device: Kindle, iPad
|
I think I may have misled you. In their normal form, the quotation marks, apostrophes and em dashes all show. However, in the book I downloaded, the left quotation mark was encoded as Unicode 0082 and the right as 0083. These are non-displayable characters in any font, but they were used consistently throughout the book to represent quotation marks.
Therefore, I need to replace these 0082 and 0083 characters with “ and ” respectively. If I open the downloaded text in the calibre editor, none of the 0082 or 0083 characters are visible. I have just determined that each is displayed as a space, which one can copy and search for. In Sigil, they are displayed as a question mark in a square. It is these non-displayable characters for which I am seeking advice. Sorry for the confusion. |
#6 |
Color me gone
Posts: 2,089
Karma: 1445295
Join Date: Apr 2008
Location: Central Oregon Coast
Device: PRS-300
|
You can enter the Unicode code point in a regex and find them that way. You'd need the code point number to search for them. There are many tables on the web, and maybe even on this forum, which will tell you... Actually, the special characters tool in calibre will tell you what the number is.
I think you might be able to use the special characters function of the editor to enter them in Find. |
#7 |
Wizard
Posts: 1,166
Karma: 1410083
Join Date: Nov 2010
Location: Germany
Device: Sony PRS-650
|
Isn't it possible to do a search & replace?
Search for: \u0082
Replace with: “
To see the characters in code view, you may need to change the editor font family in Preferences to a font that includes the characters you need (on Windows, something like Arial Unicode MS). I use this font because it has the most extensive character set. |
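For anyone who prefers to script the same search and replace outside the editor, here is a minimal Python sketch. It assumes the mapping reported earlier in the thread (U+0082 for the opening quote, U+0083 for the closing quote), which is specific to that one book. The names `REPLACEMENTS` and `fix_quotes` are made up for illustration.

```python
import re

# Mapping taken from the thread: U+0082 stood in for the opening quote and
# U+0083 for the closing quote. Other books may need a different table.
REPLACEMENTS = {"\u0082": "\u201c", "\u0083": "\u201d"}

# Character class matching exactly the two stray control characters.
PATTERN = re.compile("[\u0082\u0083]")


def fix_quotes(text: str) -> str:
    """Replace the stray control characters with typographic quotes."""
    return PATTERN.sub(lambda m: REPLACEMENTS[m.group(0)], text)


sample = "\u0082Hello,\u0083 she said."
fixed = fix_quotes(sample)
print(fixed)
```

Text without either control character passes through unchanged, so running it twice is harmless.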
#8 |
Connoisseur
Posts: 62
Karma: 10
Join Date: Jun 2012
Device: Kindle, iPad
|
Many thanks to all for the editing suggestions. The problem though is not editing, but seeing the non-displayable characters even if only in code view. They are not visible. They show as spaces.
I tried Divingduck's font-change suggestion - even though I run on a Mac. Unfortunately the existence of the characters is still not visible. Perhaps Kovid might comment here? |
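One possible workaround for the visibility problem, sketched here in Python and run outside the editor, is to print the text with every control or format character swapped for a visible marker. This is only an illustration, not a calibre feature. The function name `reveal_invisibles` and the `<U+XXXX>` marker style are assumptions.

```python
import unicodedata


def reveal_invisibles(text: str) -> str:
    """Return text with control/format characters shown as <U+XXXX> markers."""
    keep = {"\n", "\r", "\t"}  # ordinary whitespace controls are left alone
    out = []
    for ch in text:
        # Cc = control characters (includes U+0080..U+009F), Cf = format characters
        if ch not in keep and unicodedata.category(ch) in ("Cc", "Cf"):
            out.append(f"<U+{ord(ch):04X}>")
        else:
            out.append(ch)
    return "".join(out)


line = "\u0082Quoted text\u0083 and more"
visible = reveal_invisibles(line)
print(visible)
```

The markers give the code point numbers needed for a later search and replace.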
#9 |
Wizard
Posts: 2,624
Karma: 3120635
Join Date: Jan 2009
Device: Kindle PW3 (wifi)
|
Hi
This is what I found: https://www.mobileread.com/forums/sho...4&postcount=11
The only (clumsy) way today to make this invisible character visible in Code view would be to search for it and replace it with the entity code &# 8239; (without the space). Only then would the "invisible" nnbsp be displayed by its entity code &# 8239; in Code view in the calibre editor. But you would also have to insert the DOCTYPE required for entities... I do not think this would be a good solution.
Last edited by roger64; 04-11-2014 at 09:15 PM. |
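The replacement described above can be sketched in a few lines of Python, assuming the character in question is the narrow no-break space U+202F (decimal 8239). The helper name `nnbsp_to_entity` is hypothetical.

```python
NNBSP = "\u202f"  # narrow no-break space, decimal code point 8239


def nnbsp_to_entity(markup: str) -> str:
    """Swap each narrow no-break space for its numeric character reference."""
    return markup.replace(NNBSP, f"&#{ord(NNBSP)};")


html = "<p>Bonjour\u202f!</p>"
converted = nnbsp_to_entity(html)
print(converted)
```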
|
#10 |
creator of calibre
Posts: 45,251
Karma: 27110894
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
U+0082 is not an opening or a closing quote of any kind. It is a C1 control code (Break Permitted Here). Being invisible is the correct behavior for it. There are almost no fonts that render it. For a list see here: http://www.fileformat.info/info/unic...r/82/index.htm
If you have it in your book, it is almost certainly the result of incorrect character decoding. Simply use search and replace to replace it with the correct character. |
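To illustrate how incorrect decoding can produce such control codes, here is a small Python sketch. It uses Windows-1252 bytes read as Latin-1, a common instance of the problem but only an assumed one: the encoding actually involved in the book from this thread is unknown, and its 0082/0083 codes may come from a different mix-up.

```python
# Bytes as a Windows-1252 file would store curly double quotes around a word.
raw = b"\x93Hello\x94"

wrong = raw.decode("latin-1")  # each byte becomes the code point of the same value
right = raw.decode("cp1252")   # 0x93 and 0x94 are the curly double quotes here

# The wrongly decoded text contains invisible C1 control codes.
print([f"U+{ord(c):04X}" for c in wrong if ord(c) > 0x7F])
print(right)

# If only the wrongly decoded text survives, undoing the bad step may recover it.
repaired = wrong.encode("latin-1").decode("cp1252")
print(repaired == right)
```

When the original encoding cannot be identified, an explicit search and replace as recommended above is the safer route.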
#11 |
Wizard
Posts: 1,166
Karma: 1410083
Join Date: Nov 2010
Location: Germany
Device: Sony PRS-650
|
Oops, sorry. I had not looked at the code. As Kovid suggests, replacing is the best way to solve it.
@Kovid, only an idea: I know your solution is correct, but would it be possible to render Unicode characters that the font does not include with a placeholder in the code view window? Maybe in combination with a switch, so it is used only when needed. |
#12 |
creator of calibre
Posts: 45,251
Karma: 27110894
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
@Divingduck: That is not possible to do without re-implementing the Qt text rendering stack, which is waaaay too much work.
|
#13 |
Wizard
Posts: 1,166
Karma: 1410083
Join Date: Nov 2010
Location: Germany
Device: Sony PRS-650
|
I think it is better to forget my idea. There are much more important things waiting for you to implement.