03-16-2014, 05:25 AM | #1 |
Junior Member
Posts: 2
Karma: 10
Join Date: Mar 2014
Location: NSW, Australia
Device: Nook Simple Touch
|
Huge text encoding problem
Hello,
I have this troublesome epub file.
- It renders fine on Moonreader on my Android tablet.
- It renders with ? replacing special characters (e.g. quotation marks, currency symbols, etc.) on my Nook Simple Touch.
- The special characters are largely missing when viewed in Calibre (latest version).
I think the encoding of the file is wrong somehow.
- When I extract the html and open it in Notepad++, the special characters are missing, like in Calibre.
- When I change the character set in Notepad++ to Windows-1252 or similar (the file itself is still UTF-8), the missing characters appear but are "mangled".
Can this problem be fixed?
- I couldn't do it in Calibre using the conversion or edit tools or the Modify ePub plugin.
- I did try find and replace in Notepad++, but the characters somehow came back as ? when I imported the html into Calibre.
My next course of action could be to root the NST and use a different reader. |
03-16-2014, 06:22 AM | #2 |
The Grand Mouse 高貴的老鼠
Posts: 71,508
Karma: 306214458
Join Date: Jul 2007
Location: Norfolk, England
Device: Kindle Voyage
|
You probably just need to fix the html files so that they declare the character set they're encoded with, whatever that is.
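
A minimal sketch of that idea in Python, assuming the html files have been extracted from the epub first. The function names `decode_html` and `declare_utf8` are made up for illustration, and the fallback to Windows-1252 is only a guess at what a badly authored file might contain: decode the bytes with whatever encoding actually fits, then make the XML prolog say utf-8 and save the text as UTF-8.

```python
import re


def decode_html(raw: bytes) -> tuple[str, str]:
    """Return (text, encoding_used), trying strict UTF-8 first, then Windows-1252."""
    for enc in ("utf-8", "cp1252"):
        try:
            return raw.decode(enc), enc
        except UnicodeDecodeError:
            continue
    # latin-1 maps every byte to a character, so this last resort cannot fail
    return raw.decode("latin-1"), "latin-1"


def declare_utf8(text: str) -> str:
    """Rewrite the encoding named in the XML prolog, if there is one, to utf-8."""
    return re.sub(
        r'(<\?xml[^>]*encoding=["\'])[^"\']+',
        r"\g<1>utf-8",
        text,
        count=1,
    )
```

Typical use would be to read a file in binary mode, pass the bytes through `decode_html`, run `declare_utf8` on the result and write it back with `encoding="utf-8"`. A file with no prolog at all is left unchanged by `declare_utf8`.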
|
03-16-2014, 07:57 AM | #3 |
Junior Member
Posts: 2
Karma: 10
Join Date: Mar 2014
Location: NSW, Australia
Device: Nook Simple Touch
|
I found a solution:
- Installed Sigil.
- Opened the epub in Sigil. The special characters were visible in the code editor view.
- Saved as epub, but the curly quotes still did not display.
- Returned to Sigil and used find and replace to put in proper curly quotes and em dashes. It seems no intervention was required for the other special characters.
- Saved as epub and it looked OK in Calibre.
- Edited the epub in Calibre to my liking.
- Side-loaded to Nook. Done.
Very annoying that Calibre does not display the unusual encoding. If we can't see the characters then we can't use find and replace on them. The html says utf-8 but then there are dozens of character sets, so I'm confused. I will be more selective in downloading epubs in the future: two of my first three downloads were badly authored. |
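
The thread never establishes what the invisible characters actually were. One possible explanation, and it is only an assumption, is that Windows-1252 punctuation ended up stored as the C1 control characters U+0080 to U+009F, which most editors draw as nothing at all. Under that assumption, the manual find and replace above could be sketched in Python like this (`repair_c1_controls` is a made-up name):

```python
def repair_c1_controls(text: str) -> str:
    """Map C1 control characters to the Windows-1252 characters that share their byte value."""
    out = []
    for ch in text:
        if 0x80 <= ord(ch) <= 0x9F:
            try:
                # Reinterpret the code point as a single Windows-1252 byte
                ch = bytes([ord(ch)]).decode("cp1252")
            except UnicodeDecodeError:
                # A few of these byte values are undefined in Windows-1252; leave them alone
                pass
        out.append(ch)
    return "".join(out)
```

If the file contains something else entirely, this function changes nothing, so it is safe to try on a copy and compare the result.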
03-16-2014, 08:10 AM | #4 | |
The Grand Mouse 高貴的老鼠
Posts: 71,508
Karma: 306214458
Join Date: Jul 2007
Location: Norfolk, England
Device: Kindle Voyage
|
Quote:
There aren't multiple character sets with UTF-8. That's rather the point. There are a *lot* of characters though. |
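
A short Python illustration of that point, using the left curly quote as the example character: the character is the same everywhere, only the bytes differ per encoding, and reading the bytes with the wrong encoding gives either "mangled" text or a replacement character.

```python
quote = "\u201c"  # left curly double quote, one Unicode character

utf8_bytes = quote.encode("utf-8")      # three bytes: b'\xe2\x80\x9c'
cp1252_bytes = quote.encode("cp1252")   # one byte:    b'\x93'

# UTF-8 bytes read as Windows-1252 show up as three unrelated characters
mangled = utf8_bytes.decode("cp1252")

# A lone Windows-1252 byte is not valid UTF-8, so a lenient decoder
# substitutes U+FFFD, which many readers draw as "?"
replaced = cp1252_bytes.decode("utf-8", errors="replace")

print(utf8_bytes, cp1252_bytes, mangled, replaced)
```

This matches the two symptoms in the first post in spirit: extra junk characters when the viewer is switched to Windows-1252, and question marks when a reader cannot make sense of the bytes.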
|
03-16-2014, 08:17 AM | #5 | |
Grand Sorcerer
Posts: 5,584
Karma: 22735033
Join Date: Dec 2010
Device: Kindle PW2
|
Quote:
If a font was assigned, delete the line with the definition from the stylesheet and embed a free font with support for curly quotes, for example Charis SIL. Install the font on your system (or use an installed font), open Calibre and select Common Option > Look & Feel > Embed font family and do an epub to epub conversion. This'll embed the font. |
|