Old 03-16-2014, 05:25 AM   #1
stweb
Junior Member
Posts: 2
Karma: 10
Join Date: Mar 2014
Location: NSW, Australia
Device: Nook Simple Touch
Huge text encoding problem

Hello,

I have this troublesome epub file.
- It renders fine on Moonreader on my Android tablet.
- It renders with ? replacing special characters (e.g. quotation marks, currency symbols, etc.) on my Nook Simple Touch.
- The special characters are largely missing when viewed in Calibre (latest version).

I think the encoding of the file is wrong somehow.
- When I extract the html and open it in Notepad++, the special characters are missing, like in Calibre.
- When I switch the Notepad++ character set to Windows-1252 or similar (the file itself is still UTF-8), the missing characters appear but are "mangled".
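
One pattern that would produce exactly these symptoms is a curly quote stored as a C1 control code point (for example U+0093) and then saved as UTF-8. This is only an assumption, since the file itself is not available. A UTF-8 view shows nothing for such a character, while a Windows-1252 view of the same bytes shows the quote preceded by a stray "Â". A minimal Python sketch of the effect, with a made-up sample string:

```python
# Assumption: the opening/closing quotes were stored as the C1 controls
# U+0093 and U+0094 (Windows-1252 bytes 0x93/0x94 misread as Latin-1)
# and the result was then saved as UTF-8.
suspect = "\u0093Hello\u0094"
raw = suspect.encode("utf-8")        # bytes as they would sit in the file

# Viewed as UTF-8: the quotes decode to invisible control characters.
as_utf8 = raw.decode("utf-8")

# Viewed as Windows-1252: the quotes show up, each with a stray "Â" in front.
as_cp1252 = raw.decode("cp1252")

print(repr(as_utf8))
print(as_cp1252)
```

If the extracted html behaves like `as_cp1252` when the view is switched, the file is probably valid UTF-8 that contains the wrong code points, rather than a file in some other encoding.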

Can this problem be fixed?
- I couldn't do it in Calibre using the conversion or edit tools or the Modify ePub plugin.
- I did try find and replace in Notepad++, but the characters somehow came back as ? when I imported the html into Calibre.

My next course of action could be to root the NST and use a different reader.
Old 03-16-2014, 06:22 AM   #2
pdurrant
The Grand Mouse 高貴的老鼠
Posts: 71,508
Karma: 306214458
Join Date: Jul 2007
Location: Norfolk, England
Device: Kindle Voyage
You probably just need to fix the html files so that they declare the character set they're encoded with, whatever that is.
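
As a rough illustration of declaring the character set, the Python sketch below puts an XML declaration naming UTF-8 at the top of an XHTML string. It assumes the text really is UTF-8 and will be written back as UTF-8; changing the declaration does not convert the bytes. The function name `ensure_xml_declaration` is made up for this example.

```python
import re

XML_DECL = '<?xml version="1.0" encoding="utf-8"?>\n'

def ensure_xml_declaration(xhtml: str) -> str:
    """Return xhtml with a single XML declaration naming UTF-8 at the top."""
    # Drop a leading byte order mark if one survived decoding.
    body = xhtml.lstrip("\ufeff")
    # Drop any existing declaration, which may name a different encoding.
    body = re.sub(r"^\s*<\?xml[^>]*\?>\s*", "", body, count=1)
    return XML_DECL + body

print(ensure_xml_declaration('<?xml version="1.0" encoding="iso-8859-1"?>\n<html/>'))
```

Inside the `<head>`, a matching `<meta http-equiv="Content-Type" content="text/html; charset=utf-8"/>` line serves the same purpose for readers that ignore the XML declaration.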
Old 03-16-2014, 07:57 AM   #3
stweb
I found a solution:
- Installed Sigil.
- Opened the epub in Sigil - the special characters were visible in the code editor view.
- Saved as epub, but the curly quotes still did not display.
- Returned to Sigil and used find and replace with the proper curly quotes and em dashes. It seems no intervention was required for the other special characters.
- Saved as epub and it looked OK in Calibre.
- Edited the epub in Calibre to my liking.
- Side-loaded to Nook. Done.
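
The find-and-replace step can also be scripted. The Python sketch below is not what Sigil does internally; it assumes the bad characters were C1 control code points (U+0080 to U+009F) standing in for Windows-1252 punctuation, and maps each one to the character Windows-1252 assigns to that byte. The function name `repair_c1_controls` is made up for this example.

```python
def repair_c1_controls(text: str) -> str:
    """Replace C1 control characters with their Windows-1252 meaning."""
    out = []
    for ch in text:
        if "\u0080" <= ch <= "\u009f":
            try:
                # Reinterpret the code point as a single Windows-1252 byte.
                ch = bytes([ord(ch)]).decode("cp1252")
            except UnicodeDecodeError:
                # A few bytes (e.g. 0x81) are unassigned in Windows-1252;
                # leave those characters untouched.
                pass
        out.append(ch)
    return "".join(out)

fixed = repair_c1_controls("\u0093Hi\u0094 \u0097 \u0080")
print(fixed)
```

Run over each html file extracted from the epub, this turns U+0093/U+0094 into proper curly quotes, U+0097 into an em dash and U+0080 into the euro sign, and leaves all other text alone.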

It's very annoying that Calibre does not display the unusual encoding: if we can't see the characters, we can't use find and replace. The html says utf-8, but then there are dozens of character sets, which left me confused.

I will be more selective in downloading epubs in the future: two of my first three downloads were badly authored.
Old 03-16-2014, 08:10 AM   #4
pdurrant
It sounds like the quotation marks might not have been the standard quotation marks.

There aren't multiple character sets with UTF-8. That's rather the point. There are a *lot* of characters though.
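
A short Python illustration of that point: UTF-8 is a single encoding that covers every Unicode character, using between one and four bytes per character. The helper name `describe` and the sample characters are chosen for this example only.

```python
def describe(ch: str) -> str:
    """Show a character's Unicode code point and its UTF-8 bytes."""
    encoded = ch.encode("utf-8")
    return f"U+{ord(ch):04X} -> {encoded.hex(' ')}"

# Plain letter, left curly quote, em dash, euro sign, CJK character.
for ch in ["A", "\u201c", "\u2014", "\u20ac", "\u9ad8"]:
    print(describe(ch))
```

The same rule produces all of these byte sequences, so a file declared as UTF-8 never needs a second character set for "special" characters.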
Old 03-16-2014, 08:17 AM   #5
Doitsu
Grand Sorcerer
Posts: 5,584
Karma: 22735033
Join Date: Dec 2010
Device: Kindle PW2
Quote:
Originally Posted by stweb
- Opened the epub in Sigil - the special characters were visible in the code editor view.
If the special characters were visible in Sigil, it's most likely a font issue. If you've saved a copy of the ePub, try to figure out what font was assigned to body text and/or paragraphs. (Usually Calibre adds a <span class="CalibreX">...</span>.)
If a font was assigned, delete the line with the definition from the stylesheet and embed a free font with support for curly quotes, for example Charis SIL.
Install the font on your system (or use an installed font), open Calibre and select Common Option > Look & Feel > Embed font family and do an epub to epub conversion. This'll embed the font.
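
To find out which fonts the book assigns, one option is to list the font-family declarations in its stylesheets. The Python sketch below treats the ePub as a zip archive and scans every .css file in it. The function names and the regular expression are made up for this example, and the pattern is a simple approximation rather than a full CSS parser.

```python
import re
import zipfile

# Captures the value of each font-family declaration up to ";" or "}".
FONT_FAMILY = re.compile(r"font-family\s*:\s*([^;}]+)", re.IGNORECASE)

def font_families(css_text: str) -> list[str]:
    """Return each font-family value found in a stylesheet, in order."""
    return [m.group(1).strip() for m in FONT_FAMILY.finditer(css_text)]

def fonts_in_epub(path: str) -> dict[str, list[str]]:
    """Map each .css file inside the ePub to its font-family values."""
    found = {}
    with zipfile.ZipFile(path) as book:
        for name in book.namelist():
            if name.lower().endswith(".css"):
                css = book.read(name).decode("utf-8", errors="replace")
                found[name] = font_families(css)
    return found

sample = 'body { font-family: "Georgia", serif; }\n.calibre1 { font-family: Arial }'
print(font_families(sample))
```

Calling `fonts_in_epub("book.epub")` (a placeholder path) on a saved copy of the book shows which stylesheet rules would need to be deleted or pointed at the embedded font.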

Tags
calibre, encoding, epub


All times are GMT -4.

