03-08-2010, 02:23 PM | #1 |
Zealot
Posts: 147
Karma: 56
Join Date: Dec 2009
Location: Antwerpen
Device: iPhone, Sony PRS-505, EPUBreader
|
Encoding declaration in OPF and TOC?
I've made a lot of books with Sigil, and sometimes I import them into Calibre. After doing that, I often see accented characters displayed incorrectly (e.g. Chinese characters instead of Latin ones) in the TOC and in the metadata.
About the first problem (TOC), I wrote to the programmer of Calibre, and his response was "Bug fixed: When decoding NCX toc files, if no encoding is declared and detection has less than 100% confidence, assume UTF-8." So I understand that the TOC should have an encoding declaration. Can Sigil add this automatically? As I understand it, Sigil produces perfectly valid UTF-8 but doesn't declare it. I also wrote to Calibre about the second problem (errors in the metadata), and the answer was similar: stick an encoding declaration in the OPF. Hence my similar question: can Sigil add an encoding declaration to the OPF? Thanks! |
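For illustration, here is roughly what "an encoding declaration in the OPF" means in practice. This is only a sketch using Python's standard `xml.etree.ElementTree` (Python 3.8 or later), not how Sigil or Calibre actually write their files; the function name `build_opf_bytes` and the minimal package structure are made up for the example.

```python
import xml.etree.ElementTree as ET

OPF_NS = "http://www.idpf.org/2007/opf"
DC_NS = "http://purl.org/dc/elements/1.1/"


def build_opf_bytes(title: str) -> bytes:
    """Serialize a tiny OPF-like document as UTF-8 with an explicit declaration.

    Hypothetical helper for illustration; a real OPF needs far more metadata.
    """
    ET.register_namespace("", OPF_NS)
    ET.register_namespace("dc", DC_NS)
    package = ET.Element(f"{{{OPF_NS}}}package", {"version": "2.0"})
    metadata = ET.SubElement(package, f"{{{OPF_NS}}}metadata")
    ET.SubElement(metadata, f"{{{DC_NS}}}title").text = title
    # xml_declaration=True writes the <?xml ... encoding=...?> prolog,
    # so a reader does not have to guess the encoding.
    return ET.tostring(package, encoding="utf-8", xml_declaration=True)


opf = build_opf_bytes("Het verhaal van é en ü")
```

The first line of `opf` is the XML declaration naming the encoding, and the accented characters in the title are stored as plain UTF-8 bytes.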
03-08-2010, 02:55 PM | #2 | ||
Created Sigil, FlightCrew
Posts: 1,982
Karma: 350515
Join Date: Feb 2008
Device: Kobo Clara HD
|
Quote:
In plain English, UTF-8 is the default character encoding for XML. I thought everyone knew that. But I'll add the attribute, it can't hurt. EDIT: And here's the source. Just invert the negatives. Quote:
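A quick way to see the default in action, sketched with Python's standard `xml.etree.ElementTree` (which uses the expat parser): bytes with no declaration are read as UTF-8, so the same text saved as Latin-1 without a declaration is rejected. The helper name `parses_without_declaration` is invented for this example.

```python
import xml.etree.ElementTree as ET


def parses_without_declaration(raw: bytes) -> bool:
    """Return True if undeclared bytes parse as XML (i.e. are valid UTF-8 XML)."""
    try:
        ET.fromstring(raw)
        return True
    except ET.ParseError:
        return False


# Same document, no encoding declaration, two different byte encodings.
utf8_bytes = "<title>Café</title>".encode("utf-8")
latin1_bytes = "<title>Café</title>".encode("latin-1")

utf8_ok = parses_without_declaration(utf8_bytes)
latin1_ok = parses_without_declaration(latin1_bytes)
```

Here `utf8_ok` is true and `latin1_ok` is false: without a declaration the parser assumes UTF-8, and the lone Latin-1 byte for "é" is not valid UTF-8.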
Last edited by Valloric; 03-08-2010 at 03:11 PM. |
||
03-08-2010, 03:18 PM | #3 | |
Zealot
Posts: 147
Karma: 56
Join Date: Dec 2009
Location: Antwerpen
Device: iPhone, Sony PRS-505, EPUBreader
|
Quote:
Thanks! It will save a lot of trouble in many cases. |
|
03-08-2010, 03:21 PM | #4 |
creator of calibre
Posts: 43,927
Karma: 22669820
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
Oh, if only everyone knew that, and no one produced XML files in an encoding other than UTF-8 with no encoding declaration.
|
03-08-2010, 03:32 PM | #5 | |
Created Sigil, FlightCrew
Posts: 1,982
Karma: 350515
Join Date: Feb 2008
Device: Kobo Clara HD
|
Quote:
But you really should fall back to the standard when byte stream fingerprinting isn't 100% sure of the encoding. |
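The fallback rule being discussed can be sketched as follows. This is an illustration only, not calibre's actual code: `choose_encoding` is a made-up name, and `detect` stands in for whatever byte-stream detector is used, assumed here to return a `(name, confidence)` pair. For brevity the sketch ignores byte order marks.

```python
import re
from typing import Callable, Tuple

# Matches encoding="..." inside an XML declaration at the very start of the file.
_DECL_RE = re.compile(
    rb"""^<\?xml[^>]*?encoding\s*=\s*["']([A-Za-z][A-Za-z0-9._-]*)["']"""
)


def choose_encoding(raw: bytes, detect: Callable[[bytes], Tuple[str, float]]) -> str:
    """Pick an encoding: declaration first, then a fully confident guess, else UTF-8."""
    match = _DECL_RE.match(raw)
    if match:
        # An explicit declaration always wins.
        return match.group(1).decode("ascii")
    guess, confidence = detect(raw)
    if confidence >= 1.0:
        # Only trust fingerprinting when it is completely sure.
        return guess
    # No declaration and an uncertain guess: fall back to the XML default.
    return "utf-8"


# Hypothetical detector that is only 60% sure the bytes are Big5.
unsure = lambda raw: ("big5", 0.6)
picked = choose_encoding("<ncx>é</ncx>".encode("utf-8"), unsure)
```

With the unsure detector above, `picked` is `"utf-8"`, which is the behaviour the bug fix describes: an undeclared file is no longer decoded with a low-confidence guess.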
|
03-08-2010, 03:48 PM | #8 |
creator of calibre
Posts: 43,927
Karma: 22669820
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
The problem is that byte stream fingerprinting is almost never a hundred percent certain.
|