|12-24-2011, 03:52 PM||#1|
Join Date: Dec 2011
mobi conversion loses latin-extended-additional unicode characters
Greetings to all,
I've been going around and around trying to get Calibre to convert Latin Extended Additional characters, from either an HTML import or a Sigil-produced EPUB, to MOBI output. It looks fine, of course, in Calibre's LRF reader, but when I open the result in Kindle for PC, all the "dot-under" and "dot-over" consonants are boxes. Interestingly, I can convert the same EPUB from Sigil with KindleGen, and the diacritics are fine. The rest of the formatting is monstrous, of course, which is why I'd really like to get over this hump in Calibre.
I also don't think Calibre is truncating the Unicode on import: I converted to .mobi in Calibre, zipped up the debug output, changed the extension to .epub, and converted that to MOBI in KindleGen, and the diacritics rendered just fine. The formatting was also a bit better, somewhere in between the pure KindleGen and the straight Calibre MOBI conversion.
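For anyone who wants to repeat the "zip up the debug output and rename it to .epub" step, here is a rough Python sketch of how it could be scripted. The function name and the folder layout are my own assumptions, not anything Calibre provides. The one detail I'm sure of is that an EPUB container expects its `mimetype` file to be the first entry and stored uncompressed, so the sketch handles that if such a file is present in the folder.

```python
import os
import zipfile


def zip_dir_as_epub(src_dir, epub_path):
    """Pack src_dir into epub_path (hypothetical helper, not a Calibre API).

    If a 'mimetype' file exists at the top of src_dir, it is written first
    and uncompressed, as the EPUB container format expects.
    epub_path should live outside src_dir so the archive does not include itself.
    """
    with zipfile.ZipFile(epub_path, "w") as zf:
        mimetype = os.path.join(src_dir, "mimetype")
        if os.path.isfile(mimetype):
            zf.write(mimetype, "mimetype", compress_type=zipfile.ZIP_STORED)
        for root, _dirs, files in os.walk(src_dir):
            for name in sorted(files):
                full = os.path.join(root, name)
                # Archive names always use forward slashes.
                arc = os.path.relpath(full, src_dir).replace(os.sep, "/")
                if arc == "mimetype":
                    continue  # already written above
                zf.write(full, arc, compress_type=zipfile.ZIP_DEFLATED)
```

Calling `zip_dir_as_epub("debug_output", "from_debug.epub")` (both names are placeholders) would give a file that can then be handed to KindleGen.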
I've also:
1) checked the HTML for "utf-8" encoding declarations: good;
2) set the input encoding under "Look & Feel" to cp1252, utf-8, and latin1 in turn, to see if that made a difference: none;
3) tried setting the input encoding to utf-8 from the command line: no change;
4) compared the HTML, toc.ncx, and content.opf in the Calibre MOBI to the ones passed through KindleGen afterwards: no difference;
5) modified the htmltozip plugin to specify utf-8, and then imported from HTML: no difference.
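One more check that may help narrow this down is to confirm which characters in a given file actually fall in the Latin Extended Additional block (U+1E00 to U+1EFF). The sketch below is only an illustration; the function name is made up. It normalizes to NFC first, on the assumption that some sources store a dotted consonant as a base letter plus a combining dot, which would otherwise be missed.

```python
import unicodedata

# Code point range of the Latin Extended Additional block.
LEA_START, LEA_END = 0x1E00, 0x1EFF


def find_latin_extended_additional(text):
    """Return (char, 'U+XXXX', name) for each distinct character in the block.

    The text is normalized to NFC first so that decomposed sequences
    (base letter + combining mark) are counted as their precomposed form.
    """
    composed = unicodedata.normalize("NFC", text)
    found = {ch for ch in composed if LEA_START <= ord(ch) <= LEA_END}
    return [
        (ch, f"U+{ord(ch):04X}", unicodedata.name(ch, "UNKNOWN"))
        for ch in sorted(found)
    ]
```

Reading a file with `open(path, encoding="utf-8").read()` and passing the text to this function on both the input HTML and the HTML in the debug output would show whether the same set of characters is present at each stage.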
As a side bit of interest, I can embed fonts that have the Latin Extended Additional characters into an EPUB with Sigil, and it works great in ADE (Adobe Digital Editions). If I try to embed them with Calibre, or do an EPUB-to-EPUB conversion with Calibre, the same loss occurs, despite the fact that I can see the embedded font showing up in the reader!
So very curious. I'd be grateful to know if anybody has any ideas what's going on. Best wishes.
|calibre, conversion, diacritics, latin-extended-additional, mobi|