Old 08-26-2010, 09:28 AM   #6
Valloric
Created Sigil, FlightCrew
Posts: 1,982
Karma: 350515
Join Date: Feb 2008
Device: Kobo Clara HD
Quote:
Originally Posted by MacEachaidh
I have several epub files created with standard western font encoding, that I have opened in Sigil to embed cover graphics. After saving them, I find that typographic quotes have been converted to ... I'm not sure what. Unicode?

For instance, a typographic apostrophe now appears as ’, an open double-quote as “ and a close double-quote as â€�.
Quote:
Originally Posted by Dave_S
FWIW, I usually see that kind of mixup when encoding="xxxx" is at the beginning of the XHTML, but the text is actually encoded as "yyyy", where xxxx and yyyy are two different text encoding schemes. That usually seems to happen when MS tools are used, which generate win-1252 encoding, which is badly incompatible with UTF-8 for special characters.
Dave_S has it right. Your encodings are mixed up. The file is probably declaring the use of one encoding, but actually using another. Or it's declaring the use of two encodings (which is impossible).
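
If you want to check a file for this kind of mismatch yourself, here is a rough sketch in Python. It is not what Sigil does internally; the helper names and the sample bytes are made up for illustration, and the check is only a heuristic: non-ASCII bytes that happen to form valid UTF-8 are very rarely anything other than UTF-8, so a file that declares some other encoding but passes that test deserves a closer look. It also assumes the XML declaration sits at the very start of the file, with no byte order mark in front of it.

```python
import re

def declared_encoding(raw: bytes):
    """Return the encoding named in the XML declaration, or None if absent."""
    m = re.match(rb'<\?xml[^>]*encoding=["\']([A-Za-z0-9._-]+)["\']', raw)
    return m.group(1).decode("ascii") if m else None

def decodes_as(raw: bytes, encoding: str) -> bool:
    """True if the bytes form a valid sequence in the given encoding."""
    try:
        raw.decode(encoding)
        return True
    except UnicodeDecodeError:
        return False

# Hypothetical sample: the header claims windows-1252, but the curly
# apostrophe is stored as the three UTF-8 bytes E2 80 99.
sample = b'<?xml version="1.0" encoding="windows-1252"?><p>It\xe2\x80\x99s</p>'

declared = declared_encoding(sample)
valid_utf8 = decodes_as(sample, "utf-8")
has_non_ascii = any(b > 0x7F for b in sample)

# Heuristic only: declared as something other than UTF-8, yet the
# non-ASCII bytes are well-formed UTF-8.
suspicious = (
    declared is not None
    and declared.lower() not in ("utf-8", "utf8")
    and has_non_ascii
    and valid_utf8
)
print(declared, suspicious)
```

Note that the same bytes also decode without error as windows-1252, which is exactly why the mismatch goes unnoticed until you look at the text.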

Sigil looks at the encoding declared and converts the bytestream from that encoding into UTF-16. As long as the file is truthful about its encoding, this works great.
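
To see what happens when the file is not truthful, here is a small Python illustration. It is only a sketch of the general effect, not Sigil's code: Python's str stands in for the internal Unicode representation, and I am assuming the common case where the text is really UTF-8 but gets decoded as windows-1252 (cp1252 in Python).

```python
# Text with a typographic apostrophe and curly double quotes.
original = "It\u2019s \u201cquoted\u201d"

# What is actually stored in the file: UTF-8 bytes.
utf8_bytes = original.encode("utf-8")

# Decode using the wrong (declared) encoding. Byte 0x9D has no meaning
# in cp1252, so errors="replace" turns it into U+FFFD.
garbled = utf8_bytes.decode("cp1252", errors="replace")
print(garbled)

# When no byte was lost, reversing the wrong step recovers the text.
damaged_apostrophe = "It\u2019s".encode("utf-8").decode("cp1252")
repaired = damaged_apostrophe.encode("cp1252").decode("utf-8")
print(damaged_apostrophe, "->", repaired)
```

Each curly quote is three bytes in UTF-8, and each byte gets turned into its own windows-1252 character, which is where the three-character junk comes from. The closing double quote is the worst case because one of its bytes is undefined in windows-1252, so that one cannot be reliably round-tripped back.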

Quote:
Originally Posted by Jellby
A trick: when I need to differentiate between single quotes and apostrophes, I use ‘ ’ for the quotes and & #8217; (without space) for the apostrophe. The character is exactly the same, but that allows for better search and replace in the future. I don't know if Sigil would keep this, though.
It should.
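
For anyone wondering why the trick is useful, here is a quick Python sketch (the sample sentence is made up). The numeric reference and the literal character are different text in the source, so you can search for them separately, but they resolve to the same code point, U+2019, once the markup is parsed.

```python
import html
import re

# Hypothetical source line: literal curly quotes around the speech,
# a numeric character reference for the apostrophe.
source = "\u2018It&#8217;s fine,\u2019 she said."

# In the source the two are distinguishable.
entity_apostrophes = re.findall(r"&#8217;", source)
literal_right_quotes = re.findall("\u2019", source)
print(len(entity_apostrophes), len(literal_right_quotes))

# After the reference is resolved, both are the same character.
rendered = html.unescape(source)
print(rendered.count("\u2019"))
```

This only helps for as long as the file keeps the reference as written; any tool that rewrites it to the literal character on save loses the distinction.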