Old 06-29-2010, 03:26 PM   #1
crutledge
eBook FANatic
Posts: 18,301
Karma: 16071131
Join Date: Apr 2008
Location: Alabama, USA
Device: HP ipac RX5915 Wife's Kindle
Sigil, UTF-8 and the emdash

I have just run across a strange occurrence with Sigil. I loaded an HTML file into Sigil and noticed that all of the emdashes (—) had disappeared. The original file had 250 occurrences of the emdash.

I went back to the original HTML file and changed the character set from Windows 1252: Western European to UTF-8 (which Sigil uses), and all of the emdashes disappeared. I then went back to Windows 1252: Western European, replaced every (—) with &#8212; , converted back to UTF-8, and all of the emdashes reappeared. I then loaded the file into Sigil and all emdashes were present. This appears to be a UTF-8 problem.
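For anyone who wants to script that replacement rather than do it by hand, here is a rough sketch in Python. It assumes the source file really is Windows-1252; the function names to_numeric_entities and non_ascii_counts are made up for illustration and are not part of Sigil. It rewrites every non-ASCII character, not just the emdash, as a numeric character reference, so the output is plain ASCII and reads the same whether it is later treated as Windows-1252 or UTF-8.

```python
from collections import Counter


def to_numeric_entities(raw: bytes, source_encoding: str = "cp1252") -> bytes:
    """Decode the bytes and rewrite every non-ASCII character as &#NNNN;.

    source_encoding is an assumption: pass the encoding the file was
    actually saved in.
    """
    text = raw.decode(source_encoding)
    # xmlcharrefreplace is a standard codec error handler; it turns an
    # unencodable character such as U+2014 into b"&#8212;".
    return text.encode("ascii", errors="xmlcharrefreplace")


def non_ascii_counts(raw: bytes, source_encoding: str = "cp1252") -> Counter:
    """Count each non-ASCII character so you can see what else is at risk."""
    text = raw.decode(source_encoding)
    return Counter(ch for ch in text if ord(ch) > 127)


if __name__ == "__main__":
    sample = b"one\x97two \x93quoted\x94"  # 0x97 is the emdash in Windows-1252
    print(to_numeric_entities(sample))
    print(non_ascii_counts(sample))
```

Running non_ascii_counts over a file before conversion gives a quick list of other candidates to watch for. In a Windows-1252 file those are typically curly quotes, the endash, the ellipsis and accented letters, since none of them are stored as ASCII bytes.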

As a pre-process, all HTML files will have to be edited prior to loading into Sigil unless someone has come up with a workaround. Are there any other characters to watch for?
Curiouser and curiouser!!

Last edited by crutledge; 06-29-2010 at 03:28 PM.