12-23-2024, 06:17 PM   #2
KevinH
Sigil Developer
If the HTML extension writes the epub file and adds meta charset="iso-8859-1", that means the text is encoded that way. It must be read in as iso-8859-1 and then properly recoded to utf-8. Sigil can actually detect that encoding and convert the text to utf-8 itself, but not if you change that charset meta info or improperly add an xml header saying the file is utf-8 when it is not.

A better technique is to use a Python script to read each HTML file in as iso-8859-1 (sometimes called latin-1), recode it, and write it out as utf-8.
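
A rough sketch of such a script is below. It assumes the epub has already been unzipped to a folder and that every HTML file in it really is iso-8859-1. The names recode_file and recode_folder are made up for this example, and rewriting the charset declaration after the recode is my own addition so the meta info matches the new encoding.

```python
import re
from pathlib import Path

SOURCE_ENCODING = "iso-8859-1"  # assumption: every file really is latin-1

# Matches charset=iso-8859-1 in a meta tag, with or without quotes.
CHARSET_RE = re.compile(r'(charset\s*=\s*["\']?)iso-8859-1', re.IGNORECASE)


def recode_file(path):
    """Read one file as iso-8859-1 and write it back as utf-8."""
    path = Path(path)
    # Work on raw bytes so line endings are left exactly as they were.
    text = path.read_bytes().decode(SOURCE_ENCODING)
    # Only now that the text is being written as utf-8 is it safe to
    # change the declared charset to match.
    text = CHARSET_RE.sub(r"\1utf-8", text)
    path.write_bytes(text.encode("utf-8"))


def recode_folder(folder):
    """Recode every .htm/.html/.xhtml file under folder; return the paths changed."""
    changed = []
    for path in sorted(Path(folder).rglob("*")):
        if path.is_file() and path.suffix.lower() in {".htm", ".html", ".xhtml"}:
            recode_file(path)
            changed.append(path)
    return changed
```

Call recode_folder on the unzipped epub folder, and run it only once, on a copy of the files. Decoding as latin-1 never raises an error, so a second pass over files that are already utf-8 would silently corrupt every accented character.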