MobileRead Forums - View Single Post - Making sense of faulty HTML to EPUB conversion

Shohreh · 12-23-2024, 06:26 PM

Hello,

I'm using a browser extension to convert HTML pages into EPUB files.

It works fine when the web page is in UTF-8, but it mishandles pages encoded in ISO-8859-1: accented characters are replaced with question marks. The attached screenshots show the EPUB file opened in SumatraPDF and Sigil.

To make matters worse, the extension replaces the encoding meta line with charset="iso-8859-1".

I'd like to understand why accented characters are replaced with question marks. Is it a font issue? Or a problem with byte values?
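To make the "byte values" hypothesis concrete, here is a minimal Python sketch. It is not what the extension actually does, which I don't know. It only assumes that the page's bytes stay ISO-8859-1 while something reads them as UTF-8. The sample word is hypothetical.

```python
# Hypothetical sample text, not taken from the actual web page.
text = "café"

# The same character has different byte values in each encoding.
latin1_bytes = text.encode("iso-8859-1")  # é is the single byte 0xE9
utf8_bytes = text.encode("utf-8")         # é is the two bytes 0xC3 0xA9

# Assumption: the bytes are ISO-8859-1 but get decoded as UTF-8.
# A lone 0xE9 is not a valid UTF-8 sequence, so a lenient decoder
# substitutes U+FFFD, which many viewers draw as a question mark.
misread = latin1_bytes.decode("utf-8", errors="replace")

# A literal "?" can also appear when text is forced into an encoding
# that has no slot for the character, such as ASCII.
forced_ascii = text.encode("ascii", errors="replace")

print(latin1_bytes.hex(" "))
print(utf8_bytes.hex(" "))
print(repr(misread))
print(forced_ascii)
```

If the bytes in the EPUB's XHTML file turn out to be 0xE9 rather than 0xC3 0xA9 (or an actual 0x3F, i.e. "?"), that would point to a byte-value problem rather than a font problem.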

Thank you.

---
Edit: I should have typed: the extension replaces the encoding meta line with charset="utf-8".

[SOLVED] · Last edited by Shohreh; 12-24-2024 at 09:51 AM.