Old 02-27-2023, 10:49 AM   #36
isarl
Quote:
Originally Posted by KIE18 View Post
I know this is not a sigil issue. I was hoping that someone here would help me solve the problem with the book.
If you are on Linux then the iconv utility is likely already installed for you and can perform character encoding conversions. An outline of the process would probably be something like:

1. Explode/unzip your ebook into a temporary directory.

2. For each file in the temporary directory that has the wrong encoding, convert it to UTF-8. NOTE: if you tell iconv the wrong source encoding, it can introduce more encoding errors. An example follows. The curly quotes I use are UTF-8 encoded characters, so after running the echo command the file “example” is already UTF-8 encoded; converting it again as if it were cp1252 mangles it.

Code:
# This example demonstrates an encoding error! Be careful not to do this by mistake.
$ echo '“Hello, world!”' >example
$ iconv -f cp1252 -t utf-8 example
â€œHello, world!â€iconv: illegal input sequence at position 18
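One way to guard against the mistake above is to check whether a file already decodes as UTF-8 before converting it. This is only a minimal sketch, assuming GNU iconv (which exits with a non-zero status when it hits an illegal input sequence); is_valid_utf8 is a name I made up for illustration, not something iconv provides:

```shell
# Succeed (exit status 0) if the file decodes cleanly as UTF-8, fail otherwise.
# is_valid_utf8 is a hypothetical helper name, not part of iconv.
is_valid_utf8() {
    iconv -f utf-8 -t utf-8 "$1" >/dev/null 2>&1
}

# Usage: skip files that are already valid UTF-8.
# if is_valid_utf8 "$file" ; then echo "skipping $file" ; fi
```

Plain ASCII files pass this check too, since ASCII is a subset of UTF-8. It cannot detect text that has already been double-encoded, because mojibake is still valid UTF-8.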
Here is some bash-like shell code that might work, if you have chardetect available (on my system, this executable is provided by the python-chardet package):

Code:
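# Convert every file that chardetect identifies as Windows-1252 to UTF-8.
# Reading NUL-delimited names from find keeps paths with spaces intact.
find . -type f -print0 | while IFS= read -r -d '' file ; do
    if [[ $(chardetect "$file" --minimal) == "Windows-1252" ]] ; then
        # Replace the original only if the conversion succeeded.
        iconv -f cp1252 -t utf-8 "$file" -o "$file.utf8" && mv "$file.utf8" "$file"
    fi
done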
3. Zip your book back up.
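
Steps 1 and 3 can be sketched as below, assuming the book is an EPUB and that the Info-ZIP zip and unzip commands are installed. The EPUB container format expects the mimetype file to be the first entry in the archive and stored without compression, which is why it is added separately. unpack_epub and repack_epub are hypothetical helper names, and the file names in the usage comments are placeholders:

```shell
# Step 1: unpack the book ($1) into a directory ($2).
unpack_epub() {
    unzip -q "$1" -d "$2"
}

# Step 3: rebuild the book ($2) from a directory ($1).
repack_epub() {
    local src="$1" out
    # Make the output path absolute, because we change directory below.
    out="$(cd "$(dirname "$2")" && pwd)/$(basename "$2")" || return 1
    rm -f "$out"
    (
        cd "$src" || exit 1
        # mimetype goes in first: stored uncompressed (-0), no extra fields (-X).
        zip -q -X0 "$out" mimetype || exit 1
        # Everything else: recursive, compressed, no directory entries (-D).
        zip -q -Xr9D "$out" . -x mimetype
    )
}

# Usage (placeholder names):
# unpack_epub book.epub /tmp/book
# ... fix the encodings inside /tmp/book ...
# repack_epub /tmp/book book-fixed.epub
```

Writing to a new file name rather than overwriting the original keeps an untouched copy around in case a conversion went wrong.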

Character encodings can be a pain in the butt. I hope this helps.