02-27-2023, 11:54 PM   #41
KIE18
Enthusiast
Posts: 28
Karma: 10
Join Date: Feb 2023
Device: none
Quote:
Originally Posted by isarl View Post
If you are on Linux then the iconv utility is likely already installed for you and can perform character encoding conversions. An outline of the process would probably be something like:

1. Explode/unzip your ebook into a temporary directory.

2. For each file in the temporary directory which has the wrong encoding, convert it to UTF-8. NOTE: if you are not careful about which encodings you use, iconv can introduce more encoding errors. An example follows. The curly quotes I type are UTF-8 encoded characters, so after running the echo command the file “example” is already UTF-8 encoded; telling iconv to read it as cp1252 then mangles the quotes.

Code:
# This example demonstrates an encoding error! Be careful not to do this by mistake.
$ echo '“Hello, world!”' >example
$ iconv -f cp1252 -t utf-8 example
â€œHello, world!â€iconv: illegal input sequence at position 18
Here is some bash-like shell code that might work, if you have chardetect available (on my system, this executable is provided by the python-chardet package):

Code:
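# Convert every file that chardetect reports as Windows-1252 to UTF-8.
# NUL-delimited names from find keep paths containing spaces intact.
find . -type f -print0 | while IFS= read -r -d '' file ; do
    if [[ $(chardetect "$file" --minimal) == "Windows-1252" ]] ; then
        iconv -f cp1252 -t utf-8 "$file" -o "$file.utf8" && mv "$file.utf8" "$file"
    fi
done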
3. Zip your book back up.
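
One way to guard against the double-conversion error shown in step 2 is to check whether a file already decodes cleanly as UTF-8 before converting it. The following is only a rough sketch under stated assumptions: the helper names is_valid_utf8 and to_utf8_once are made up for illustration, the source encoding is assumed to be cp1252, and the check is a heuristic (a pure-ASCII file counts as valid UTF-8 and is skipped, which is harmless).

```shell
# Hypothetical helper: succeeds only if the file already decodes as UTF-8.
# iconv exits non-zero when it meets a byte sequence that is invalid in
# the declared input encoding.
is_valid_utf8() {
    iconv -f utf-8 -t utf-8 "$1" >/dev/null 2>&1
}

# Hypothetical helper: convert from cp1252 (an assumption) only when the
# file is NOT already valid UTF-8, so running it twice does no harm.
to_utf8_once() {
    file=$1
    if is_valid_utf8 "$file"; then
        echo "skip (already UTF-8): $file"
    elif iconv -f cp1252 -t utf-8 "$file" >"$file.utf8"; then
        mv "$file.utf8" "$file"
        echo "converted: $file"
    else
        # Conversion failed: drop the partial output, leave the original alone.
        rm -f "$file.utf8"
        echo "failed: $file" >&2
        return 1
    fi
}
```

Calling to_utf8_once on each file inside the loop, instead of calling iconv directly, should make the conversion safe to re-run if it gets interrupted partway through a book.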

Character encodings can be a pain in the butt. I hope this helps.
I am using Windows 10.