Old 07-25-2009, 07:06 AM   #1
mjmcleod
Connoisseur
Posts: 55
Karma: 11501
Join Date: Jul 2009
Location: Australia
Device: Galaxy Tab
Converting eReader books

I have a collection of eReader books that I've built up over quite a few years. Having "liberated" one using ereader2html.py, I'm now trying to convert it to EPUB.

What I'm finding is that the resulting output has lots of odd characters, particularly next to quote marks and em dashes, but also in place of letters that are supposed to be accented.

Searching around, I found the suggestion to make sure that "cp1252" is specified in the "source encoding" field when doing a conversion. That hasn't worked for me.

The one thing I have found that works reliably is to use Mobipocket Creator to convert the book from HTML to MOBI, and then Calibre has no problem converting the result to EPUB. Creator is correctly identifying the source as being in CP1252 and takes care of it. But this is not exactly an automated process.
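For reference, here is a minimal sketch of how the re-encoding step might be scripted instead of going through Creator. It assumes the HTML produced by ereader2html.py really is CP1252 (which is what Creator's detection suggests, but I haven't confirmed it), and that rewriting the first charset declaration in the file is enough. The function names and the file paths passed to `convert_file` are made up for illustration.

```python
import re
from pathlib import Path


def cp1252_to_utf8(raw: bytes) -> bytes:
    """Decode bytes assumed to be CP1252 and re-encode them as UTF-8."""
    # Assumption: the input is CP1252. Bytes that CP1252 leaves undefined
    # are replaced with U+FFFD rather than raising an error.
    text = raw.decode("cp1252", errors="replace")
    # Point the first charset declaration (if any) at the new encoding,
    # so the file no longer claims to be something it isn't.
    text = re.sub(
        r'(charset\s*=\s*["\']?)[\w-]+',
        r"\g<1>utf-8",
        text,
        count=1,
        flags=re.IGNORECASE,
    )
    return text.encode("utf-8")


def convert_file(src: str, dst: str) -> None:
    """Read src as raw bytes and write a UTF-8 copy to dst (hypothetical paths)."""
    Path(dst).write_bytes(cp1252_to_utf8(Path(src).read_bytes()))
```

Something like `convert_file("book.html", "book-utf8.html")` would then give a UTF-8 copy to feed into the conversion. Whether that actually cures the odd characters depends on the source file being CP1252 in the first place.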

Is there something else I'm missing? Some other step I could be taking to fix this? I've tried both the previous (0.5.something?) and current (0.6.0) versions of Calibre.