MobileRead Forums - View Single Post

pepak · 05-16-2009, 01:21 AM

Quote:

Originally Posted by ahi

My question is, am I better off converting existing eBooks into some editable format (like Project Gutenberg's EPUBs), do my fixing, then convert them back; or is it better to just work straight from the plaintext and make my own eBooks from scratch?

It really depends on the book. Ironically, I have often found that taking the paper book and scanning/OCRing/fixing it myself is less work than taking a "finished" e-book and fixing that. In other cases, starting with plaintext (or stripping all tags from HTML) and fixing that can give quite a satisfactory result.
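For the "strip all tags from HTML" route, here is a minimal sketch in Python, using only the standard library's html.parser. The class name TextExtractor, the function strip_tags and the choice of which tags count as paragraph breaks are my own assumptions for illustration, not something any particular e-book tool prescribes; a real book will likely need that list adjusted.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects the text of an HTML document and drops every tag."""

    # Assumed set of tags that should start a new line in the output.
    BLOCK_TAGS = {"p", "div", "br", "li", "h1", "h2", "h3", "h4", "h5", "h6"}
    # Content of these tags is never book text, so it is skipped entirely.
    SKIP_TAGS = {"script", "style"}

    def __init__(self):
        super().__init__(convert_charrefs=True)  # turns &amp; into & etc.
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP_TAGS:
            self._skip_depth += 1
        elif tag in self.BLOCK_TAGS:
            self.parts.append("\n")

    def handle_endtag(self, tag):
        if tag in self.SKIP_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1
        elif tag in self.BLOCK_TAGS:
            self.parts.append("\n")

    def handle_data(self, data):
        if self._skip_depth == 0:
            self.parts.append(data)


def strip_tags(html):
    """Return the plain text of `html`, one non-empty line per block."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    text = "".join(parser.parts)
    lines = (line.strip() for line in text.splitlines())
    return "\n".join(line for line in lines if line)


if __name__ == "__main__":
    sample = "<p>Hello <b>world</b></p><p>Second &amp; last</p>"
    print(strip_tags(sample))
```

Note that this throws away italics, headings and everything else along with the junk, so it only pays off when the existing markup is bad enough that rebuilding it by hand is cheaper than repairing it.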

In any case, I very much doubt you will find a way to do the fixing fully automatically (by running a script) - at least I haven't yet, and I have been trying ever since I bought the reader last March. The closest I have got is a semi-automatic process: checking each book manually and devising regular expressions tailored to fix its particular quirks.
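To give an idea of what such a semi-automatic process can look like, here is a rough sketch in Python. The rule list BOOK_RULES and the function apply_rules are hypothetical names made up for this example, and the patterns themselves are only typical of what one book might need; the whole point is that the list gets rewritten for every book after looking at the text.

```python
import re

# Hypothetical rules for one particular book. Order matters: each rule
# runs on the output of the previous one.
BOOK_RULES = [
    # Rejoin words that were hyphenated across a line break.
    (re.compile(r"(\w)-\n(\w)"), r"\1\2"),
    # Drop trailing spaces and tabs at the end of a line.
    (re.compile(r"[ \t]+\n"), "\n"),
    # A single newline inside a paragraph becomes a space;
    # blank lines (paragraph breaks) are left alone.
    (re.compile(r"(?<!\n)\n(?!\n)"), " "),
    # Collapse runs of spaces.
    (re.compile(r" {2,}"), " "),
]


def apply_rules(text, rules):
    """Apply each (pattern, replacement) pair in turn.

    Returns the fixed text and a report of how many substitutions each
    pattern made, so suspicious counts can be checked by hand.
    """
    report = []
    for pattern, replacement in rules:
        text, count = pattern.subn(replacement, text)
        report.append((pattern.pattern, count))
    return text, report


if __name__ == "__main__":
    raw = "The quick brown fox jum-\nped over the  \nlazy dog.\n\nNext para."
    fixed, report = apply_rules(raw, BOOK_RULES)
    print(fixed)
    for pattern, count in report:
        print(count, pattern)
```

The report is the "semi" part: the first rule, for instance, will also glue together a genuine compound such as "well-known" if it happens to be split at the hyphen, so an unexpectedly high count is a hint to go back and check the book by eye before trusting the result.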