01-21-2014, 08:17 AM   #3
parkher
Evangelist
parkher ought to be getting tired of karma fortunes by now.
 
Posts: 467
Karma: 369018
Join Date: Nov 2010
Device: BL Alita/Mimas/Ares, OB Note2/Note, KA One/H2O/HD, S PRS T2/T1, PB 902
Quote:
Originally Posted by eBookLuke
Try to use PerfectEpub to clean the OCR errors. It solves all your problems:
http://lukesblog.it/ebooks/ebook-tools/perfectepub/

Luke
Thanks!
It really does all those things that I usually do with regex.
Splitting " ", for example - I do this too

I'm not sure why it is called PerfectEpub, though. It does more than that; it could just as well be called PerfectHTML, and so on.
It fixes the text in OpenOffice, and then you can do whatever you want with it: save it as html, or save it as odt and convert the odt to epub with the Calibre converter, for example.

What is the best strategy to work with it on an epub I already have, though?

Probably: convert the epub to htmlz (with the Calibre converter, for example), unpack the htmlz, and then open the html in OpenOffice. With this approach all the pictures show up in OpenOffice too.
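In case it helps anyone, here is a rough Python sketch of that epub -> htmlz -> unpacked html route. It assumes Calibre's command-line converter, ebook-convert, is installed and on the PATH, and it relies on an htmlz file being an ordinary zip archive. The function names (convert_command, unpack_htmlz, epub_to_html_folder) are my own, made up for illustration; they are not part of Calibre or PerfectEpub.

```python
import shutil
import subprocess
import zipfile
from pathlib import Path


def convert_command(src, dst):
    """Build the Calibre command line.

    ebook-convert infers the input and output formats from the file extensions.
    """
    return ["ebook-convert", str(src), str(dst)]


def unpack_htmlz(htmlz_path, out_dir):
    """Extract an .htmlz archive (a plain zip) and list the files it contained."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(htmlz_path) as archive:
        archive.extractall(out_dir)
    return sorted(
        p.relative_to(out_dir).as_posix() for p in out_dir.rglob("*") if p.is_file()
    )


def epub_to_html_folder(epub_path, out_dir):
    """Convert an epub to htmlz with Calibre, then unpack it into out_dir."""
    if shutil.which("ebook-convert") is None:
        raise RuntimeError("ebook-convert not found on PATH; is Calibre installed?")
    htmlz_path = Path(epub_path).with_suffix(".htmlz")
    subprocess.run(convert_command(epub_path, htmlz_path), check=True)
    return unpack_htmlz(htmlz_path, out_dir)
```

Calling epub_to_html_folder("book.epub", "book_html") should leave the html file (typically index.html) and the images in book_html, ready to open in OpenOffice. The same convert_command helper would cover the way back, e.g. convert_command("book.odt", "book.epub").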

Or do you have a stand-alone version, perhaps a PerfectEpub tool that can be launched from SIGIL with "Open with"?