Quote:
Originally Posted by ittiandro
Do I understand correctly that to solve my problem (controlling the fonts of the EPUB converted books) I need:
1. To use an OCR software to convert the PDF book into HTML format.
2. Once this is done, I'll be able to edit the HTML text by changing the font and other desired parameters to be incorporated in the final EPUB conversion.
3. Reconverting the edited HTML text to its final EPUB format, which will hopefully reflect the editing done at the OCR stage.
It sounds very laborious, also considering that I have quite a few books, but I'll give it a try as soon as I am sure I am going in the right direction, because I am totally new to all these OCR niceties. Do you suggest a particular OCR software?
Thank you for your help and comments
Ittiandro
If you had not guessed already, PDF is a pain to convert.
Not every PDF is made of images. Open the PDF normally, select some text, and click Copy.
Then paste it into a new Notepad file as a final test.
If the text shows up: success! You have a candidate for conversion without OCR.
If nothing pastes: no, you have an image that will need to be OCR'd
OR
consider getting a large-format tablet that displays PDFs well, to avoid the pain for those documents.
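
Since you have quite a few books, the copy-and-paste test can also be roughly automated. What follows is only a sketch, not a tested recipe for your files: it assumes Python with the third-party pypdf package installed (pip install pypdf), and the function names, the 5-page limit and the 20-character threshold are all made up for illustration. It tries to pull text out of the first few pages; if nothing much comes out, the PDF is probably scanned images.

```python
import sys


def has_text_layer(page_texts, min_chars=20):
    """Return True if any page yielded at least min_chars non-whitespace characters.

    min_chars is an arbitrary threshold chosen for this sketch.
    """
    for text in page_texts:
        stripped = "".join((text or "").split())  # drop all whitespace
        if len(stripped) >= min_chars:
            return True
    return False


def extract_page_texts(pdf_path, max_pages=5):
    """Extract text from the first max_pages pages of a PDF using pypdf."""
    # pypdf is a third-party package; it is imported here so the helper
    # above can still be used without it.
    from pypdf import PdfReader

    reader = PdfReader(pdf_path)
    texts = []
    for index, page in enumerate(reader.pages):
        if index >= max_pages:
            break
        texts.append(page.extract_text() or "")
    return texts


if __name__ == "__main__":
    # Usage (script name is hypothetical): python check_pdfs.py book1.pdf book2.pdf
    for path in sys.argv[1:]:
        if has_text_layer(extract_page_texts(path)):
            print(f"{path}: has text, try converting without OCR")
        else:
            print(f"{path}: looks like images, will need OCR")
```

A PDF that passes this check can still convert badly, so treat it as a first sort of the pile and not as a guarantee.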