07-17-2012, 06:58 AM | #1 |
Junior Member
Posts: 2
Karma: 10
Join Date: Jul 2012
Device: Samsung Galaxy S II
|
Problem with EPUB/OCRed PDF and their conversion
If you would rather answer by suggesting other software, you're welcome to.

My issue is that Calibre seems to be aimed exclusively at tablets, cell phones and e-book readers in general, but I usually read at home and on work PCs with large screens, and sometimes I print (not only e-books, but also scanned documents in PDF format). So I sometimes want to convert e-books to A4 size (so they look better on screen and print better), but the "input" and "output" screens of the conversion wizard only have lots of options related to portable devices (like predefined, unchangeable input and output "profiles"). That leads to small pages with big letters, and the books end up with about 10 times the number of pages they should have. Do you have any idea what options I have to overcome this difficulty?

Also, some of the books related to my job need editing. I'm OK with converting e-books from EPUB to RTF, but with scanned books (in PDF) I'm having problems because the PDFs have two layers (according to what I've read): one with the image of each page, and another with the text. When I try to convert from PDF to RTF, only the image is "sent". Do you know why? The "save as text" that every single PDF reader has (including Adobe's) is useless, since it breaks every single line, and that makes it impossible to have the text justified (and that's exactly the problem I'm trying to overcome).

I've tried to find these answers elsewhere, but the Calibre documentation is poor on this. Thanks. |
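For the A4 part of the question, one possible route is Calibre's command-line converter instead of the wizard. The following is only a sketch: it assumes `ebook-convert` is installed and on the PATH, and that its PDF output accepts a `--paper-size` option with `a4` as a value (worth confirming with `ebook-convert --help` on your version). The helper names and file names are hypothetical.

```python
import shutil
import subprocess


def build_a4_command(src, dst):
    """Return the argument list for converting src to dst with A4 pages.

    Assumption: the PDF output of ebook-convert understands --paper-size a4.
    """
    return ["ebook-convert", src, dst, "--paper-size", "a4"]


def convert_to_a4(src, dst):
    """Run the conversion, failing early if ebook-convert is not installed."""
    if shutil.which("ebook-convert") is None:
        raise RuntimeError("ebook-convert was not found on PATH")
    subprocess.run(build_a4_command(src, dst), check=True)


if __name__ == "__main__":
    # Only print the command here; call convert_to_a4() to actually run it.
    print(" ".join(build_a4_command("book.epub", "book.pdf")))
```

Building the command separately from running it makes it easy to inspect before committing to a long conversion.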
07-18-2012, 08:03 AM | #2 |
Member
Posts: 24
Karma: 505048
Join Date: Jul 2012
Device: Samsung Galaxy Mega 5.8, Kobo Mini
|
Have you tried using ABBYY FineReader to "read" the PDF?
PDF is not really a good source for conversion. A lot of mess comes up with PDF as a source: text split at the bottom of the page, missing letters, etc. My solution is to process the PDF through OCR software like ABBYY, then save it as DOCX or RTF. It even retains the formatting if you choose the Exact Copy option. It will also copy the image, but you have to tweak it a little for it to read the WHOLE picture. |
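If the only text available is a "save as text" export with a break after every line, as described in the first post, the lines can be rejoined into paragraphs before pasting into an editor. This is a minimal sketch, not a feature of Calibre or ABBYY. It assumes that blank lines mark paragraph boundaries and that a trailing hyphen followed by a lowercase letter is a word split across lines. The function name `reflow` is hypothetical.

```python
def reflow(text):
    """Join hard-wrapped lines into paragraphs separated by blank lines."""
    paragraphs = []
    current = []
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped:
            # A blank line closes the paragraph collected so far.
            if current:
                paragraphs.append(current)
                current = []
            continue
        current.append(stripped)
    if current:
        paragraphs.append(current)

    joined = []
    for para in paragraphs:
        out = para[0]
        for nxt in para[1:]:
            if out.endswith("-") and nxt[:1].islower():
                # Assumed to be a word hyphenated across a line break.
                out = out[:-1] + nxt
            else:
                out += " " + nxt
        joined.append(out)
    return "\n\n".join(joined)


if __name__ == "__main__":
    sample = "The quick brown\nfox jumps over the la-\nzy dog.\n\nSecond para-\ngraph here."
    print(reflow(sample))
```

The hyphen rule is a heuristic: a genuine compound such as "well-known" that happens to break at the hyphen would be merged into one word, so the result still needs proofreading.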
07-24-2012, 06:38 AM | #3 |
Junior Member
Posts: 2
Karma: 10
Join Date: Jul 2012
Device: Samsung Galaxy S II
|
You helped a lot. I was actually looking for something open source, but when the web doesn't have it, it's better just to solve the problem another way. Thanks for letting me know about these features; I didn't know about them. I may try Acrobat X or, alternatively, ABBYY (which is much more affordable).
|
Tags |
convert, epub, justification, ocr, pdf |