Quote:
Originally Posted by aanno
Dear willus,
thank you for your amazing k2pdfopt. I tried several tools to make PDFs readable on an eBook reader (including calibre and commercial OCR tools like Abbyy) but your tool works best and fastest.
I'm still a bit curious about the technology: How do you achieve the result? I guess there is no way from your PDF result to a 'real' eBook format (like epub)?!?
Kind regards,
aanno
Thank you for the nice feedback. The k2pdfopt home page talks a little about how k2pdfopt works--by analyzing the visual image of each page and looking for rectangular regions (boxes) of text that it can break out into smaller pages. For word wrapping, it then breaks each region into text rows, again using pattern analysis, and then into individual words so that it can re-flow the text if desired. The algorithm is heuristic and does not always work correctly (as I am often told!), but for many "standard" formats that don't have a lot of variation, it works well.
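To give a rough feel for the row/word/re-flow idea, here is a toy sketch in Python. It is not k2pdfopt's actual code: it assumes the page is already a clean black-and-white bitmap (1 = ink, 0 = blank), uses simple projection profiles in place of the real pattern analysis, and every function name and parameter (`find_runs`, `split_rows`, `split_words`, `reflow`, `word_gap`, `space`) is made up for illustration.

```python
def find_runs(profile, min_gap=1):
    """Return (start, end) spans of non-zero entries in a profile.

    Spans separated by fewer than min_gap zero entries are merged,
    so small gaps (e.g. between letters) do not split a word.
    """
    runs, start = [], None
    for i, value in enumerate(profile):
        if value and start is None:
            start = i
        elif not value and start is not None:
            runs.append((start, i))
            start = None
    if start is not None:
        runs.append((start, len(profile)))

    merged = []
    for s, e in runs:
        if merged and s - merged[-1][1] < min_gap:
            merged[-1] = (merged[-1][0], e)
        else:
            merged.append((s, e))
    return merged


def split_rows(page):
    """Find text rows: bands of pixel rows that contain any ink."""
    profile = [sum(row) for row in page]
    return find_runs(profile)


def split_words(page, top, bottom, word_gap=2):
    """Find words in one text row: column spans separated by
    at least word_gap blank columns (an assumed threshold)."""
    width = len(page[0])
    profile = [sum(page[y][x] for y in range(top, bottom)) for x in range(width)]
    return find_runs(profile, min_gap=word_gap)


def reflow(word_widths, max_width, space=1):
    """Greedily pack words (given by pixel width) into lines no wider
    than max_width. Returns lists of word indices, one list per line."""
    lines, current, used = [], [], 0
    for i, w in enumerate(word_widths):
        needed = w if not current else used + space + w
        if current and needed > max_width:
            lines.append(current)
            current, used = [i], w
        else:
            current.append(i)
            used = needed
    if current:
        lines.append(current)
    return lines
```

In use, you would call `split_rows` on the bitmap, call `split_words` on each row, and pass the resulting word widths to `reflow` with the width of the target screen. A real page needs far more care (skew, noise, figures, varying font sizes), which is where the heuristics come in.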
You could try using Office 365 (Word) to read your PDF file--it will directly read PDF files and has good capability to convert scanned PDFs to Word using OCR. You might even try opening the k2pdfopt conversion in Word and see what that looks like--if it's formatted closer to the way you want for an epub. Either way, if you can get your document into Word format, you'll have a lot more capability to convert to epub using Sigil, for example.