02-17-2015, 08:46 PM   #998
willus
Fuzzball, the purple cat
Posts: 1,303
Karma: 11087488
Join Date: Jun 2011
Location: California
Device: iPad
Quote:
Originally Posted by dhdurgee
That should do the trick. As long as it is clear how this works it can be dealt with.

On a totally separate subject, I know that your tool does OCR for the purpose of enabling searches to work on a scanned document. Is it possible now, or would an enhancement make sense, to offer an option to create a mobi or epub document with the OCR text and the graphics/images from the source as opposed to a PDF document? I am aware that no OCR is perfect, so there will certainly be errors in the output, but the ebook format will be a much more flexible fit to a dedicated reader than even the current output of the package.

This might be beyond your design goals for this tool, but I think there might be others besides myself interested in a native format document for a dedicated reader from less than perfect sources. Your tool does well with adapting a scanned PDF document for a dedicated reader. This might be a logical further step along the path.

Dave
I have thought about other output formats off and on, but there are several other tools that will do that kind of thing (calibre, for one), and probably better than I could, so I'm trying to keep k2pdfopt focused on its core functionality. You could use k2pdfopt (or any number of other tools) to OCR the text in a file and then run the OCR'd file through calibre to do the conversion. That route is error-prone, though, and inevitably requires a lot of hand editing. See some of the threads listed in my pdf conversion tips.
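
For anyone who wants to script the calibre half of that, here is a rough Python sketch. It assumes you already have an OCR'd PDF (the name book_ocr.pdf is just a placeholder) and that calibre's ebook-convert command-line tool is on your PATH. The helper names are made up for illustration, and this is not something k2pdfopt does for you.

```python
from __future__ import annotations

import shutil
import subprocess
from pathlib import Path


def build_convert_command(ocr_pdf: str, out_format: str = "epub") -> list[str]:
    """Build the calibre command line for an already-OCR'd PDF.

    Hypothetical helper: only the basic
    `ebook-convert <input> <output>` form is used here.
    """
    src = Path(ocr_pdf)
    # Same file name, new extension (e.g. book_ocr.pdf -> book_ocr.epub).
    dst = src.with_suffix("." + out_format)
    return ["ebook-convert", str(src), str(dst)]


def convert(ocr_pdf: str, out_format: str = "epub") -> Path:
    """Run the conversion and return the output path."""
    if shutil.which("ebook-convert") is None:
        raise RuntimeError("calibre's ebook-convert was not found on PATH")
    cmd = build_convert_command(ocr_pdf, out_format)
    # check=True raises CalledProcessError if calibre reports a failure.
    subprocess.run(cmd, check=True)
    return Path(cmd[2])


if __name__ == "__main__":
    # Show the command that would be run, without invoking calibre.
    print(" ".join(build_convert_command("book_ocr.pdf", "mobi")))
```

Expect to proofread the result by hand afterwards; the conversion only carries over whatever text the OCR step produced, mistakes included.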