I am curious too why it works this way.
Yes, I have read the FAQ, the sticky and some of the related threads, yet I found no clear answer why.
... so I too have a PDF that came out of a scanner.
In Reader I can select the text, and it is mostly correct. So the PDF file contains text as well, not only images.
Yet, calibre outputs a bunch of images, one per page.
So, again, why does calibre not use the "hidden text"?
Yes, I know it's not best to use PDFs... but what to do when the only source is one of them?!