Quote:
Originally Posted by nickq
...
I've read that it works quite well with native pdf's. However, I have a couple of scanned pdf's that I would like to read on my e-book reader. The scans were not very clean, but OCR has been applied, and the text is selectable; but when I try and use k2pdfopt, it processes each page as single image blocks without re-flowing the text, detecting the columns, or doing anything, really, except for resizing the blocks to the size specified.
...
Posting this anyway in case anyone else has the same problem. Would appreciate any suggestions for fine-tuning it though.
For me, using Briss or Pdfscissors beforehand to quickly crop part of the margins (just removing the black areas, without necessarily cutting close to the text itself) usually solves this kind of problem with scanned, not very clean PDFs faster and more easily. It helps most when the margins contain a lot of black shading and the pages have different margin widths and positions because double-page spreads were scanned by hand.
https://www.mobileread.com/forums/sho...&postcount=197
After this quick cropping (treating left and right pages separately in Briss or Pdfscissors), I would use fitwidth (landscape) mode in k2pdfopt for an A5 PDF, reflow mode for a one-column A4 PDF, and 2-column mode for a two-column A4 PDF.
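
In case it helps anyone scripting this, here is a rough Python sketch of the mode choice described above. It only builds the command line and does not run anything. The `-mode` values (`fw`, `def`, `2col`) are my assumption of how k2pdfopt names these modes on the command line, so check them against k2pdfopt's own help output before relying on them; the function name and the file name `cropped.pdf` are made up for illustration.

```python
import shlex

# Map (paper size, column count) to a k2pdfopt -mode value.
# ASSUMPTION: the mode names below are not confirmed by the post;
# verify them against k2pdfopt's built-in help.
MODE_BY_LAYOUT = {
    ("A5", 1): "fw",    # fitwidth (landscape)
    ("A4", 1): "def",   # default reflow
    ("A4", 2): "2col",  # two-column
}


def k2pdfopt_command(pdf_path, paper, columns):
    """Build (but do not run) a k2pdfopt command for an already cropped PDF.

    Hypothetical helper: returns the argument list you could pass to
    subprocess.run once the mode names have been verified.
    """
    key = (paper.upper(), columns)
    if key not in MODE_BY_LAYOUT:
        raise ValueError(f"no mode suggested for {paper} with {columns} column(s)")
    return ["k2pdfopt", "-mode", MODE_BY_LAYOUT[key], pdf_path]


if __name__ == "__main__":
    # Print the command for a two-column A4 scan that was cropped beforehand.
    print(shlex.join(k2pdfopt_command("cropped.pdf", "A4", 2)))
```

Layouts not mentioned above (for example a two-column A5 scan) raise a `ValueError` rather than guessing a mode.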