MobileRead Forums - View Single Post - k2pdfopt: optimizes PDFs for viewing on e-readers

gg4u · 11-19-2018, 09:53 AM

Oh, thank you Willus,

keeping only eng.file will free up some disk space.

Could you suggest how to make the best use of k2pdfopt?

I'd like to reflow a PDF of scanned images into an EPUB containing figures and chapters.

k2pdfopt seems to detect where the images are. I processed the original PDF into an OCRed version, but the characters come out blurred.

For comparison, I tried Ghostscript and Tesseract:
from PDF to TIFF, then from TIFF to TXT.
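
Roughly, that two-step pipeline can be sketched in Python as below. This is only a sketch under assumptions: it expects the `gs` and `tesseract` executables to be on the PATH, and the `tiffgray` device, the 300 dpi resolution, the `page-%03d.tif` naming, and the helper names are illustrative choices, not the exact settings used.

```python
import subprocess
from pathlib import Path


def gs_command(pdf, out_pattern, dpi=300):
    """Build a Ghostscript command that renders each PDF page to a grayscale TIFF."""
    return [
        "gs", "-dNOPAUSE", "-dBATCH",
        "-sDEVICE=tiffgray",             # assumed device; other tiff devices exist
        f"-r{dpi}",                      # assumed rendering resolution
        f"-sOutputFile={out_pattern}",   # %03d expands to the page number
        str(pdf),
    ]


def tesseract_command(tiff, lang="eng"):
    """Build a Tesseract command; it writes <output base>.txt next to the TIFF."""
    tiff = Path(tiff)
    return ["tesseract", str(tiff), str(tiff.with_suffix("")), "-l", lang]


def run_pipeline(pdf, workdir):
    """PDF -> one TIFF per page -> one TXT per page (figures are not preserved)."""
    workdir = Path(workdir)
    workdir.mkdir(parents=True, exist_ok=True)
    subprocess.run(gs_command(pdf, workdir / "page-%03d.tif"), check=True)
    for tiff in sorted(workdir.glob("page-*.tif")):
        subprocess.run(tesseract_command(tiff), check=True)
```

Calling `run_pipeline("book.pdf", "out")` (hypothetical file names) would leave one `.txt` per page in `out/`, which is plain text only.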

Here the results were quite good, but I lose all the figures and the chapter markup.

As the final result for the written text, I would like an EPUB or MOBI (sharp rendering of characters), not a PDF, but still with the figures and a TOC.

Is there maybe another format besides TXT that Tesseract can export to and that will keep the images (RTF)?

I could eventually mark the TOC manually. Which markup is correct?

What steps should I take to convert a PDF into an EPUB containing images and markup?

I also shared this thread: https://www.mobileread.com/forums/sh...d.php?t=312652

Can I also ask how you approached the problem of detecting figures in a PDF? I'm interested in the problem solving.

11-19-2018, 09:53 AM	#1622