Quote:
Originally Posted by willus
I have no recommendation for the first file. It is so random--every page seems to be formatted differently.
|
I just found a better version of 'Die Geschichte der Berliner Arbeiterbewegung. Teil 2'.
And this is the version I had worked with; I just couldn't find the link to it.
Now I found the link in my jpdownloader.
https://archive.org/download/bub_gb_...knAQAAIAAJ.pdf
I will now test your settings.
My procedure is to clean the files in Foxit PhantomPDF or Adobe Acrobat (cropping is very fast in Acrobat). To delete headers or footers that are too close to the text, I use the Foxit 'Comment rectangle' function. Before processing with k2pdfopt, I also delete all pages (cover, title page, table of contents, dedication, copyright, index) except for the text pages, notes, footnotes, and bibliography. That is necessary to get better results. After the k2pdfopt run I add the cover, title page, dedication, and copyright pages back. I create the TOC manually in Foxit.
OCR: Is it better to do OCR with k2pdfopt, or to do it beforehand with Foxit? I think for Fraktur (Frakturschrift) I should do it with k2pdfopt and the Tesseract trained data, so that is what I did.
So many questions. Why?
Because I want to know whether cleaning up with Foxit and cropping with Acrobat unnecessarily inflates the file, and how to get smaller files.
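One way I could check this myself is to compare the file size after each step. Below is a minimal Python sketch of that idea. The function name `size_report` and the file names in the usage note are only placeholders I made up, and it measures nothing but the size on disk, so it cannot say why a file grew.

```python
import os


def size_report(original_path, processed_path):
    """Compare the on-disk size of two files.

    Returns (original_bytes, processed_bytes, ratio), where ratio is
    processed / original. A ratio above 1.0 means the processed file
    is larger than the original.
    """
    original_bytes = os.path.getsize(original_path)
    processed_bytes = os.path.getsize(processed_path)
    # Guard against an empty original file to avoid dividing by zero.
    if original_bytes == 0:
        ratio = float("inf")
    else:
        ratio = processed_bytes / original_bytes
    return original_bytes, processed_bytes, ratio
```

For example, `size_report("book_original.pdf", "book_cleaned.pdf")` (hypothetical file names) would show whether the cleaned file came out larger than the download, and the same call could be repeated for the k2pdfopt output.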
Thanks a lot for helping me find better solutions.