10-14-2012, 01:19 PM   #197
markom
Banned
Quote:
Originally Posted by ectoplasm
This is actually pretty sweet for automatically cropping text based PDF page margins. This is the first tool I found that does this automatically. If there are others, please comment. I'm not interested in the programs where you have to select a region by hand.
But if the PDF is a scanned image with a text layer behind it, we should be very much interested in them, because often such a PDF has to be cropped first in Briss, PdfScissors, A-PDF Page Crop, etc., and only then run through soPdf or k2pdfopt to get a much better result.

So for an image PDF it is a 2- or 3-step process.

1. Quick OCR with ABBYY, Acrobat, etc.; there is usually no need for a high-quality OCR layer behind the image.
2. Rough cropping in Briss, eliminating headers/footers if needed (soPdf removes headers/footers such as page numbers automatically).
3. Final cropping in soPdf or k2pdfopt.

Often k2pdfopt is enough on its own (i.e. a 1-step process), though, even for a pure image PDF (not OCR-ed).

With soPdf the OCR layer stays in place after cropping and the PDF stays about the same size, i.e. there is no rasterization involved, which is what makes the PDF bigger with k2pdfopt.

Example:

1st picture is the original: 8 pages of a scanned PDF, OCR-ed.
2nd picture is the original cropped in Briss (just roughly, i.e. not getting very close to the text proper, but with the headers cropped off).
3rd picture is the original cropped in Briss and then cropped additionally in soPdf (fit to height).
4th picture is the original cropped in soPdf directly.

[Attached pictures: 1 2 3 4]

As we can see, when applied directly soPdf did not crop the left margins on two of the pages (4th picture), whereas after the rough crop in Briss soPdf cropped those two margins correctly, and Briss had also eliminated the headers/footers.

Briss and soPdf (or k2pdfopt) are complementary. In Briss there are usually pages that stick out (by half an inch or an inch from the stacked majority of odd or even pages), and we can freely include them all in the crop region if we are going to run soPdf or k2pdfopt after Briss for the very precise cropping.
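
To make the "rough crop first, precise crop second" idea concrete, here is a small Python sketch. It is only an illustration: the page boxes are made-up numbers, the function names `union_box` and `tight_boxes` are hypothetical, and this is not how Briss, soPdf or k2pdfopt are actually implemented. It just shows why one shared crop region has to be loose enough for the page that sticks out, and why a second per-page pass can then tighten every page individually.

```python
from typing import List, Tuple

# (left, bottom, right, top) of the text on a page, in PDF points (72 pt = 1 inch)
Box = Tuple[float, float, float, float]


def union_box(boxes: List[Box]) -> Box:
    """One shared crop region covering the content of every stacked page."""
    lefts, bottoms, rights, tops = zip(*boxes)
    return (min(lefts), min(bottoms), max(rights), max(tops))


def tight_boxes(boxes: List[Box], rough: Box, pad: float = 0.0) -> List[Box]:
    """Per-page crop: each page's own content box plus padding,
    never extending beyond the rough crop made in the first pass."""
    l0, b0, r0, t0 = rough
    return [
        (max(l - pad, l0), max(b - pad, b0), min(r + pad, r0), min(t + pad, t0))
        for (l, b, r, t) in boxes
    ]


# Hypothetical content boxes for four odd pages on a 612 x 792 pt sheet.
pages: List[Box] = [
    (90.0, 80.0, 520.0, 700.0),
    (92.0, 80.0, 522.0, 700.0),
    (54.0, 80.0, 484.0, 700.0),  # sticks out half an inch (36 pt) to the left
    (90.0, 82.0, 520.0, 702.0),
]

rough = union_box(pages)             # first pass: one loose region for all pages
precise = tight_boxes(pages, rough)  # second pass: individual crop per page

print("rough crop:", rough)
for number, box in enumerate(precise, start=1):
    print("page", number, "precise crop:", box)
```

The shared region has to start at 54 pt on the left because of the one page that sticks out, so the other pages keep about half an inch of extra margin after the first pass. The per-page pass is what removes it.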

Last edited by markom; 09-05-2014 at 09:42 PM.