10-18-2016, 11:33 PM   #1308
willus
Fuzzball, the purple cat
Quote:
Originally Posted by mauricebis
Hello,

I'm trying to use k2pdfopt. It did an almost perfect job on a few sample pages, but I don't understand why one block in a column is not recognised when all the columns are similar. The source document is a scan, and I used the command: k2pdfopt age.pdf -ui- -w 560 -h 735 -dpi 150 -as -col 2 -ac -sm -o k2try.pdf. As can be seen in the attached result, the first lines of the 1st column of the second page are not recognised as a block. Did I miss an option?

Thanks for your help.
My guess is that the gray pixels left by scanning the page (see the circled regions in the attached image) are preventing the line detection. Tweaking the -gtr option might help, but for books like this I think you're better off running k2pdfopt in two passes. First, convert the PDF from two book pages per page to one book page per page using the -cbox option (two crop boxes per page). Something like this:

Code:
k2pdfopt -mode crop -cbox 1.137in,0.3018in,4.427in,7.827in -cbox 5.735in,0.3119in,4.336in,7.727in source.pdf -o intermediate.pdf
You may have to adjust the crop boxes depending on how consistent the scanned pages are. With the pages split this way, auto-straighten and auto-contrast adjustment will work better than if you rely on the -ac option to auto-crop. Then process the intermediate output like so:

Code:
k2pdfopt -ui- -w 560 -h 735 -dpi 150 -as intermediate.pdf -o final.pdf
I wasn't able to test this since I do not have your source file, but I think it will work better than what you're doing.
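If the crop boxes need several rounds of adjustment, it may be convenient to wrap the two passes in a small shell function. This is only a sketch: the names two_pass, LEFT_BOX, RIGHT_BOX and RUN are made up for this example, and the k2pdfopt options are exactly the ones from the two commands above. By default it only prints the commands so they can be checked before anything runs.

```shell
#!/bin/sh
# Sketch only: two_pass, LEFT_BOX, RIGHT_BOX and RUN are hypothetical names,
# not part of k2pdfopt. The k2pdfopt options are copied from the post.

# Crop boxes for the left and right book page of each scanned spread.
# Adjust these to match your own scan.
LEFT_BOX="1.137in,0.3018in,4.427in,7.827in"
RIGHT_BOX="5.735in,0.3119in,4.336in,7.727in"

# two_pass SOURCE INTERMEDIATE FINAL
# With RUN unset, the commands are only printed (dry run).
# With RUN set to an empty string, they are actually executed.
two_pass() {
    src=$1
    mid=$2
    out=$3
    # Pass 1: split each scanned spread into two single book pages.
    ${RUN-echo} k2pdfopt -mode crop -cbox "$LEFT_BOX" -cbox "$RIGHT_BOX" "$src" -o "$mid" || return 1
    # Pass 2: convert the single-page intermediate file for the device.
    ${RUN-echo} k2pdfopt -ui- -w 560 -h 735 -dpi 150 -as "$mid" -o "$out"
}
```

Calling `two_pass source.pdf intermediate.pdf final.pdf` prints both commands; `RUN= two_pass source.pdf intermediate.pdf final.pdf` runs them, assuming k2pdfopt is on the PATH.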
Attached Thumbnails
Name: column.png
Last edited by willus; 10-18-2016 at 11:34 PM. Reason: Forgot attachment