Old 01-25-2023, 10:53 PM   #1993
willus
Fuzzball, the purple cat
Posts: 1,303
Karma: 11087488
Join Date: Jun 2011
Location: California
Device: iPad
Quote:
Originally Posted by taddymack
When in native mode, is it possible to crop both the bitmaps and the OCR text layer?

As with this command:
k2pdfopt -mb 0.4 -bp -n -wrap- -col 1 -vb -2 -t -ls- mymodelfile.pdf

I also want to get rid of the pagination in the text layer. The original OCR gives me better text than the -ocr option in k2pdfopt, so I would like to preserve it.

If not, I will try to do it when outputting the text with the pdftotext program and its x,y,W,H crop options.
Since it's already bitmapped, try turning off native mode output (don't use -n). If you don't like that result, try adding -ppgs to the command line as suggested on my FAQ page (search for -ppgs). Let me know if any of this works. If you'd like to PM me the source PDF I can try some more options.
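
As a rough sketch of the two attempts suggested above, using the file name and options from the quoted command: the first drops -n, and the second adds -ppgs. Where -ppgs goes is an assumption here (it is added to the original command with -n kept); check the FAQ entry for the intended usage. The variable names are only for illustration, and the script prints the commands rather than running them.

```shell
# Attempt 1: the quoted command with -n removed (non-native output).
CMD_BITMAP="k2pdfopt -mb 0.4 -bp -wrap- -col 1 -vb -2 -t -ls- mymodelfile.pdf"

# Attempt 2 (assumed reading): the quoted command with -ppgs added.
CMD_PPGS="k2pdfopt -mb 0.4 -bp -n -ppgs -wrap- -col 1 -vb -2 -t -ls- mymodelfile.pdf"

# Dry run: print each command so it can be reviewed before running it.
echo "$CMD_BITMAP"
echo "$CMD_PPGS"
```

To actually run one of them, paste the printed line into a terminal in the folder that contains mymodelfile.pdf.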

Last edited by willus; 01-25-2023 at 10:59 PM.