01-07-2015, 09:49 PM   #967
willus
Quote:
Originally Posted by Psyny
One day, a command to let k2pdfopt ignore areas without OCR/text markings when defining columns would be great.

Something like:
-corc[+] [i|t] <inches>

Where:
<inches> : max OCR/markings distance over which k2pdfopt will look for more text when defining a column.
+ : allow processing of areas without OCR/markings
i : include in the column the area around markings defined by <inches>
t : do not include in the column the areas around markings defined by <inches>

I know it's too much, just an idea.

It's not too much. I'm just not sure I entirely understand what you want this option to do, but I gather that it involves using the OCR layer to detect the columns. It would also be nice if I could figure out a way to ignore background graphics. There is probably a way to do that using the MuPDF API that isn't too difficult.
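
To make the idea concrete, here is a rough Python sketch of how I read the request. It is not k2pdfopt code and not the proposed option itself. The function name `detect_columns`, its parameters, and the box format `(x0, y0, x1, y1)` in inches are all made up for illustration. It assumes the word boxes from the OCR layer are already available, and it simply merges boxes whose horizontal gap is no larger than the `<inches>` value.

```python
def detect_columns(word_boxes, max_gap_in, include_margin=False):
    """Group OCR word boxes into column extents (hypothetical helper).

    word_boxes     -- iterable of (x0, y0, x1, y1) tuples, in inches (assumed format)
    max_gap_in     -- largest horizontal gap that still joins two boxes
    include_margin -- if True, widen each column by half of max_gap_in per side
    Returns a list of (left, right) column extents, left to right.
    """
    # Only the horizontal extents matter for this simple sketch.
    spans = sorted((x0, x1) for x0, _y0, x1, _y1 in word_boxes)

    columns = []
    for x0, x1 in spans:
        if columns and x0 - columns[-1][1] <= max_gap_in:
            # Close enough to the current column: extend it.
            columns[-1][1] = max(columns[-1][1], x1)
        else:
            # Gap is too wide (or this is the first box): start a new column.
            columns.append([x0, x1])

    if include_margin:
        # Half the gap per side, so neighbouring columns cannot overlap.
        pad = max_gap_in / 2.0
        columns = [[max(0.0, left - pad), right + pad] for left, right in columns]

    return [tuple(c) for c in columns]


if __name__ == "__main__":
    boxes = [
        (0.5, 1.0, 1.0, 1.2),
        (1.1, 1.0, 2.0, 1.2),
        (3.0, 1.0, 3.5, 1.2),
        (3.6, 2.0, 4.0, 2.2),
    ]
    print(detect_columns(boxes, 0.25))
```

Areas with no word boxes never end up in a column here, which is roughly the "ignore areas without OCR" part. The `include_margin` switch is my guess at what the i/t flags would mean.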