Quote:
Originally Posted by Psyny
One day, a command to let k2pdfopt ignore areas without OCR/text markings when defining columns would be great.
Something like:
-corc[+] [i|t] <inches>
Where:
<inches> : max distance k2pdfopt will search from one OCR/marking to another piece of text when defining a column.
+ : allow processing of areas without OCR/markings
i : include in the column the area around the markings defined by <inches>
t : do not include in the column the area around the markings defined by <inches>
I know it's too much, just an idea.
It's not too much. I'm just not sure I entirely understand what you want this option to do, but I gather that it involves using the OCR layer to detect the columns. I suppose it would also be nice if I could figure out a way to ignore background graphics. There is probably a way to do that using the MuPDF API that isn't too difficult.
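
One possible reading of the suggestion, as a minimal sketch in plain Python (this is not k2pdfopt code and does not use the MuPDF API): take the bounding boxes of the OCR/text words on a page and merge them into column ranges whenever the horizontal gap between them is no more than <inches>. The function name `group_columns`, the `include_margin` flag (standing in for the proposed i/t switch), and the box format are all hypothetical, chosen only for illustration.

```python
from typing import List, Tuple

# A text box as (x0, y0, x1, y1) in inches. Hypothetical format for this sketch.
Box = Tuple[float, float, float, float]


def group_columns(boxes: List[Box], max_gap: float,
                  include_margin: bool = False) -> List[Tuple[float, float]]:
    """Return column x-ranges built from text boxes.

    Boxes whose horizontal gap is <= max_gap end up in the same column.
    With include_margin=True, each column is widened by up to max_gap on
    each side, without crossing the midpoint to a neighbouring column.
    """
    if not boxes:
        return []

    # Only the horizontal extent matters for column detection here.
    spans = sorted((b[0], b[2]) for b in boxes)

    columns = [list(spans[0])]
    for x0, x1 in spans[1:]:
        if x0 - columns[-1][1] <= max_gap:
            # Close enough: extend the current column.
            columns[-1][1] = max(columns[-1][1], x1)
        else:
            # Gap is too wide: start a new column.
            columns.append([x0, x1])

    if not include_margin:
        return [(x0, x1) for x0, x1 in columns]

    padded = []
    last = len(columns) - 1
    for i, (x0, x1) in enumerate(columns):
        left = max_gap if i == 0 else min(max_gap, (x0 - columns[i - 1][1]) / 2)
        right = max_gap if i == last else min(max_gap, (columns[i + 1][0] - x1) / 2)
        padded.append((max(0.0, x0 - left), x1 + right))
    return padded


if __name__ == "__main__":
    words = [(0.5, 1.0, 3.0, 1.2), (0.6, 1.3, 3.1, 1.5),
             (4.0, 1.0, 6.5, 1.2), (4.1, 1.3, 6.4, 1.5)]
    print(group_columns(words, max_gap=0.25))  # [(0.5, 3.1), (4.0, 6.5)]
```

With a larger `max_gap` (for example 1.0) the same four boxes collapse into a single column, which is the kind of tuning the <inches> value would presumably control. Areas with no boxes at all never produce a column in this sketch, which roughly corresponds to leaving the proposed + switch off.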