01-25-2023, 06:16 PM   #9
Tex2002ans
Quote:
Originally Posted by tomsem
The ABBYY OCR that comes with CZUR is much better than Acrobat's. [...] But at least with default options, the PDF it created had pages of varying sizes even though the images all had the same dimensions.
Are you talking about varying page sizes in FineReader? Or in Adobe? Or what?

This is partially why I recommend Scan Tailor Advanced as a preprocessing step.

Scan Tailor Advanced will take care of normalizing all page sizes, etc.

In FineReader, cropping to page sizes "exists", but it's clunky.

And if your page is SHORT by a little bit, it's not easy to add the required whitespace to make all images the same size.

This is why it's always helpful to split things into intermediate stages... not necessarily trusting an "all-in-one, just let me press the button" tool/program.
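
To make the "page is SHORT by a little bit" case concrete, here is a rough Python sketch of padding a batch of page images out to one common size. It is only an illustration, not how Scan Tailor Advanced or FineReader do it internally. The function name `pad_to_common_size`, the list-of-rows image representation, and the choice to center the content on a white canvas are all assumptions made for the example.

```python
def pad_to_common_size(pages, fill=255):
    """Pad every page to the largest width/height found in the batch.

    Each page is a list of rows of grayscale values (hypothetical
    representation chosen to keep the sketch dependency-free).
    The original content is centered; 255 = white padding.
    """
    target_h = max(len(page) for page in pages)
    target_w = max(len(page[0]) for page in pages)

    padded = []
    for page in pages:
        h, w = len(page), len(page[0])
        top = (target_h - h) // 2    # whitespace rows added above
        left = (target_w - w) // 2   # whitespace columns added on the left
        canvas = [[fill] * target_w for _ in range(target_h)]
        for y, row in enumerate(page):
            canvas[top + y][left:left + w] = row
        padded.append(canvas)
    return padded


# Two "pages": one 2x2, one 1x3. Both come out as 2 rows x 3 columns.
pages = [[[0, 0], [0, 0]], [[0, 0, 0]]]
result = pad_to_common_size(pages)
```

In a real workflow you would do the same thing on actual image files with an imaging library, but the arithmetic (find the largest page, pad the rest) is the whole idea.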

Quote:
Originally Posted by tomsem
Removing page curl seems to require some way of determining 3D. The Fujitsu scanner has stereo 'vision', and CZUR has lasers that draw lines across the material and they use that to determine the curl.
I guess if you go high-tech/advanced, lasers would help in the dewarping calculations.

Scan Tailor Advanced just works based on the curve of the edges of the page:

https://github.com/4lex4/scantailor-...inal-dewarping

It works quite well for most of what I tested on.
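
For a feel of what "based on the curve of the edges" can mean, here is a heavily simplified Python sketch. It is NOT taken from Scan Tailor Advanced's code, and the real dewarping is more involved. The assumption here is that you already know, for every column, where the page's top and bottom edges sit (`top_edge`, `bottom_edge`, both hypothetical inputs), and each column is simply stretched so those edges become straight lines.

```python
def flatten_by_edges(image, top_edge, bottom_edge, out_height):
    """Straighten a page by resampling each column between its edges.

    image       : list of rows of grayscale values
    top_edge    : y position of the page's top edge at each column
    bottom_edge : y position of the page's bottom edge at each column
    out_height  : height of the flattened output

    Assumption: only vertical displacement is corrected, using
    nearest-neighbor sampling. This is a toy model of edge-based
    dewarping, not a production algorithm.
    """
    width = len(image[0])
    out = [[255] * width for _ in range(out_height)]
    for x in range(width):
        top, bottom = top_edge[x], bottom_edge[x]
        for y in range(out_height):
            # t runs from 0 (top edge) to 1 (bottom edge)
            t = y / (out_height - 1) if out_height > 1 else 0.0
            src_y = round(top + t * (bottom - top))
            src_y = min(max(src_y, 0), len(image) - 1)  # stay inside the image
            out[y][x] = image[src_y][x]
    return out


# The second column sits one pixel lower than the first (a tiny "curl").
image = [
    [10, 255],
    [20, 10],
    [30, 20],
    [255, 30],
]
flat = flatten_by_edges(image, top_edge=[0, 1], bottom_edge=[2, 3], out_height=3)
```

After flattening, both columns line up row for row. Detecting the edge curves in the first place is the hard part, and the sketch skips that entirely.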

I guess the laser would do it more accurately + more automatically, but the image-based fixes work well for most books I've processed.

(Plus, I work from already-scanned/-photographed stuff. I'm not the one actually scanning things in.)

Quote:
Originally Posted by tomsem
This apparently is beyond scope for DIY, which is why the effort to flatten pages is necessary.
The flatter the better. Even with all those newfangled lasers.

Getting it right in the image itself will ALWAYS be better than relying on mathematical dewarping!

Last edited by Tex2002ans; 01-25-2023 at 06:20 PM.