View Single Post
Old 04-27-2016, 06:10 PM   #11
theducks
Well trained by Cats
theducks ought to be getting tired of karma fortunes by now.theducks ought to be getting tired of karma fortunes by now.theducks ought to be getting tired of karma fortunes by now.theducks ought to be getting tired of karma fortunes by now.theducks ought to be getting tired of karma fortunes by now.theducks ought to be getting tired of karma fortunes by now.theducks ought to be getting tired of karma fortunes by now.theducks ought to be getting tired of karma fortunes by now.theducks ought to be getting tired of karma fortunes by now.theducks ought to be getting tired of karma fortunes by now.theducks ought to be getting tired of karma fortunes by now.
 
theducks's Avatar
 
Posts: 31,164
Karma: 60406498
Join Date: Aug 2009
Location: The Central Coast of California
Device: Kobo Libra2,Kobo Aura2v1, K4NT(Fixed: New Bat.), Galaxy Tab A
Quote:
Originally Posted by Tex2002ans View Post
Side Note: Hmmmmm... I have been writing a Scan Tailor tutorial. Maybe I could toss in some semi-related extra pre/postprocessing in the tutorial.

Depending on how much time you waste on having to clean up the headers/footers in the OCR, perhaps it might be best to preprocess those images (with Scan Tailor), and then crop the headers/footers right out, so that the OCR program can just focus on the body text:

Original Scan: Attachment 148279
Scan Tailor: Attachment 148280
Cropping: Attachment 148281

2 column source... I luckily rarely come across that either. Although I would probably do something similar (come up with Imagemagick way to split the pages in half). I may be contacting you via PM for some examples soon (or you could always contact me).
Old Analog Magazines are fun . It is almost always Magazines with the 2col prob
theducks is offline   Reply With Quote