Quote:
Originally Posted by Tex2002ans
Side Note: Hmmmmm... I have been writing a Scan Tailor tutorial. Maybe I could toss in some semi-related extra pre/postprocessing in the tutorial.
Depending on how much time you waste on having to clean up the headers/footers in the OCR, perhaps it might be best to preprocess those images (with Scan Tailor), and then crop the headers/footers right out, so that the OCR program can just focus on the body text:
Original Scan: Attachment 148279
Scan Tailor: Attachment 148280
Cropping: Attachment 148281
2 column source... I luckily rarely come across that either. Although I would probably do something similar (come up with an ImageMagick way to split the pages in half). I may be contacting you via PM for some examples soon (or you could always contact me).
Old Analog Magazines are fun.

It is almost always magazines that have the 2-column problem.
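
For what it's worth, here is a rough sketch of the arithmetic behind both ideas from the quote (cropping the headers/footers off, then splitting a 2-column page in half). It only builds the WxH+X+Y geometry strings that ImageMagick's -crop option takes; it does not touch any images. The function names, the pixel margins and the gutter width are all made up for illustration, and real scans would need values measured from the actual pages.

```python
def body_crop(width, height, header_px, footer_px):
    """Geometry string (WxH+X+Y) that keeps only the body text region.

    header_px and footer_px are hypothetical margins, measured by hand.
    """
    body_height = height - header_px - footer_px
    if body_height <= 0:
        raise ValueError("header and footer margins cover the whole page")
    return f"{width}x{body_height}+0+{header_px}"


def column_crops(width, height, gutter_px=0):
    """Two geometry strings, left column then right column.

    Assumes the gutter between the columns starts right after the left
    column; gutter_px is a hypothetical width of that blank strip.
    """
    left_width = (width - gutter_px) // 2
    right_x = left_width + gutter_px
    right_width = width - right_x
    return (
        f"{left_width}x{height}+0+0",
        f"{right_width}x{height}+{right_x}+0",
    )


# Example: a 2480x3508 scan with a 200 px header and a 150 px footer
print(body_crop(2480, 3508, 200, 150))
# The cropped body is 3158 px tall; split it around an 80 px gutter
print(column_crops(2480, 3158, 80))
```

Each string could then be passed to something like "convert page.png -crop GEOMETRY +repage out.png" ("magick" instead of "convert" on ImageMagick 7). Pages that are skewed or have an off-centre gutter would need deskewing in Scan Tailor first, or per-page numbers.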