Old 11-26-2010, 01:36 PM   #31
aidren
Edge User
 
Quote:
I think it has to do with the OCR used by the person who originally scanned the work
I agree. In these days of 'make it convenient' software, most people use auto settings for almost everything. So it begins with poor image scans, which make the OCR step more difficult, and then moves on to automated OCR. I won't get into all the details, but most OCR programs are just not that good, and very few can handle more than one language in a single document. I've even tried ReadIris Pro, which is supposed to be able to combine two languages and has a training feature. I was trying to OCR an old book that had a fair amount of classical Greek. I ended up abandoning it after two days of persistence. It kept confusing some of the Greek letters with English ones, resulting in numerous crashes and lost work. I may look into it again in the future, but I just don't have the time right now.