Old 08-12-2009, 11:43 AM   #73
Ea
Wizard
Posts: 3,490
Karma: 5239563
Join Date: Jan 2008
Location: Denmark
Device: Kindle 3|iPad air|iPhone 4S
Quote:
Originally Posted by corroonb
A trick I've found with OCR errors is to identify the consistent errors and then look for other affected words that a spell check might not pick up. Obviously this only works well if the error occurs every time, as you would expect of an automated process.

For example, I had an OCR text that had replaced every cl at the start of a word with d. It was easy to find words like dothes and doset with a spell checker and do a global replace, but I had to use a dictionary to search for every word that makes sense with either cl or d at the front. And you can't use a global replace with dean/clean or dosed/closed, as the context has to be checked.

Apologies if this is obvious.
It's a good idea. I've sort of been doing this already, but that's not the same as being completely conscious about it, and I've never thought to use a dictionary to help.
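
For anyone who wants to automate the dictionary lookup, here is a rough Python sketch of how I imagine it could work for the cl/d case. The function name, the tiny word set and the sample sentence are all made up for illustration; in practice you would load a proper word list. It only sorts words into "safe to replace globally" and "needs a look at the context", it does not fix anything by itself.

```python
import re

def classify_ocr_words(text, dictionary, wrong="d", right="cl"):
    """Sort words starting with `wrong` into safe fixes and ambiguous ones.

    safe:      the word is not in the dictionary, but the corrected form is
               (dothes -> clothes), so a global replace should be fine.
    ambiguous: both forms are real words (dean/clean), so the context
               has to be checked by hand.
    """
    safe, ambiguous = {}, {}
    for word in sorted(set(re.findall(r"[a-z]+", text.lower()))):
        if not word.startswith(wrong):
            continue
        candidate = right + word[len(wrong):]
        if candidate not in dictionary:
            continue  # swapping the prefix does not give a real word
        if word in dictionary:
            ambiguous[word] = candidate
        else:
            safe[word] = candidate
    return safe, ambiguous

# Hypothetical mini dictionary and sample text, for illustration only.
words = {"clothes", "closet", "clean", "dean", "closed", "dosed", "dog", "the"}
sample = "The dean put the dothes in the doset and dosed the dog."

safe, ambiguous = classify_ocr_words(sample, words)
print("global replace:", safe)
print("check context:", ambiguous)
```

The spell checker already catches the first group; the second group is the one you would otherwise have to hunt for by hand.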