Old 12-19-2009, 04:51 AM   #3
Mike L
Wizard
Ficbot,

I would've thought that OCR software works in exactly the same way, regardless of the language. It looks at each character separately and tries to determine which letter or symbol it represents. It doesn't know anything about words or sentences or meanings. It just converts shapes to letters and other symbols.

So the fact that the book was partly in French and partly in English is probably irrelevant. More likely, either the software is poor or the original printed pages are difficult to read for some reason.

To determine which part of the system isn't working properly, try eliminating each variable in turn. Start by scanning an image. Does the result look like the original? If so, the scanner itself is probably OK. Next, try scanning a simple page of text in a single clear font. If the OCR fails to convert it, then it's the software that's at fault.
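
For the plain-text test, it may help to put a rough number on "fails to convert". Here is a small Python sketch, assuming you type in the text of the test page by hand and paste the OCR output alongside it. The function name, the sample strings and the 0.95 cut-off are all made up for illustration; they don't come from any particular OCR package.

```python
from difflib import SequenceMatcher


def ocr_similarity(expected: str, ocr_output: str) -> float:
    """Return a similarity ratio between 0 and 1 for the known text and the OCR result."""
    # Collapse runs of whitespace so that differences in line wrapping
    # are not counted as recognition errors.
    a = " ".join(expected.split())
    b = " ".join(ocr_output.split())
    return SequenceMatcher(None, a, b).ratio()


# Hypothetical example: the text you know is on the page, and what the OCR produced.
known = "The quick brown fox jumps over the lazy dog."
scanned = "The quick brovvn fox jumps ovcr the lazy d0g."

score = ocr_similarity(known, scanned)
print(f"similarity: {score:.2f}")

# The 0.95 threshold is an arbitrary assumption; pick whatever suits you.
if score >= 0.95:
    print("OCR copes with clean text")
else:
    print("OCR is struggling even with clean text")
```

If the score is poor even on a clean, single-font page, that points at the software rather than at the book.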

Finally, if you can get access to a different type of scanner, test it with the English/French book that was causing the problem. If the results are still bad, that suggests that the problem lies in the quality of the printed page, or perhaps in the fonts.
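
If it helps to see the elimination steps laid out as logic, here is a rough Python sketch. The function and its three yes/no inputs are my own invention; you would fill them in by eye after running each test. Note that the steps above don't say what to conclude if the second scanner gives good results, so the sketch just reports that case as inconclusive.

```python
def diagnose(image_scan_ok: bool, plain_text_ocr_ok: bool, other_scanner_ok: bool) -> str:
    """Apply the three checks in order and name the most likely culprit."""
    # Step 1: a scanned image that doesn't look like the original points at the scanner.
    if not image_scan_ok:
        return "scanner"
    # Step 2: a clean, single-font page that still fails points at the OCR software.
    if not plain_text_ocr_ok:
        return "OCR software"
    # Step 3: the problem book still fails on a different scanner.
    if not other_scanner_ok:
        return "printed page or fonts"
    # Not covered by the steps above: everything passed on the second scanner.
    return "inconclusive"


# Hypothetical outcome: the image scan and the clean page were fine,
# but the book still came out badly on a second scanner.
print(diagnose(image_scan_ok=True, plain_text_ocr_ok=True, other_scanner_ok=False))
```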

I hope you manage to find a solution.