01-28-2011, 09:42 AM   #5
ldolse
Wizard
There is no OCR phase in Calibre, but some of the source documents people use are RTF, TXT, or HTML files generated directly by OCR conversion software. Depending on the quality of the OCR software, those files can have a variety of issues.

I've actually been scanning some favorite paper books lately that aren't available electronically, and I think I'm going to add a special Heuristics function just for cleaning up ABBYY-generated HTML. It's not fun going through it by hand, that's for sure.
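
To give a rough idea of the kind of cleanup I mean, here is a minimal sketch in Python. It is not Calibre's actual Heuristics code, and the function name clean_ocr_text and both patterns are made up for illustration. It assumes plain text where a blank line marks a real paragraph break, and it only handles two typical OCR leftovers: words hyphenated across a line end, and hard line breaks inside a paragraph. Real ABBYY HTML would also need its tags dealt with.

```python
import re

# Hypothetical helper for illustration; not part of Calibre.
# A word split by a hyphen at a line end: "conver-\nsion" -> "conversion".
_HYPHEN_BREAK = re.compile(r"([a-z])-\n([a-z])")
# A single newline inside a paragraph. Blank lines (two newlines in a row)
# are assumed to be real paragraph breaks and are left alone.
_SOFT_BREAK = re.compile(r"(?<!\n)\n(?!\n)")


def clean_ocr_text(text: str) -> str:
    """Rejoin hyphenated words, then unwrap hard-wrapped lines."""
    text = _HYPHEN_BREAK.sub(r"\1\2", text)
    text = _SOFT_BREAK.sub(" ", text)
    return text


sample = "The conver-\nsion left hard\nline breaks.\n\nNew paragraph."
print(clean_ocr_text(sample))
```

Note that this naive version also joins genuine compounds that happen to break at a line end ("well-\nknown" becomes "wellknown"), so anything real would need a dictionary check or a manual review pass.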