Thread: OCR engine
05-03-2014, 06:12 PM   #55
DebbyS
For my current project, I searched for any digit followed by "o" (the letter oh) so I could see whether the "o" should be "0" (zero). The OCR was also italicizing words it shouldn't have, mostly by extending the italics from words in the Huichol and Spanish languages to the next few English words, so in the end I'll search for [blank] italicized to see if I missed any, and also search for [blank] [DPCustomMono] and swap that for Times New Roman. Accented "o" (oh) tends to become a "6", too, but that's easy to see. I'm sure that if the font in the book had been larger than 10-point or so, the accuracy of the OCR would have been much better. I'm really glad to have that weird font to use, but I will also look into "regex" to see what it is and whether I can use it as well.
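As a rough illustration of what "regex" could do here, the two checks described above (a digit followed by "o", and a "6" sitting inside a word where an accented "o" may have been) can each be written as a single pattern. This is only a sketch in Python, not the tool used for the project; the names `DIGIT_THEN_O`, `SIX_IN_WORD`, `find_suspects`, and the sample sentence are made up for the example.

```python
import re

# A digit immediately followed by the letter "o" (either case);
# the "o" may really be a zero.
DIGIT_THEN_O = re.compile(r"\d[oO]")

# A "6" with a letter directly before or after it;
# it may be a misread accented "o".
SIX_IN_WORD = re.compile(r"(?<=[^\W\d_])6|6(?=[^\W\d_])")


def find_suspects(text, pattern, context=10):
    """Return (position, snippet) pairs for each match, for manual review."""
    hits = []
    for m in pattern.finditer(text):
        start = max(0, m.start() - context)
        end = min(len(text), m.end() + context)
        hits.append((m.start(), text[start:end]))
    return hits


# Invented sample text containing one error of each kind.
sample = "In 19o4 the group left; the canci6n was sung by 6 men."

for name, pattern in (("digit+o", DIGIT_THEN_O), ("6 in word", SIX_IN_WORD)):
    for pos, snippet in find_suspects(sample, pattern):
        print(name, pos, repr(snippet))
```

The patterns only flag candidates; each hit still needs a human look, since a standalone "6" (as in "6 men") is left alone but a legitimate mixed token such as "6th" would be flagged too.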