MobileRead Forums - View Single Post

GrannyGrump · 09-19-2015, 11:09 AM

For some recent stories I worked on, I had to use a free OCR service, and it gave results like these:

The first was very tightly kerned, and the OCR got all the characters right but not the word breaks.

Quote:

Thenthroughanotherkitchen,
where redrustwasmaking itsfull
mealof a comparativelymodernrange.
Then into the greathall wherethe
old armorandthebuff-coatsandround
capshungonthewalls,andwherethe
carvedstonestaircasesran at eachside

Another was at least 75% gibberish.

Quote:

and tho enorrnous vuoee of solid silver, we
heavy for l1i1n to 1il't—r_-.ve1r these were hie-
hrrd lrc not found tlrer|r—he, by his own skill
rr,||r_i tunnirrg P He went about In Llro rooms,
wurrlriirg one after the other the beautiful,
mre things. Hr: oun:ssr:1.l the gold and the
jrrwele. He tlrrerr his nrms round the great
silver vnsos; he wound round lrirnself tho
l'iun\"j rod velvet of the crltlnllr Wlroro tllo
grithrrsgleaured in embossed goldnrnd shi ‘~'IJ1‘Ei.l

I pretty much had to go through the first one by hand and force the word breaks, and transcribe the second one manually.

ABBYY FineReader is not in my future, sorry to say. Is there any free software that might give better results than I got here?

Regex is absolutely my weak spot, but does anyone have any suggestions for the next time I run up against this type of situation?
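
For the first kind of damage (right characters, missing word breaks), a word list may work better than regex alone. Below is a minimal Python sketch of dictionary-based word splitting. It is not a tested tool: the names `WORDS`, `segment`, and `fix_line` are made up for illustration, and the tiny word list stands in for a real dictionary file that would have to be loaded separately. Runs of letters the word list cannot fully account for are left untouched, so they can still be fixed by hand.

```python
import re

# Tiny illustrative word list (an assumption for this sketch); a real run
# would load a full dictionary, for example one word per line from a file.
WORDS = {
    "then", "through", "another", "kitchen", "where", "red", "rust",
    "was", "making", "its", "full", "meal", "of", "a",
    "comparatively", "modern", "range",
}


def segment(token, words):
    """Split a lowercase run of letters into known words.

    Returns the split with the fewest words, or None if the run
    cannot be covered entirely by words from the list.
    """
    n = len(token)
    # best[i] is the fewest-words split of token[:i], or None if impossible.
    best = [None] * (n + 1)
    best[0] = []
    for i in range(1, n + 1):
        for j in range(i):
            piece = token[j:i]
            if best[j] is not None and piece in words:
                candidate = best[j] + [piece]
                if best[i] is None or len(candidate) < len(best[i]):
                    best[i] = candidate
    return best[n]


def fix_line(line, words=WORDS):
    """Insert spaces into run-together words; leave everything else alone."""

    def repair(match):
        token = match.group(0)
        if token.lower() in words:
            return token  # already a known word
        parts = segment(token.lower(), words)
        if parts is None:
            return token  # cannot split it; keep for manual review
        # Slice the original token so its capitalization is preserved.
        out, pos = [], 0
        for part in parts:
            out.append(token[pos:pos + len(part)])
            pos += len(part)
        return " ".join(out)

    # Only runs of letters are examined; punctuation and spacing pass through.
    return re.sub(r"[A-Za-z]+", repair, line)
```

With this word list, `fix_line("mealof a comparativelymodernrange.")` gives `meal of a comparatively modern range.` A larger dictionary will find more splits but also more wrong ones (a run that is not a word itself can sometimes be divided in more than one way), so the output still needs proofreading. This approach does nothing for the second sample, where the characters themselves are wrong.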