02-24-2023, 10:43 AM   #6
Lukusaukko
Connoisseur
Posts: 56
Karma: 392326
Join Date: Feb 2023
Device: Kobo Libra 2
Quote:
Originally Posted by retiredbiker View Post
I do multi-column old magazine stories, the pdf coming from, say, Internet Archive. Any text that is already in these is worthless, it would take forever to correct it by hand.
That's close to my use case, though my sources for pulps and weird fiction are often more or less proofread. The problem is that they also try to replicate the original's layout, which makes them a pain to read on an e-reader... and often that's the only source available.

I have used gImageReader to OCR a couple of sources that were only available as scans (as you noted, Archive's TXT or EPUB versions are often worthless), but only for short texts. I'll have to check out OCRFeeder.