MobileRead Forums - View Single Post - Boston Globe article titled "Nuance's OmniPage 17 has scan-to-Kindle feature"

zerospinboson · 06-07-2009, 07:18 AM

Quote:

Originally Posted by kamm

Yeah, right because it's soooo difficult to use when you have to make a whopping TWO CLICKS, huh?

Pleahhhhse. Did you even try anything other than stupid English text with its lame, limited character set? OmniPage has its roots in the old Recognita, then the world's best OCR software. Let's get real: try using foreign languages, especially with non-Latin character sets, and you'll see how it works and THEN (only then)

Well, one of my major gripes with OP 16 was that you can't specify the input language for a scanned document, which matters most when a book contains more than one language; nor does it ask what language the scan is in at processing time.
I don't care much about Greek/Cyrillic scripts as such, although I remember it having lots and lots of trouble with the German Heidegger Gesamtausgabe volumes (German with lots of Greek mixed in).
Similarly, it seemed to make little use of its dictionaries when recognizing words: a lot of the "recognized" OCR text (marked as unsure) consisted of letter combinations that weren't words at all, apparently because it doesn't always use heuristics to guess which dictionary word a recognized letter combination should be. (It does so most of the time, but the 'errors' it flags are often scanned just as clearly as the rest of the page.)
Abbyy FR 9 is a lot better at this, although its recognition is also far from perfect.
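
To make the dictionary point concrete, here is a minimal sketch of the kind of heuristic I mean: take a recognized letter combination and snap it to the closest dictionary word, or flag it if nothing is close. This is only an illustration using Python's standard `difflib`, not how OmniPage or FineReader actually work; the word list, the `snap_to_dictionary` name and the 0.7 cutoff are all made up for the example.

```python
import difflib

# Hypothetical mini-dictionary; a real one would be a full word list per language.
GERMAN_WORDS = ["sein", "zeit", "dasein", "welt", "sorge"]

def snap_to_dictionary(token, words, cutoff=0.7):
    """Return the closest dictionary word, or None if nothing is close enough."""
    lowered = token.lower()
    if lowered in words:
        # Already a dictionary word: nothing to correct.
        return lowered
    # get_close_matches ranks candidates by similarity ratio, best first.
    matches = difflib.get_close_matches(lowered, words, n=1, cutoff=cutoff)
    return matches[0] if matches else None

# A typical misread ("in" seen as "m") is pulled back to a real word,
# while pure gibberish finds no match and can be flagged for proofreading.
print(snap_to_dictionary("Dasem", GERMAN_WORDS))
print(snap_to_dictionary("xqzv", GERMAN_WORDS))
```

With a per-language word list, the same lookup would also give a rough hint about which language a passage is in, since gibberish against one dictionary may match well against another.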

Edit: After trying OmniPage 17, I have to say I'm almost shocked at the mistakes the thing makes.
Firstly, on the first PDF it just crashes, always at more or less the same point (800p JPG book that I like to use as a test).
On a second (GA 70, 300dpi) it mistakes half the German for Greek, and generally makes gibberish out of the text.
On a third title it cuts the inside 10% off every left page in a dual-page scan with the "look for facing pages" feature enabled. (Rather than looking for facing pages, it just seems to cut every image in half, no matter where the text starts or ends.)
While it's processing you can't edit any of the previous pages,
you can't delete pages from the scan,
you can't remove areas from recognition,
you can't easily reprocess a page after adjusting the areas it needs to process, and
on the OCR proofreading screen it still has the weird button setup where the default button is always 'ignore' (a choice it doesn't remember), and the "confirm edit" button is always called "change", with the first suggestion always selected, as though it would still replace your manual edit with its own suggestion.
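
For what it's worth, "looking for facing pages" doesn't have to mean cutting at the midpoint. A rough sketch of the alternative, assuming the scan has already been reduced to a grid of 0 (blank) and 1 (ink) pixels: count the ink in each column and split at the emptiest column near the middle. The `find_gutter` function and its `band` parameter are hypothetical names for this example, and I have no idea what OmniPage really does internally.

```python
def find_gutter(page, band=0.2):
    """page: list of rows, each a list of 0 (blank) / 1 (ink) pixels.

    Returns the column at which to split a two-page scan.
    """
    width = len(page[0])
    # Vertical projection: how much ink each column contains.
    ink = [sum(row[x] for row in page) for x in range(width)]
    centre = width // 2
    margin = int(width * band)
    lo, hi = max(centre - margin, 0), min(centre + margin, width - 1)
    # Least-inked column in the central band; ties go to the column nearest the centre.
    return min(range(lo, hi + 1), key=lambda x: (ink[x], abs(x - centre)))

# Toy scan, 20 columns wide: left page text in columns 1-11, right page in 14-18.
row = [1 if 1 <= x <= 11 or 14 <= x <= 18 else 0 for x in range(20)]
page = [row[:] for _ in range(5)]

split = find_gutter(page)
left_page = [r[:split] for r in page]
right_page = [r[split:] for r in page]
print(split, len(page[0]) // 2)
```

In the toy scan the midpoint falls inside the left page's text, which is the kind of cut-off I'm seeing, while the projection lands in the blank gutter.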

In all, I wouldn't use this even if it was free.