MobileRead Forums - View Single Post

Karl Murks · 01-04-2012, 12:00 PM

Quote:

Originally Posted by DSpider

French: "Xavier Molénat" is recognized as "Xavier Molenat" throughout the book

Funny thing: That one's easily fixable by adding the 'é' character to the list of allowed characters.

But the inverse is not true. If a scanned word contains no accent or umlaut, but the dictionary contains a word that is identical except for one such character, the scanned word gets mercilessly replaced with no chance of opting out, even if the scanned page is in perfect condition.

I tried the dictionary approach, btw. Didn't help.

And now good luck finding all such words! Most of the time they are names, so just proofreading the scanned document doesn't help: in many, many cases you can't easily tell whether the name is wrong.
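One rough way to at least narrow the search is to run a small script over the exported text and list every word that carries an accent or umlaut, grouped with any unaccented spelling of the same word found in the text. This is only a sketch in Python and has nothing to do with FineReader itself; the function names are made up for illustration, and it assumes the OCR result has been exported as plain Unicode text. It cannot say which spelling is right, it only tells you which words to check against the scan.

```python
import re
import unicodedata
from collections import defaultdict


def strip_marks(word):
    """Return the word with accents and umlauts removed (e.g. e-acute -> e)."""
    decomposed = unicodedata.normalize("NFD", word)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def accent_suspects(text):
    """Map each unaccented base form to the spellings found in the text.

    Only groups that contain at least one accented spelling are returned,
    since those are the words the OCR may have "corrected" on its own.
    """
    # Compose characters first so an accented letter stays inside its word.
    text = unicodedata.normalize("NFC", text)
    groups = defaultdict(set)
    for word in re.findall(r"[^\W\d_]+", text):
        groups[strip_marks(word).lower()].add(word)
    return {
        base: sorted(forms)
        for base, forms in groups.items()
        if any(strip_marks(form) != form for form in forms)
    }
```

Calling `accent_suspects()` on the whole book gives a short list of names to look up in the scan, instead of a full second read-through. A group that contains both spellings is a strong hint that one occurrence was changed.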

This is something that would be fine as an option but having such a feature on by default with no decent option to disable it is - pardon me - just a sign of fundamentally broken software. There is no means to get around it and no error threshold below which this nonsense isn't done.

Yes, FineReader 11 is a lot better than 10, but it still has major problems distinguishing 'i' from 'l' (why? Is the gap below the dot that hard to detect?) and 'm' from 'rn'. These two, along with the stupid umlaut problem, are my main source of frustration with it, mainly because they are so easy to overlook when proofreading. If you want to be sure, you have to proofread at least twice, preferably with different people.
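The 'i'/'l' and 'm'/'rn' mix-ups can be hunted in a similar way after the fact. The sketch below (Python again, with hypothetical names, and assuming you have some word list of your own to pass in as `known_words`) flags every word that is not in the list but turns into a listed word when one of those swaps is applied at a single position. It will not catch names unless you add them to the list, and it cannot catch a misreading that happens to be a real word itself.

```python
import re

# Pairs the OCR tends to confuse, in both directions.
SWAPS = [("rn", "m"), ("m", "rn"), ("l", "i"), ("i", "l")]


def variants(word):
    """Yield every spelling obtained by applying one swap at one position."""
    for old, new in SWAPS:
        start = word.find(old)
        while start != -1:
            yield word[:start] + new + word[start + len(old):]
            start = word.find(old, start + 1)


def confusable_suspects(text, known_words):
    """Map unknown words to known words that differ by one confusable swap.

    known_words is a set of lowercase words supplied by the caller.
    """
    suspects = {}
    for word in set(re.findall(r"[^\W\d_]+", text)):
        lower = word.lower()
        if lower in known_words:
            continue
        fixes = sorted(v for v in set(variants(lower)) if v in known_words)
        if fixes:
            suspects[word] = fixes
    return suspects
```

For example, with a word list that contains "modern" and "file" but not "modem" or "fiie", the text "The modem world has a fiie" yields "modem" (suggesting "modern") and "fiie" (suggesting "file").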

Effectively, these issues make up 90% of all the proofreading time, because they happen far more often than any other misrecognition.