MobileRead Forums - View Single Post - Heard back from Tor on why Topaz for some books...

tompe · 03-13-2009, 03:40 PM

Quote:

Originally Posted by sirbruce

If they have to OCR it, why not just OCR it to text rather than PDF? What, does the OCR software construct its own font based off letterforms and store them in a PDF, which is why it has to be converted to Topaz? I'm afraid I don't follow the reasoning that OCR means PDF means TPZ.

I think they meant "PDF->OCR->Topaz". As I understand it, Topaz avoids the actual character recognition by using the scanned images of the letters. They then only have to categorize the letter shapes (group the ones that look alike) and do not have to map the categories to actual letters.
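
To make the idea concrete, here is a rough Python sketch of "categorize but don't recognize". It is only my illustration of the general technique, not the actual Topaz format or whatever the real converter does; the tiny 3x3 bitmaps, the `categorize` function and the `max_diff` threshold are all made up for the example.

```python
from typing import List, Tuple

# A glyph is a tiny bitmap: one string per pixel row ('#' = ink, '.' = blank).
# Hypothetical representation, assumed only for this sketch.
Glyph = Tuple[str, ...]


def hamming(a: Glyph, b: Glyph) -> int:
    """Count differing pixels between two bitmaps of the same size."""
    return sum(pa != pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))


def categorize(glyphs: List[Glyph], max_diff: int = 1):
    """Group look-alike glyph images.

    Returns (templates, sequence): one stored image per category, plus the
    page expressed as a list of category numbers. No step ever decides
    which letter a category stands for.
    """
    templates: List[Glyph] = []
    sequence: List[int] = []
    for glyph in glyphs:
        for idx, template in enumerate(templates):
            if hamming(template, glyph) <= max_diff:
                sequence.append(idx)  # close enough: reuse existing category
                break
        else:
            templates.append(glyph)  # unseen shape: start a new category
            sequence.append(len(templates) - 1)
    return templates, sequence


bar = (".#.", ".#.", ".#.")
bar_noisy = (".#.", ".#.", "##.")  # same shape with one pixel of scan noise
ring = ("###", "#.#", "###")

templates, sequence = categorize([bar, ring, bar_noisy, ring])
print(len(templates), sequence)  # 2 [0, 1, 0, 1]
```

The output is two stored letter images plus the sequence 0, 1, 0, 1. That is enough to redraw the page, but nothing in it says that category 0 is an "l" or category 1 is an "o", which is the step real OCR would have to add.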