MobileRead Forums > E-Book General > News

06-24-2011, 11:34 AM   #16
KevinH
Sigil Developer
Hi,

Quote:
Originally Posted by ATimson
Topaz is designed to be an easy format for scanned books - it's got the images on the page, combined with an often-not-very-good OCR'd copy of the text for searching. Those scripts only get you that OCR'd text.

If your resulting book was the same, with no blatant OCR errors and with formatting intact, you're lucky.
That is not quite correct. The calibre plugin only gets you the OCR'd text version because a plugin can pass along only one type of ebook, not two. So it passes along the HTML version (as htmlz), which you can convert to epub and then on to whatever other format you like, after fixing any errors that bother you.

The other tools in the "tools" collection (KindleBooks, DeDRM) can give you the OCR'd HTML plus the complete set of page images (exact copies), written out as SVG images embedded in XHTML pages, so that you can read the book in any modern browser that understands SVG (that is, Safari, Firefox 4, Firefox 5, etc.).

You can also easily modify the tool so that it does not embed the SVG in XHTML and instead creates pure SVG images (one per page). From those you can convert the book to an exact set of PNG or JPEG images, or create an image-only PDF file (it will be quite large!) and then use Acrobat Pro to OCR it yourself to make it searchable.
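
If you would rather not touch the tool itself, the same result can be had by post-processing its output. Below is a minimal sketch in Python, assuming each page is a well-formed XHTML file with one inline svg element in the standard SVG namespace. The function name extract_svg is my own and is not part of KindleBooks or DeDRM, and I have not checked it against the exact files those tools write.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"
XLINK_NS = "http://www.w3.org/1999/xlink"

# Register prefixes so the serialized output uses a default SVG namespace
# (xmlns="...") instead of auto-generated prefixes such as ns0:.
ET.register_namespace("", SVG_NS)
ET.register_namespace("xlink", XLINK_NS)


def extract_svg(xhtml_text):
    """Return the first inline <svg> element of an XHTML page as a
    standalone SVG document string, or None if the page has no <svg>.

    Assumption: the page is well-formed XML. Pages that use HTML-only
    entities such as &nbsp; will make ElementTree raise a ParseError.
    """
    root = ET.fromstring(xhtml_text)
    svg = root.find(f".//{{{SVG_NS}}}svg")
    if svg is None:
        return None
    # Drop any text that followed the element inside the XHTML body.
    svg.tail = None
    body = ET.tostring(svg, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body
```

You would call this once per page file and write each returned string to its own .svg file (page0001.svg and so on, names being up to you). Converting those to PNG, JPEG or PDF is then a job for whatever SVG renderer you have at hand.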

Until someone creates some sort of character recognition program that maps SVG glyphs to outline font characters, there is no other way to deal with the issue.

Hope this makes things clearer.