01-15-2012, 03:53 PM   #4
pholy
Booklegger
Posts: 1,800
Karma: 7999034
Join Date: Jun 2009
Location: Toronto, Ontario, Canada
Device: BeBook(1 & 2010), PEZ, PRS-505, Kobo BT, PRS-T1, Playbook, Kobo Touch
My workflow scans pages to .png files, then OCRs them into an rtf file. The .tiff format would also work well, but .jpg files tend to lose their sharp edges, which makes the OCR harder and more error-prone. As I understand it, jpeg compression was designed for photographs of natural scenes, where there aren't so many sharp edges.
I do my major corrections to the rtf file in OpenOffice, then export to html files, which I clean up with HTML-Tidy and various scripts. The toc and ncx files are mostly boilerplate, and then I zip everything into an epub file. The proofreading and corrections take the most time; I do them on the rtf files, on the html files, and again on the supposedly final epub file.
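
For anyone scripting the final zip step, here is a rough sketch in Python. This is not my actual script; the function name `build_epub` and the directory layout are made up for illustration. The one thing it is meant to show is the epub container rule that the `mimetype` entry has to be the first file in the archive and stored without compression. It assumes the unpacked book directory already contains META-INF/container.xml, the opf, the ncx, and the html files.

```python
import zipfile
from pathlib import Path


def build_epub(source_dir, epub_path):
    """Zip an unpacked book directory into an epub file.

    Hypothetical helper for illustration. Write epub_path outside
    source_dir so the archive does not try to include itself.
    """
    source = Path(source_dir)
    with zipfile.ZipFile(epub_path, "w") as zf:
        # The mimetype entry goes first and must be stored uncompressed.
        zf.writestr("mimetype", "application/epub+zip",
                    compress_type=zipfile.ZIP_STORED)
        # Everything else can be deflated; sorted() keeps the order stable.
        for path in sorted(source.rglob("*")):
            arcname = path.relative_to(source).as_posix()
            if path.is_file() and arcname != "mimetype":
                zf.write(path, arcname, compress_type=zipfile.ZIP_DEFLATED)
```

Calling something like `build_epub("mybook", "mybook.epub")` would produce the archive; it is still worth running the result through a validator, since the sketch does not check the contents at all.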

Hope this helps you somewhat.