06-09-2011, 04:12 AM   #13
DSpider
Evangelist
Posts: 450
Karma: 343115
Join Date: Nov 2009
Location: Romania
Device: PW2 2014
Scanning tips:

Pages:
300 dpi (grayscale)
JPG at 85-90% quality for a much smaller file size (250-450 MB, depending on the book)

Covers:
600 dpi (colour)
TIFF or PNG, because you'll probably want to crop it, deskew it, increase or decrease the saturation, etc., so it's best to have something good to work with.
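To get a feel for the difference between the two settings above, here is a rough Python sketch of the pixel counts involved. The 5.5 x 8.5 inch page size is only an assumption for illustration (measure your own book), and the byte counts are for uncompressed pixels, not the final JPG/PNG size on disk.

```python
# Hypothetical page size in inches; not from the post, measure your own book.
PAGE_WIDTH_IN = 5.5
PAGE_HEIGHT_IN = 8.5


def pixel_dimensions(dpi, width_in=PAGE_WIDTH_IN, height_in=PAGE_HEIGHT_IN):
    """Return (width, height) in pixels for a scan at the given dpi."""
    return round(width_in * dpi), round(height_in * dpi)


def uncompressed_bytes(dpi, channels, width_in=PAGE_WIDTH_IN, height_in=PAGE_HEIGHT_IN):
    """Raw size of one scan at 8 bits per channel (1 = grayscale, 3 = colour)."""
    width_px, height_px = pixel_dimensions(dpi, width_in, height_in)
    return width_px * height_px * channels


if __name__ == "__main__":
    page = uncompressed_bytes(300, channels=1)    # inner page: 300 dpi grayscale
    cover = uncompressed_bytes(600, channels=3)   # cover: 600 dpi colour
    print("page pixels :", pixel_dimensions(300))
    print("cover pixels:", pixel_dimensions(600))
    print(f"raw page : {page / 1e6:.1f} MB")
    print(f"raw cover: {cover / 1e6:.1f} MB")
```

Doubling the dpi quadruples the number of pixels, which is part of why a 600 dpi pass over every page takes so much longer than a 300 dpi one.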

Scanning the pages at 300 dpi instead of 600 goes much faster and, believe it or not, it's actually better to scan at the lower resolution. The higher the resolution, the more likely the OCR process is to read small imperfections as commas (,) or to add periods (.) where there aren't any, just because the book had a small printing smudge somewhere. No OCR solution has provided 100% accurate output, and it probably never will as long as there are printing flaws in basically any book (except maybe for recent prints). Statistically, the more pages there are in a book, the greater the chance that there's at least one flaw in some paragraph somewhere in it.
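The statistical point can be put into numbers with a small Python sketch. It assumes every page has the same, independent chance of containing a flaw, which is a simplification, and the 1% rate below is a made-up figure for illustration, not a measurement.

```python
def prob_at_least_one_flaw(flaw_rate_per_page, pages):
    """Chance that at least one page out of `pages` has a flaw.

    Assumes each page independently has probability `flaw_rate_per_page`
    of containing a flaw (a simplifying assumption).
    """
    return 1.0 - (1.0 - flaw_rate_per_page) ** pages


if __name__ == "__main__":
    rate = 0.01  # hypothetical: 1 page in 100 has a printing smudge
    for pages in (50, 100, 300, 600):
        chance = prob_at_least_one_flaw(rate, pages)
        print(f"{pages:4d} pages -> {chance:.1%} chance of at least one flaw")
```

Even with a low per-page rate the chance climbs quickly with page count, so a proofreading pass is hard to avoid for any full-length book.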

So in order to provide a pleasant reading experience, proofreading is essential. Do an initial "grunt work" sweep in FineReader and correct any issues as you go along. Batch replace quotation marks (but never batch replace commas and periods), then do a second "pleasure" proofread for the (semi)final version. If you find anything out of place, you can highlight it inside the reader so you can check it against the source later.
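If you'd rather do the quotation mark pass outside FineReader, something along these lines would work on the exported text. It's a minimal Python sketch, and the direction of the replacement (curly quotes to straight ones) is my assumption; flip the mapping if you want it the other way around. Commas and periods are deliberately left out of the mapping, so they only get fixed by hand.

```python
# Map typographic quotation marks to plain ASCII ones.
# Commas and periods are intentionally absent: those should be reviewed
# one by one, never batch replaced.
QUOTE_MAP = str.maketrans({
    "\u201c": '"',  # left double quotation mark
    "\u201d": '"',  # right double quotation mark
    "\u201e": '"',  # low double quotation mark
    "\u2018": "'",  # left single quotation mark
    "\u2019": "'",  # right single quotation mark / apostrophe
    "\u201a": "'",  # low single quotation mark
})


def normalize_quotes(text):
    """Return `text` with typographic quotes replaced by straight ones."""
    return text.translate(QUOTE_MAP)


if __name__ == "__main__":
    sample = "\u201cDon\u2019t stop,\u201d she said."
    print(normalize_quotes(sample))
```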

If you'd like to match it against a scanned image, you could try making the window transparent and overlapping the two windows to spot the differences. Nvidia drivers can make a window transparent (I don't know about ATI drivers) using a combination of a hotkey and the mouse scroll wheel. Or you could use software such as Actual Window Manager or just Actual Transparent Window: http://www.actualtools.com/products/ (shareware). I once used this method to match line spacing for a PDF document. I don't know if you'll get the same result with an ePub, given its free-flow nature... It works best for a PDF that uses the original print line breaks and the exact same font.

Last edited by DSpider; 06-09-2011 at 04:14 AM.