Quote:
Originally Posted by HarryT
I imagine you did your work with laser-printed documents? These are going to be a lot cleaner - and hence more accurately scanned - than grubby old paper books.
The technical reports were vintage, 30-40 year old Government Printing Office documents, likely Linotype. The books were mostly hardcovers and a couple of paperbacks. Punctuation was rarely an issue.
The quality of the source paper turns out to be a big determinant of scan accuracy. Font size, too.
For personal use I processed a vintage Ace Double (PEOPLE MINUS X/LEST WE FORGET THEE EARTH), yellowed and falling apart, on my cheapie Canon scanner (slower, but good enough), and it still did a fine job. The biggest issue was still snipped sentences and paragraphs, and those I could easily catch by changing font sizes in Word and forcing reflows.
The workflow I used was to save each double-page scan as "true page" MS Word and then feed it to WordPad to get a single-column text stream. That went into MS Word, where I set the paragraph indents to an inch or so and revealed the paragraph markers. That exposed most of the non-indented lines that needed to be merged back into their paragraphs. The spell checker highlighted any character issues.
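For anyone who would rather script the merge step than eyeball it in Word, here is a rough Python sketch. To be clear, this is not what I did by hand. It swaps in a simple heuristic of my own: a paragraph break is treated as a bad snip when the text before it doesn't end in closing punctuation, or the text after it starts with a lowercase letter. The function names and the punctuation list are made up for illustration, and it assumes paragraphs in the text stream are separated by blank lines.

```python
import re

# Characters that plausibly end a complete paragraph.
# This list is an assumption; tune it for the book at hand.
TERMINATORS = (".", "!", "?", ":", '"', "'", "\u201d", "\u2019")


def looks_snipped(previous, current):
    """Guess whether `current` is really the continuation of `previous`."""
    # Either the earlier text stops without closing punctuation,
    # or the later text begins mid-sentence with a lowercase letter.
    return (not previous.endswith(TERMINATORS)) or current[0].islower()


def merge_snipped_paragraphs(text):
    """Rejoin paragraphs that the scan split in the middle of a sentence.

    Paragraphs are assumed to be separated by one or more blank lines.
    Returns a list of paragraph strings with internal line breaks removed.
    """
    # Split on blank lines, then collapse whitespace inside each paragraph.
    paragraphs = [
        " ".join(chunk.split())
        for chunk in re.split(r"\n\s*\n", text)
        if chunk.strip()
    ]

    merged = []
    for para in paragraphs:
        if merged and looks_snipped(merged[-1], para):
            # Glue the fragment back onto the paragraph it was cut from.
            merged[-1] = merged[-1] + " " + para
        else:
            merged.append(para)
    return merged


if __name__ == "__main__":
    sample = "The ship landed on the\n\nred planet at dawn.\n\nNobody was there."
    for paragraph in merge_snipped_paragraphs(sample):
        print(paragraph)
```

It will get some cases wrong (headings with no closing punctuation, for one), so it is a first pass before the spell checker and a read-through, not a replacement for them.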
(Shrug)
The intended output was a clean, readable document, not a print replica. (After all, author manuscripts are rarely perfect when the publisher gets them.) For corporate publishing purposes, that should be *cheaper* than processing a typical author manuscript submission, because it wouldn't require content editing.
The cost of doing a good scan in a corporate environment shouldn't be a deterrent. *If* you know what you're doing and you care about doing a good job. Which, admittedly, not all corporate publishers do, given some of the poor ebooks they have sold in recent times. But that's a different discussion, no?