
MobileRead Forums > E-Book Formats > Workshop

12-04-2008, 01:09 PM   #16
Elfwreck
Grand Sorcerer
Posts: 5,185
Karma: 25133758
Join Date: Nov 2008
Location: SF Bay Area, California, USA
Device: Pocketbook Touch HD3 (Past: Kobo Mini, PEZ, PRS-505, Clié)
Quote:
Originally Posted by ProDigit
I also found it a pity that OCR (no matter which program you're using) needs at least 200 DPI.
I mean, most software (I'm using a trial here) costs $400, but it really needs about 300 DPI to convert text normally?
I mean, I can perfectly read text scanned at 100 or even 75 DPI.
For good OCR, scanning at 400 dpi is best, especially if you have any amount of smaller-than-normal text, like 7pt footnotes under a 10pt text page. Then you throw out the scans or downsample them, because they're much larger than you need to read on a screen. (If, however, you need to be able to print them, keep the high-res versions.)
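
To put rough numbers on that, here is a minimal Python sketch of the arithmetic. It assumes 1 pt = 1/72 inch and treats the point size as the full height of the type body (actual letters are smaller than that). The function names are made up for illustration and don't come from any OCR package.

```python
POINTS_PER_INCH = 72  # 1 pt = 1/72 inch


def body_height_px(point_size, dpi):
    """Pixel height of the full type body for a font size at a scan resolution."""
    return point_size * dpi / POINTS_PER_INCH


def pixels_kept(scan_dpi, target_dpi):
    """Fraction of the pixel count left after downsampling (both axes shrink)."""
    return (target_dpi / scan_dpi) ** 2


# 7pt footnotes vs. 10pt body text at the resolutions discussed in this thread
for dpi in (75, 100, 200, 300, 400):
    footnote = round(body_height_px(7, dpi), 1)
    body = round(body_height_px(10, dpi), 1)
    print(f"{dpi} dpi: 7pt is about {footnote} px tall, 10pt is about {body} px tall")

# Downsampling a 400 dpi scan to 100 dpi for screen reading
print(pixels_kept(400, 100))
```

By this estimate a 7pt footnote at 75 dpi is only about 7 pixels tall for the whole type body, which leaves very little to tell an "h" from a broken "li". It also shows why downsampling after OCR pays off: going from 400 to 100 dpi keeps one sixteenth of the pixels.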

The human eye has incredible software attached; computers have nowhere near our logic capabilities--and while some OCR programs will convert 75 dpi pages to high enough resolution to read, the result will be error-ridden and full of "com cob" and "bum his bridges" and "die first lady" and "hi came from outer space" (and don't get me started on the "modem birth control methods"). "rn" to "m" is very common, because that switch often produces words that pass spell check. "th" to "di" is common in some fonts, where the "h" bar gets thin or broken at the top at low res. I've also seen "hi" for "It" in uncorrected OCR passes.
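
Since these errors pass spell check, one way to hunt for them is to undo each known swap and see whether the result is also a real word. The sketch below is only an illustration of that idea: the tiny vocabulary is made up for the demo, the function names are hypothetical, and it ignores punctuation. A real pass would load a proper word list.

```python
# Confusion pairs from the post: (what the OCR wrote, what the page likely said)
CONFUSIONS = [("m", "rn"), ("di", "th"), ("hi", "It")]


def alternatives(word):
    """Return every spelling produced by undoing one confusion at one position."""
    results = set()
    for wrong, right in CONFUSIONS:
        start = word.find(wrong)
        while start != -1:
            results.add(word[:start] + right + word[start + len(wrong):])
            start = word.find(wrong, start + 1)
    return results


def flag_suspects(text, vocabulary):
    """List (word, alternative) pairs where the alternative is a known word."""
    suspects = []
    for word in text.split():
        for alt in sorted(alternatives(word)):
            if alt.lower() in vocabulary:
                suspects.append((word, alt))
    return suspects


# Toy vocabulary, assumed for the demo only
vocab = {"corn", "burn", "the", "modern", "it", "cob", "first", "lady"}
print(flag_suspects("com cob bum die first lady modem hi", vocab))
# prints [('com', 'corn'), ('bum', 'burn'), ('die', 'the'), ('modem', 'modern'), ('hi', 'It')]
```

This only flags candidates for a human to check. "die" and "modem" are perfectly good words on their own, so nothing here should be auto-replaced.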

With some scanners & software, there are ways to reduce the file size (increase the contrast, make sure it's not getting extra margins), but they don't reduce it by much. And yes, it's slower to scan at the higher resolution; the best commercial scanners run at about 60 pages/minute; some of them scan double-sided at the same speed as single-sided. (The slowest commercial scanners are flatbed and take somewhere between 30 seconds and a minute to scan a page. You'd have to really, really LOVE a book to bother scanning it that way.)
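
For a sense of scale, here is the same comparison as a few lines of Python. The 300-page book is a made-up example, the function names are hypothetical, and the rates are just the rough figures above (about 60 pages/minute for a fast feeder, 30 to 60 seconds per page on a flatbed), not measurements of any particular scanner.

```python
def feeder_minutes(pages, pages_per_minute=60):
    """Scan time on a sheet-fed scanner at a given throughput."""
    return pages / pages_per_minute


def flatbed_minutes(pages, seconds_per_page):
    """Scan time on a flatbed, one page per pass."""
    return pages * seconds_per_page / 60


pages = 300  # hypothetical book length
print("feeder:", feeder_minutes(pages), "minutes")
print("flatbed, fast:", flatbed_minutes(pages, 30), "minutes")
print("flatbed, slow:", flatbed_minutes(pages, 60), "minutes")
```

That works out to about 5 minutes on the feeder versus 2.5 to 5 hours on the flatbed, before any page flipping or rescans.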

In my experience, it's worth the slower (feeder) scan rate to cut down on proofreading time. YMMV; if standard OCR errors don't bother you, or you're working with originals that have fonts that don't cause as many errors, you can get away with as little as 200 dpi. Some OCR readers (FineReader; not sure about others) will convert lower-res images and try to read them, but the OCR quality's usually poor.