Old 12-19-2008, 02:30 PM   #1
Adam B.
Addicted to Porting
Posts: 1,695
Karma: 7194
Join Date: Oct 2006
Location: Indianapolis, IN
Device: iRex iLiad, Nokia 770, Samsung i760
Scanning Tips For Thin Paper

I'm trying to scan a book I own. I have a sheetfed document scanner at work that can scan to many different formats and perform OCR on it.

I've scanned it as a black and white document at 300dpi. I'm not very impressed with the quality (see screenshot). The illustrations don't look very good, and there are random black dots around the page. In addition, the text doesn't look very clean.

There are quite a few illustrations and footnotes, so converting it to a text document would lose a lot of the "magic" of the book.

I have a plethora of options and formats I can save it into directly from the scanner (PDF, TIFF, etc). The pages are also very thin, so if I scan it as a color document (and greyscale, probably), the back side will bleed through.

Has anyone else scanned a book like this with good results? What settings/format did you use?
Attached thumbnail: fsmpage.png (148.0 KB)
Old 12-19-2008, 08:26 PM   #2
Elfwreck
Grand Sorcerer
Posts: 5,140
Karma: 24387938
Join Date: Nov 2008
Location: SF Bay Area, California, USA
Device: Clié; PRS-505; EZR Pocket Pro, PRS-600, Kobo Mini
To make a good copy, I suppose I'd scan the whole thing as 300 or 400 dpi B&W TIFF first, and then scan the images again as color or greyscale as appropriate. I'd use the B&W pages to OCR & get text, and insert the color/greyscale pictures during formatting.

If I got bleedthrough on the pictures, and I cared to spend the effort (and if that's the book, it's worth the effort), I'd tinker with them in Photoshop. (Or find out if the author would consider releasing the original images in some kind of digital format, which is possible.)
Old 12-19-2008, 08:51 PM   #3
DDHarriman
Guru
Posts: 854
Karma: 1200
Join Date: Feb 2008
Location: Almada, Portugal
Device: Cybook Gen3, Sony PRS 505, Kindle DXG and Samsung Galaxy Note
Dots on white paper indicate you have too much contrast. Lower it and/or play with the gamma settings of your scanner.
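As a rough illustration of what a gamma setting does to the grey values before they get thresholded, here is a minimal sketch in plain Python. The function names and the convention out = 255 * (in/255) ** (1/gamma) are assumptions for illustration, not the scanner's actual firmware behaviour. With gamma above 1, light grey speckle is pushed toward white while pure black and pure white stay put.

```python
def gamma_lut(gamma):
    """Build a 256-entry lookup table for 8-bit grey values.

    Assumed convention: out = 255 * (in / 255) ** (1 / gamma),
    so gamma > 1 lightens midtones and gamma < 1 darkens them.
    """
    if gamma <= 0:
        raise ValueError("gamma must be positive")
    return [round(255 * (v / 255) ** (1.0 / gamma)) for v in range(256)]


def apply_gamma(pixels, gamma):
    """Apply the gamma curve to a flat list of 8-bit grey values."""
    lut = gamma_lut(gamma)
    return [lut[p] for p in pixels]


# Hypothetical sample: black text, mid-grey speckle, white paper.
sample = [0, 64, 255]
lightened = apply_gamma(sample, 2.0)
```

The end points do not move, so text stays black and paper stays white; only the in-between greys shift, which is where the stray dots come from.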

Scanning in greyscale will also reduce the problems you report, and sometimes gives better OCR results.

What are you trying to get: an OCRed PDF file? If so, image over text, or text and images at the same level?
Or another type of result?
Old 12-20-2008, 12:59 PM   #4
ath
Addict
Posts: 222
Karma: 110
Join Date: Jun 2006
Location: Malmo, Sweden
Device: iLiad, Sony PRS-505, Kindle
Quote:
Originally Posted by Adam B. View Post
I've scanned it as a black and white document at 300dpi. I'm not very impressed with the quality (see screenshot). The illustrations don't look very good, and there are random black dots around the page. In addition, the text doesn't look very clean.
What scanner are you using? What kind of scanning is it intended for? If it's a photo scanner, you should probably not try bilevel scans at 300 dpi.

Anyway, do not let the scanner do the processing (in your case contrasting and thresholding). Instead, scan in, say, greyscale, and then separately do whatever smoothing and sharpening you need (Photoshop, PaintShop Pro, etc.) before you reduce the sampling rate and do the final thresholding. That way you avoid scanner firmware problems.
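The order of operations described here (smooth, reduce the sampling rate, threshold last) can be sketched in plain Python on a greyscale image stored as a list of rows. This is only an illustration: sharpening is left out, the 3x3 mean filter and 2x2 averaging are arbitrary choices, and Otsu's method is swapped in as one common way to pick the threshold automatically, not something named in this thread.

```python
def box_blur(img):
    """3x3 mean filter; edge pixels average over the neighbours that exist."""
    h, w = len(img), len(img[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            vals = [img[j][i]
                    for j in range(max(0, y - 1), min(h, y + 2))
                    for i in range(max(0, x - 1), min(w, x + 2))]
            row.append(sum(vals) // len(vals))
        out.append(row)
    return out


def downsample2(img):
    """Halve the resolution by averaging 2x2 blocks (odd trailing row/column dropped)."""
    h, w = len(img) // 2 * 2, len(img[0]) // 2 * 2
    return [[(img[y][x] + img[y][x + 1] + img[y + 1][x] + img[y + 1][x + 1]) // 4
             for x in range(0, w, 2)]
            for y in range(0, h, 2)]


def otsu_threshold(img):
    """Return the grey level that maximises between-class variance (Otsu's method)."""
    hist = [0] * 256
    for row in img:
        for v in row:
            hist[v] += 1
    total = sum(hist)
    sum_all = sum(i * n for i, n in enumerate(hist))
    best_t, best_var = 0, -1.0
    w_dark, sum_dark = 0, 0
    for t in range(256):
        w_dark += hist[t]
        sum_dark += t * hist[t]
        w_light = total - w_dark
        if w_dark == 0:
            continue
        if w_light == 0:
            break
        mean_dark = sum_dark / w_dark
        mean_light = (sum_all - sum_dark) / w_light
        var = w_dark * w_light * (mean_dark - mean_light) ** 2
        if var > best_var:
            best_t, best_var = t, var
    return best_t


def binarize(img, t):
    """Pixels at or below t become black (0), everything else white (255)."""
    return [[0 if v <= t else 255 for v in row] for row in img]


def grey_to_bilevel(img):
    """Smooth, halve the sampling rate, then threshold as the last step."""
    small = downsample2(box_blur(img))
    return binarize(small, otsu_threshold(small))


# Hypothetical 4x8 test page: dark ink on the left, light paper on the right.
page = [[30] * 4 + [230] * 4 for _ in range(4)]
bilevel = grey_to_bilevel(page)
```

Thresholding last means the blur and the downsampling still see the full grey range, which is the point of not letting the scanner produce bilevel output directly.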

Quote:
I have a plethora of options and formats I can save it into directly from the scanner (PDF, TIFF, etc). The pages are also very thin, so if I scan it as a color document (and greyscale, probably), the back side will bleed through.
The best approach for thin pages is to insert black paper 'behind' the scanned page -- that lessens any reflected light picking up the printing on its other side. It takes a lot of extra work, but if you want high quality you won't mind working a bit.

Quote:
Has anyone else scanned a book like this with good results? What settings/format did you use?
I don't think settings are relevant without knowing what scanner you have.
Old 12-20-2008, 02:39 PM   #5
Adam B.
Addicted to Porting
Posts: 1,695
Karma: 7194
Join Date: Oct 2006
Location: Indianapolis, IN
Device: iRex iLiad, Nokia 770, Samsung i760
I'm using a Kodak Scan Station 100.

I think that scanning in greyscale will give me the best options. I did a few tests with greyscale at 400 dpi, and the result looks good. I'll have to remove the grey in the background (and inside the letters; time to polish my Photoshop skills). The only downside is file size, but I can work on that once I have a good quality scan in place.
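For the grey background and the grey inside the letters, the usual tool is a levels adjustment: pick a black point and a white point, clip everything outside them, and stretch what is left. A minimal sketch in plain Python, assuming 8-bit grey values; the function name and the default points of 60 and 200 are hypothetical and would normally be read off the scan's histogram.

```python
def levels(pixels, black=60, white=200):
    """Clip and stretch 8-bit grey values.

    Values at or below `black` become 0 (solid ink), values at or
    above `white` become 255 (clean paper), and the range in between
    is stretched linearly. The defaults are illustrative only.
    """
    if not 0 <= black < white <= 255:
        raise ValueError("need 0 <= black < white <= 255")
    scale = 255 / (white - black)
    return [min(255, max(0, round((p - black) * scale))) for p in pixels]


# Hypothetical samples: ink, greyish ink, mid grey, greyish paper, paper.
cleaned = levels([0, 60, 130, 200, 255])
```

Raising the black point fills in washed-out letters, and lowering the white point wipes out paper tone and faint bleed-through from the back of a thin page.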