06-14-2006, 10:58 PM | #1 |
Member
Posts: 10
Karma: 30
Join Date: Jan 2006
|
Scanning paper (out of copyright) books.
I have many, MANY books. Some of them are out of copyright, and for others I was able to get permission to ebook them so long as the ebook isn't distributed and the original copy is destroyed.
But that leaves the question: how do I do it? Flatbed scanners seem destructive, and although I have a very good OCR program (ABBYY FineReader), the "lift" in the spine seems to cause problems. That's not a problem for the "scan and destroy" books, but my out-of-copyright pulps from the 1920s are a different matter (and rather important, as I'd like to read them, but too much reading will also destroy them). I didn't see any other place here to ask this question, so I was wondering if I could receive any help. |
06-16-2006, 02:38 AM | #2 | |
Addict
Posts: 222
Karma: 110
Join Date: Jun 2006
Location: Malmo, Sweden
Device: iLiad, Sony PRS-505, Kindle Paperwhite & Oasis
|
Quote:
Scanning books quickly means, unfortunately, cutting them up and running them through a page-fed scanner. You can scan page spreads with a flatbed scanner, but it will stress the spine and the hinges of the book in a way that doesn't happen with ordinary reading. I've done several late 19th century books on a largish flatbed, and if the books don't break up entirely, the back cover is usually ripped afterwards, and some of the sections are starting. There is also some risk of ripping or folding a page due to clumsy handling. There are scanners where the scanning area extends to the edge of the device (see the Plustek OpticBook 3600, or the 3600 Plus if you're going for PDF -- and I think Xerox has/had a similar scanner). This lessens the stress on the spine, but because you scan one page at a time instead of a spread, it doubles the effort and time, as well as the risk of damaging a page. I know of some experiments with a camera (a digital camera is a kind of overhead scanner, and with a film camera you can often get decent scans made from the film), but it definitely requires more than just point-and-click. You will at least need some kind of good camera stand, as well as good, even lighting. See Project Runeberg for more info. Last edited by ath; 06-16-2006 at 03:19 AM. |
|
06-16-2006, 02:46 AM | #3 |
iLiad Maniac
Posts: 1,382
Karma: 2369
Join Date: Apr 2006
Location: Germany
Device: Bookeen Opus (i love that thing) and iPad (what an irony)
|
What about taking high-resolution photos of the pages, like the professional book scanners do? Then run a batch transform over the images to remove the distortion, and then do the OCR.
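The post doesn't say which transform is meant, so here is only a rough sketch of one possibility: a perspective (keystone) correction that maps the four corners of the photographed page onto a flat rectangle. It assumes the image is a plain list of rows of gray values and that you already know the corner positions; the function names are made up for illustration. It does not fix the curl of the page near the spine, which needs a more elaborate dewarping step.

```python
def solve_linear(a, b):
    """Solve a*x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("corner points are degenerate")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                for c in range(col, n + 1):
                    m[r][c] -= factor * m[col][c]
    return [m[i][n] / m[i][i] for i in range(n)]


def homography(src, dst):
    """Return 8 coefficients mapping each src point (x, y) to its dst point (u, v)."""
    a, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        a.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        b.append(u)
        a.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b.append(v)
    return solve_linear(a, b)


def apply_h(h, x, y):
    """Apply the projective transform h to the point (x, y)."""
    w = h[6] * x + h[7] * y + 1.0
    return ((h[0] * x + h[1] * y + h[2]) / w,
            (h[3] * x + h[4] * y + h[5]) / w)


def flatten_page(image, corners, out_w, out_h):
    """Resample the quadrilateral `corners` (top-left, top-right,
    bottom-right, bottom-left, in photo coordinates) into an
    out_w x out_h rectangle using nearest-neighbour sampling."""
    rect = [(0, 0), (out_w - 1, 0), (out_w - 1, out_h - 1), (0, out_h - 1)]
    h = homography(rect, corners)  # output pixel -> position in the photo
    max_x, max_y = len(image[0]) - 1, len(image) - 1
    rows = []
    for v in range(out_h):
        row = []
        for u in range(out_w):
            x, y = apply_h(h, u, v)
            xi = min(max(int(round(x)), 0), max_x)
            yi = min(max(int(round(y)), 0), max_y)
            row.append(image[yi][xi])
        rows.append(row)
    return rows
```

For a batch, you would call flatten_page once per photo with the same corner positions, which only works if the camera and the book stay put between shots (one more reason for a solid camera stand).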
|
06-22-2006, 08:16 AM | #4 |
Intentionally Left Blank
Posts: 172
Karma: 300106
Join Date: Feb 2006
Location: Royal Oak, MI, USA
Device: Nook STR
|
I'm sure you could get some help in the forum at the Distributed Proofreaders website. You may even want to run your projects through them, getting you an entire network of proofreaders.
Check it out at: www.pgdp.net |
09-28-2007, 08:34 AM | #5 |
Zealot
Posts: 118
Karma: 306
Join Date: Sep 2007
Device: Sony PRS-500 Archos 704 wifi
|
From Paper to Digital Books
See my thread "do-it yourself repro v-cradle for paper books" in Reader Accessories
|
09-28-2007, 09:44 AM | #6 |
Technogeezer
Posts: 7,233
Karma: 1601464
Join Date: Nov 2006
Location: Virginia, USA
Device: Sony PRS-500
|
There was a thread by Bob Russell about a scanner that was designed for bound books and let them hang over the corner of the scanner so that a page would lie flat. It seemed to work well. I will look again for the article.
|
09-28-2007, 12:45 PM | #7 | |
Gutenberger
Posts: 142
Karma: 700
Join Date: Jul 2007
Location: Lisbon, Portugal
Device: Cybook Gen 3
|
Quote:
I also suggest reading Project Gutenberg's Scanning FAQ. |
|
10-19-2007, 06:30 PM | #8 | |
Addict
Posts: 208
Karma: 575
Join Date: Oct 2006
Location: California
Device: Various Kindles, iPhone, iPad, Galaxy 10.1
|
Quote:
Any flatbed scanner is going to take longer to scan than an overhead setup like ereszet's (which is a setup I'm trying to recreate myself for a large book I have), but the OpticBook is the best low-cost flatbed solution I've found so far. |
|
01-22-2009, 03:17 PM | #9 |
Member
Posts: 20
Karma: 10
Join Date: Jun 2008
Device: Cybook Gen3
|
Alternative photographing options
You can also check out what we're doing at http://bkrpr.org. We have instructions for putting together a camera mount using cheap consumer digital cameras and a V-shaped book cradle like ereszet's. Actually, I need to look into his version; mine is pretty much cobbled-together wood.
|
02-26-2009, 05:01 PM | #10 |
Member
Posts: 16
Karma: 26
Join Date: Dec 2008
Device: Sony e-book
|
Haven't I seen hand-held scanners that you can run over the page? If so, it would be slow, but non-destructive.
Glenn Cornish |
03-18-2009, 06:52 PM | #11 |
Other
Posts: 143
Karma: 644
Join Date: Jan 2008
Location: Norway
Device: Cybook, Kindle
|
Does anyone have any clue what kind of magic I should ask my favourite image editor to perform in order to reduce the background noise of my scanned pages? I have tried working with saturation, hue, RGB channels, contrast, etc., and the result is better than the one straight from the scanner, but not as good as the Google Books scans. My improvements come by chance, since I am clueless at this. Does anyone have some general advice on the matter, or perhaps a link?
Of course, how best to proceed depends on a lot of factors, but there should be some general rules or principles on the matter. (I'm trying to use IrfanView, which has batch processing with advanced options. The point is to get the pages onto my Cybook in one piece without any OCR.) |
03-18-2009, 06:54 PM | #12 |
zeldinha zippy zeldissima
Posts: 27,827
Karma: 921169
Join Date: Dec 2007
Location: Paris, France
Device: eb1150 & is that a nook in her pocket, or she just happy to see you?
|
i don't know irfanview but in photoshop i would try adjusting the levels (select the text as black, and a slightly noisy area of the page as white), and contrast.
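For anyone wondering what the levels adjustment does to the numbers, here is a minimal sketch, assuming 8-bit gray values in a plain list. The function name is made up, and this is not Photoshop's actual implementation (its levels tool also has a midtone/gamma slider, which is left out here). Everything at or below the chosen black point becomes pure black, everything at or above the white point becomes pure white, and the values in between are stretched linearly.

```python
def apply_levels(pixels, black_point, white_point):
    """Stretch 8-bit gray values so that black_point maps to 0 and
    white_point maps to 255; values outside that range are clipped."""
    if white_point <= black_point:
        raise ValueError("white_point must be greater than black_point")
    span = white_point - black_point
    out = []
    for value in pixels:
        scaled = (value - black_point) * 255 / span
        out.append(int(round(min(max(scaled, 0), 255))))
    return out


# Text sampled at about 40, slightly noisy paper sampled at about 220:
print(apply_levels([30, 40, 85, 220, 240], black_point=40, white_point=220))
# -> [0, 0, 64, 255, 255]
```

Picking a slightly noisy area as the white point is what makes the background noise disappear: every paper pixel at least that bright ends up as the same pure white.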
|
03-19-2009, 02:57 PM | #13 |
Guru
Posts: 860
Karma: 4380
Join Date: Feb 2008
Location: Almada, Portugal
Device: Cybook Gen3, Sony PRS 505, Kindle DXG and Samsung Galaxy Note
|
The OpticBook 3600 is the solution: cheap, easy, efficient!
|
03-19-2009, 03:56 PM | #14 | |
Bookaholic
Posts: 14,391
Karma: 54969924
Join Date: Oct 2007
Location: Minnesota
Device: iPad Mini 4, AuraHD, iPhone XR +
|
Quote:
I've never scanned a book before, but I do use a scanner many times a day. Did you scan your pages as RGB, Grayscale, or Line Art (B&W)? I'm not sure which would be best; it might depend on your source book quality. It depends on what you mean by background noise, but have you tried messing with the image's levels (that's what Photoshop calls it, anyway)? Maybe that's what you meant by rgb-channels. Usually, if I have some speckles or something in the background (white) part of a scan, I can mess with the levels and get rid of them. Some software has a despeckle option that I'm told can be useful on some books, but it'll also get rid of punctuation a lot of the time. |
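To show why a despeckle option can eat punctuation, here is a small sketch of one way such a filter might work: erase every connected blob of ink smaller than a given number of pixels. This is an assumption for illustration only. Real programs may use other methods (a median filter, for instance), and the function name and the 0/1 page format are made up.

```python
from collections import deque


def despeckle(page, min_size):
    """page: rows of 0 (paper) and 1 (ink). Returns a copy in which every
    8-connected ink blob with fewer than min_size pixels is erased."""
    height, width = len(page), len(page[0])
    out = [row[:] for row in page]
    seen = [[False] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            if page[y][x] != 1 or seen[y][x]:
                continue
            # Flood-fill the blob that contains (y, x).
            blob, queue = [], deque([(y, x)])
            seen[y][x] = True
            while queue:
                cy, cx = queue.popleft()
                blob.append((cy, cx))
                for ny in range(cy - 1, cy + 2):
                    for nx in range(cx - 1, cx + 2):
                        if (0 <= ny < height and 0 <= nx < width
                                and page[ny][nx] == 1 and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            if len(blob) < min_size:
                for by, bx in blob:
                    out[by][bx] = 0
    return out


page = [
    [1, 1, 1, 0, 0, 0],  # 3x3 block on the left: part of a letter
    [1, 1, 1, 0, 1, 0],  # lone pixel on the right: a speck
    [1, 1, 1, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 1, 1],  # 2x2 block: a full stop
    [0, 0, 0, 0, 1, 1],
]
gentle = despeckle(page, min_size=2)  # removes only the lone speck
harsh = despeckle(page, min_size=5)   # removes the speck and the full stop
```

The filter cannot tell a full stop from a speck of dirt of the same size, so the size limit has to stay below the smallest punctuation mark in the scan.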
|
03-19-2009, 03:58 PM | #15 |
Grand Sorcerer
Posts: 5,185
Karma: 25133758
Join Date: Nov 2008
Location: SF Bay Area, California, USA
Device: Pocketbook Touch HD3 (Past: Kobo Mini, PEZ, PRS-505, Clié)
|
In most cases, it's best to scan books as line art. From there, you can play with the brightness & contrast settings (depending on the scanner) to get a better quality scan, and later use IrfanView or something like it if the pages need more editing.
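Line art means every pixel is either ink or paper, so the brightness setting effectively decides where the cut-off between the two lies. If you scan in grayscale instead and convert afterwards, the cut-off can be picked automatically. The sketch below uses Otsu's method for that, which is a choice made here for illustration and not something a scanner driver necessarily does; the function names are made up, and the input is assumed to be a flat list of 8-bit gray values.

```python
def otsu_threshold(pixels):
    """Return the gray level t that best separates dark pixels (<= t)
    from light pixels (> t) by maximising the between-class variance."""
    hist = [0] * 256
    for value in pixels:
        hist[value] += 1
    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))
    best_t, best_var = 0, -1.0
    n_dark, sum_dark = 0, 0
    for t in range(256):
        n_dark += hist[t]
        sum_dark += t * hist[t]
        n_light = total - n_dark
        if n_dark == 0:
            continue
        if n_light == 0:
            break
        mean_dark = sum_dark / n_dark
        mean_light = (sum_all - sum_dark) / n_light
        var_between = n_dark * n_light * (mean_dark - mean_light) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


def to_line_art(pixels, threshold):
    """Convert gray values to 1-bit line art: 1 = ink, 0 = paper."""
    return [1 if value <= threshold else 0 for value in pixels]


scan = [20, 25, 30, 200, 210, 220]  # three ink pixels, three paper pixels
line_art = to_line_art(scan, otsu_threshold(scan))
```

An automatic cut-off like this tends to struggle on yellowed or unevenly lit pages, where setting the threshold by hand per book may still give the better result.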
|