Old 05-13-2006, 11:38 AM   #1
Alexander Turcic
Fully Converged
Posts: 18,163
Karma: 14021202
Join Date: Oct 2002
Location: Switzerland
Device: Too many to count here.
Scanning books from your own library

Branko of Teleread came up with some interesting statistics suggesting that, unlike distributed proofreaders, many of us would love to digitize our own personal libraries.

If you've ever tried to scan a full-length book without having access to a high-end $150k+ scanner, you'll understand why professional proofreaders who deal with books every day are not so fond of the idea of scanning their own content. Manual scanning and OCR'ing is a pain, since both tasks are time-consuming and usually prone to errors. Now, as many of you know, Google is working with various major libraries to digitally scan books from their collections so that users worldwide can search them online. But don't expect some poor first-year student to sit all day and night in front of a low-cost scanner flipping pages. These libraries have access to fully automated page-turning and scanning devices that produce high-quality digital images of bound materials (nondestructively) at throughput rates as high as 2400 pages per hour.

It'd be great if one day you could just visit a Kinko's outlet and rent a Kirtas scanning device for a short period of time. Only then would I be willing to turn my dusty library into a bunch of e-books.
Old 05-13-2006, 12:23 PM   #2
Liviu_5
Books and more books
Posts: 917
Karma: 69499
Join Date: Mar 2006
Location: White Plains, NY, USA
Device: Nook Color, Itouch, Nokia770, Sony 650, Sony 700(dead), Ebk(given)
Hi,

My experience with scanning is as follows:

- OpticBook 3600 scanner (~$250) with decent OCR included (ABBYY)
- I scan double pages at 300 dpi, b&w, as TIF or PBM (mostly TIF, but sometimes PBM is easier to manipulate)
- I do 10 pages (5 double-page sheets) per minute for hardcovers/trade paperbacks, 14 pages per minute for paperbacks, and just watch a movie on my portable DVD player while scanning
- the PC does the OCR in about 20-30 minutes per book, and I just send the result to Word and then to text, since that eliminates most strange characters
- since everything is for my personal use, I do not bother correcting; the software is good enough for the results to be nicely readable (once you get used to several quirks, like "die" instead of "the" sometimes)
I have done maybe 20 books and read about 5 fully on my Nokia 770 or eBookwise 1150, and parts of others. You can also do picture books and embed the scanned pages (maybe converted to JPG) in HTML to read on a PC/tablet with uBook, though I rarely do it since I do not like reading fiction on a PC/tablet/laptop.
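
For a rough feel of what those rates mean per book, here is a minimal Python sketch. The 10 and 14 pages-per-minute rates and the 20-30 minute OCR time come from the list above; the 400-page book length and the function names are only illustrative assumptions.

```python
def scan_minutes(page_count, pages_per_minute):
    """Minutes at the scanner for a book, assuming a steady scanning rate."""
    if pages_per_minute <= 0:
        raise ValueError("pages_per_minute must be positive")
    return page_count / pages_per_minute


def total_minutes(page_count, pages_per_minute, ocr_minutes):
    """Scanning time plus unattended OCR time, in minutes."""
    return scan_minutes(page_count, pages_per_minute) + ocr_minutes


# Hypothetical 400-page book; 30 minutes is the upper end of the OCR estimate.
hardcover = total_minutes(400, 10, 30)   # 10 pages/min for hardcovers
paperback = total_minutes(400, 14, 30)   # 14 pages/min for paperbacks
print(f"hardcover: about {hardcover:.0f} min, paperback: about {paperback:.0f} min")
```

Only the scanning part needs a person at the scanner; the OCR part runs on its own.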

Hope this helps,

Liviu
Old 05-14-2006, 06:07 AM   #3
Moonraker
Addict
Posts: 314
Karma: 1002965
Join Date: Mar 2006
Location: UK
Device: ILiad. Gen 3, PocketBook 360, Kobo Aura HD, Kindle Oasis 2
I have scanned over 250 books from my personal library.

Eyesight problems are my main reason for doing this. I can create an eBook with a larger font size which makes for easier reading on my eBookwise than from the original paper book.

I use an OpticBook 3600 or a Canon LiDE 60 to scan two pages at a time into ABBYY FineReader. After editing in FineReader for spelling and scanning errors, I send the pages to Word. It is in Word that I set a larger font size, apply other special formatting for chapter headings etc., and remove page numbers. I save the file as an .rtf file and then convert this to the .imp format required by the eBookwise.

I never save in .txt format because all formatting, such as bold and italics, is lost. Italics in particular are necessary to follow the storyline in some novels because they often represent thought, telepathy, etc. Project Gutenberg overcomes this by using all upper-case letters for emphasis, but I find this distracting.

So, time-consuming, yes, but I can usually manage to produce a finished ebook in less than a day, and I also find the work very therapeutic and rewarding. This process means that when browsing in my local bookstore I don't have to put most of what interests me back on the shelf because I can't read the text.
Old 05-14-2006, 06:25 PM   #4
rmeister0
Addict
Posts: 270
Karma: 298
Join Date: Mar 2005
I just scan the images into a PDF. It takes a lot less time, and OCR errors really bug me for some reason.
Old 05-14-2006, 09:22 PM   #5
Bob Russell
Recovering Gadget Addict
Posts: 5,381
Karma: 676161
Join Date: May 2004
Location: Pittsburgh, PA
Device: iPad
Quote:
Originally Posted by rmeister0
I just scan the images into a PDF. It takes a lot less time, and OCR errors really bug me for some reason.
I was wondering myself about how many people just scan the pages and don't bother with OCR because of mistakes. I've tried reading some OCR'd books, though, and it didn't really cause me any trouble because once you get used to the types of mistakes it's pretty easy to figure out what the text was supposed to have been... it's the same kinds of letters and letter pairs that get confused all the time.

The main problem I see with scanning to pdf without OCR is if you want to read on a small screen device or if you need small file sizes. It just wouldn't seem to be useful for mobile reading unless you are using a laptop. Even the new UMPCs might be too small for a scanned book, wouldn't they?
Old 05-15-2006, 03:52 AM   #6
Alexander Turcic
Fully Converged
There's an in-depth article on (the need for) book scanning in the NYT today:

http://www.nytimes.com/2006/05/14/ma...gewanted=print

"In a regime of superabundant free copies, copies lose value. They are no longer the basis of wealth. Now relationships, links, connection and sharing are. Value has shifted away from a copy toward the many ways to recall, annotate, personalize, edit, authenticate, display, mark, transfer and engage a work. Authors and artists can make (and have made) their livings selling aspects of their works other than inexpensive copies of them. They can sell performances, access to the creator, personalization, add-on information, the scarcity of attention (via ads), sponsorship, periodic subscriptions -- in short, all the many values that cannot be copied."
Old 05-15-2006, 12:52 PM   #7
Liviu_5
Books and more books
Hi,

If you do not (or cannot due to formulas/diagrams) OCR, you can read the images directly with your favourite slideshow software, or embed them in a blank html and use uBook or your favourite pc software reader.
PDFs take less space, true, but until we get a portable reader that can display them properly (no scrolling or zooming necessary, one PDF page per device screen; by portable I mean something I can use one-handed and without a mouse/pen), size does not really matter. In all of the above ways you read one image at a time, so speed is not an issue, and it actually uses less memory than reading a PDF; you just need enough hard drive space for the images.
This is how I read selected PDFs on my Nokia 770: I cut the pages (through djvudigital and ddjvu) in half (portrait) or in 4 (landscape double-page scan), make sure that each image is 800x480, and use a lower quality setting in pnmtojpeg to get a manageable size (~40 KB/image or 80 KB/page), since the Nokia screen is good enough. The result is very nicely readable and very fast, since FBReader gets one image at a time, though I lose navigation except page by page. But it is worth it, since even with Evince PDFs are slow and you need scrolling and so on...
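
The cutting step is mostly arithmetic, so here is a minimal Python sketch of just the geometry. It is not the djvudigital/ddjvu/pnmtojpeg scripts mentioned above. The function names and the 2550x3300 page size (a letter page at 300 dpi) are illustrative assumptions, and unlike a hard resize to exactly 800x480 this version keeps the aspect ratio.

```python
def tile_boxes(width, height, cols, rows):
    """Split a page image into a grid of crop boxes.

    Returns (left, upper, right, lower) tuples, row by row, which is the
    box convention most imaging tools use for cropping.
    """
    boxes = []
    for r in range(rows):
        for c in range(cols):
            left = c * width // cols
            right = (c + 1) * width // cols
            upper = r * height // rows
            lower = (r + 1) * height // rows
            boxes.append((left, upper, right, lower))
    return boxes


def fit_within(width, height, max_width, max_height):
    """Largest size with the same aspect ratio that fits the screen."""
    scale = min(max_width / width, max_height / height)
    return max(1, round(width * scale)), max(1, round(height * scale))


# Portrait page cut in half (top and bottom), sized for an 800x480 screen.
# 2550x3300 is a hypothetical scan size, not a figure from the post.
for left, upper, right, lower in tile_boxes(2550, 3300, cols=1, rows=2):
    print((left, upper, right, lower),
          fit_within(right - left, lower - upper, 800, 480))

# A landscape double-page scan cut in 4 would use cols=2, rows=2.
```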
Whenever you have a fast HTML reader that takes embedded images, and enough storage, this method works nicely as long as you cut to screen size and the result is readable (even on the eBookwise it works for most scans, with cutting in half and resizing to 318x448). Of course, I would rather read the PDF directly and not have to write the scripts to cut and so on...
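
As a rough illustration of the "blank HTML with embedded images" idea, here is a minimal Python sketch that writes one image per paragraph. The file names and the `build_html` helper are hypothetical, and how a given reader (uBook, FBReader, the eBookwise converter) handles the result is not something this sketch can promise.

```python
from html import escape


def build_html(image_names, title="Scanned book"):
    """Return a bare HTML page that shows the given images in order."""
    parts = [
        "<html>",
        "<head><title>%s</title></head>" % escape(title),
        "<body>",
    ]
    for name in image_names:
        safe = escape(name, quote=True)
        # One image per paragraph so the reader pages through them in order.
        parts.append('<p><img src="%s" alt="%s"></p>' % (safe, safe))
    parts += ["</body>", "</html>"]
    return "\n".join(parts)


# Hypothetical file names for two half-page images cut from page 1.
page = build_html(["p001a.jpg", "p001b.jpg"], title="My scanned book")
with open("book.html", "w", encoding="utf-8") as handle:
    handle.write(page)
```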
We will have to see, but I think that the iLiad may be able to display a portrait PDF scan nicely, though not a landscape scan, while the Sony Reader will not be able to do that due to its lower resolution. It may read "reflowable" PDFs, but not scans.

Liviu


Quote:
Originally Posted by Bob Russell
The main problem I see with scanning to pdf without OCR is if you want to read on a small screen device or if you need small file sizes. It just wouldn't seem to be useful for mobile reading unless you are using a laptop. Even the new UMPCs might be too small for a scanned book, wouldn't they?
Old 05-17-2006, 11:21 AM   #8
Steven Lyle Jordan
Grand Sorcerer
Posts: 8,478
Karma: 5171130
Join Date: Jan 2006
Device: none
I've wondered myself if anyone else has tried to improve OCR by using a two-step scanning process... that is, photocopy-enlarging the pages to letter size, then doing the scan and OCR. This has worked for me on small article scans, but I've never gone through the trouble for an entire book.

(Frankly, my head would blow up if I considered digitizing my entire library, and it's not that big!)
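
To put a number on why enlarging first might help, here is a back-of-the-envelope Python sketch. It assumes an ideal enlargement that loses no detail, which a real photocopier will not quite deliver, and the 4.25 x 6.875 inch paperback page size is only an illustrative assumption.

```python
def enlargement_to_fit(page_w, page_h, target_w=8.5, target_h=11.0):
    """Largest enlargement factor that still fits the target sheet (inches)."""
    return min(target_w / page_w, target_h / page_h)


def effective_dpi(scan_dpi, enlargement):
    """Scanner samples per inch of the ORIGINAL page after enlarging."""
    return scan_dpi * enlargement


# Hypothetical paperback page enlarged to fit a US letter sheet.
factor = enlargement_to_fit(4.25, 6.875)
print(f"enlargement factor: {factor:.2f}")
print(f"effective resolution at 300 dpi: {effective_dpi(300, factor):.0f} dpi")
```

In other words, scanning the enlargement at 300 dpi samples the original text more finely than scanning the book directly at 300 dpi, provided the photocopy step itself does not blur the print.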
Old 05-19-2006, 07:23 AM   #9
Gavin
Junior Member
Posts: 6
Karma: 10
Join Date: May 2006
Device: TungstenE
Sounds like a useful service would be one where you could post a book away and have it scanned and proofed into a format of your choosing.

Gav
Old 05-23-2006, 06:45 PM   #10
Steven Lyle Jordan
Grand Sorcerer
That sounds like a service that would have to be funded by a non-profit of some kind, because I can't imagine it ever being a profitable venture.

Eh?

Any non-profits interested? Speak up...
Old 06-14-2006, 03:03 PM   #11
ath
Addict
Posts: 222
Karma: 110
Join Date: Jun 2006
Location: Malmo, Sweden
Device: iLiad, Sony PRS-505, Kindle Paperwhite & Oasis
Quote:
Originally Posted by Steve Jordan
I've wondered myself if anyone else has tried to improve OCR by taking a 2-step scanning process...
That, I think, depends on what quality you get 'raw' from the scanner. If the scanner is clunky and produces uneven results in low resolution, it probably would work. I've done it for books printed on bad paper or with uneven press-work.

However, with a reasonably modern scanner, capable of real 300 dpi resolution, and OCR software with the functionality of, say, FineReader 8, you don't need it. You'll need to check thresholding levels (unless you go for greyscale) before you start working, and you may have to check for light levels drifting as the scanner gets warm, but apart from that it's rather plain sailing.

In higher resolution and with good print work, the problem more or less goes away. I've done 600dpi work, and had something like one misread per two pages with only one or two pages of training beforehand.
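
For anyone wondering what "checking thresholding levels" amounts to when scanning in b&w, here is a minimal Python sketch of one common way to pick a level automatically, Otsu's method. This is a swapped-in textbook technique for illustration, not a description of what FineReader or any scanner driver does internally, and the sample pixel values are made up.

```python
def otsu_threshold(pixels):
    """Pick a grey level (0-255) separating dark ink from light paper.

    Tries every level and keeps the one that maximises the between-class
    variance of the two groups (Otsu's method).
    """
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1
    total = sum(hist)
    if total == 0:
        raise ValueError("no pixels given")
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_level, best_variance = 0, -1.0
    dark_count, dark_sum = 0, 0
    for level in range(256):
        dark_count += hist[level]
        dark_sum += level * hist[level]
        light_count = total - dark_count
        if dark_count == 0:
            continue
        if light_count == 0:
            break
        dark_mean = dark_sum / dark_count
        light_mean = (sum_all - dark_sum) / light_count
        variance = dark_count * light_count * (dark_mean - light_mean) ** 2
        if variance > best_variance:
            best_level, best_variance = level, variance
    return best_level


def binarize(pixels, level):
    """Map greyscale pixels to 0 (ink) or 255 (paper)."""
    return [0 if p <= level else 255 for p in pixels]


# Made-up sample: some dark ink pixels on mostly light paper.
sample = [20] * 50 + [230] * 200
level = otsu_threshold(sample)
print(level, binarize([20, 128, 230], level))
```

If the scanner lamp drifts as it warms up, the paper pixels get lighter or darker over time, which is why a level chosen on the first page may need rechecking later in the book.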
Old 06-14-2006, 08:08 PM   #12
Snappy!
Addict
Posts: 260
Karma: 4256
Join Date: Feb 2006
Device: SHARP Zaurus C1000
Quote:
Originally Posted by Steve Jordan
That sounds like a service that would have to be funded by a non-profit of some kind, because I can't imagine it ever being a profitable venture.

Eh?

Any non-profits interested? Speak up...
Au contraire... actually I think that *may* just be what is needed to spur the ebook movement... though I foresee book authors clamping down on it.

Digitization service ... hmmm ...
Old 06-14-2006, 09:29 PM   #13
Steven Lyle Jordan
Grand Sorcerer
You think? How much would you be willing to spend to have your $6 book scanned and digitized? $10? $50? $100?

How much do you think you as the scanner would have to charge, to make it worth your while in equipment, time, manpower, etc? $100? $50? $10?

I think that, unless the process becomes much more automatic, faster, and more dependable, few customers will be willing to pay the amount vendors would ask to do the work.
Old 06-16-2006, 12:28 AM   #14
Snappy!
Addict
Quote:
Originally Posted by Steve Jordan
You think? How much would you be willing to spend to have your $6 book scanned and digitized? $10? $50? $100?

How much do you think you as the scanner would have to charge, to make it worth your while in equipment, time, manpower, etc? $100? $50? $10?

I think that, unless the process becomes much more automatic, faster, and more dependable, few customers will be willing to pay the amount vendors would ask to do the work.
You are right on the $6 book. But there are many books, e.g. tech reference books, that cost $60~$100 or even more and could do with such a service... but yes, it's still a niche market...