06-29-2010, 10:27 AM   #8
Lady Fitzgerald
Wizard
Posts: 2,013
Karma: 251649
Join Date: Apr 2010
Location: Tempe, AZ, USA, Earth
Device: JetBook Lite (away from home) + 1 spare, 32" TV (at home)
When you just scan a book and store it in a PDF, what you are storing are images of each page, much as if you had stored them as graphic files such as JPEGs. OCR will "read" those images and convert them to text but, since your computer is not as bright as you are, it will often "misread" the image and make errors. Then you have to go into the new text and correct those errors. Running a massive document like a book through an OCR program is very time consuming. Editing the results is even more time consuming.

Most e-book readers handle text better than images. Images are fixed objects that cannot be reflowed, so to make them fit on the screen of an e-ink reader, the readers will usually display them at a reduced size. With text, on the other hand, it is as if each character were an individual image that can be enlarged or shrunk and displayed sequentially across the screen. Most PDF e-books for sale store the content as text instead of whole-page images so the characters can be enlarged or reduced as needed. PDF text, based on what I've read on the MobileRead forums, does not always work well with e-book readers, so another format, such as EPUB, would be preferable for storing your text.
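
To put rough numbers on the difference, here is a minimal Python sketch. The page width, screen width, and type size are made-up figures for illustration only (not measurements of any particular reader), and the function names are hypothetical. It shows how shrinking a whole page image to fit a narrow screen shrinks the type along with it, while text can simply be re-wrapped to the width available.

```python
import textwrap


def scaled_font_size(page_width_in, screen_width_in, font_pt):
    """Effective type size when a whole page image is shrunk to fit the screen width."""
    # The image is never enlarged here, only reduced to fit.
    scale = min(1.0, screen_width_in / page_width_in)
    return font_pt * scale


def reflow(text, chars_per_line):
    """Re-wrap text so that no line is longer than the screen can show."""
    return textwrap.wrap(text, width=chars_per_line)


# Assumed figures: a 6-inch-wide page set in 10 pt type, shown on a 3.5-inch-wide screen.
print(scaled_font_size(6.0, 3.5, 10.0))

sample = "Text can be reflowed to fit any screen, while a page image can only be scaled."
for line in reflow(sample, 30):
    print(line)
```

With those assumed figures the page image ends up with type a bit under 6 pt, which is why scanned pages are hard to read on a small screen, while the reflowed text keeps whatever size you choose and just breaks into more lines.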

I'm doing almost exactly what you are doing except that, instead of leaving the books intact and scanning them (as I'm assuming you are), I'm cutting off the spines, then running the pages through an ADF (automatic document feeder) scanner, saving the scanned pages as a PDF, and destroying the original book (as if cutting off the spine hadn't already done so). Because of the sheer volume of books I have to do (over 1100 estimated), I simply do not have time to bother with OCR and the needed editing afterward, so I have opted not to. The only problem I have had with that is finding an affordable, suitably sized e-ink reader that can zoom the page images to a readable size. I haven't had much luck there but the technology keeps improving and the prices keep coming down so it's just a matter of time before I will be able to get the reader I need.

There is a possible legal issue with copying your books and giving them away. Depending on where you live, it is probably illegal. Most (but not all) legal jurisdictions' copyright laws will allow you to legally make a copy of a book for your own use for archival purposes. They will also allow you to change media (again, most but not all; I think the U.K. doesn't permit it). However, when you copy a book and then give away the original, you have essentially stolen the contents of the book because you have deprived the creator of the content (or the current copyright holder) of the potential income they could have derived had the recipient of the book paid for it. Giving away a book without retaining a copy of it is not the same as what you are proposing, since giving away the book merely transfers ownership, whereas copying the book and giving it away creates two books with two owners while the creator of the content has been compensated for only one. Even though the content of the copy is essentially an arrangement of 1s and 0s, it is still property (albeit now physically intangible). Making matters worse, copyright laws vary somewhat (or, occasionally, dramatically) between nations, have not kept up with advancing technology, and have been corrupted to go beyond the original intent, which was to protect the authors' interests, to protecting big corporations' essentially eternal stranglehold on new literature.

It's up to you to decide if you want to follow the letter of the law where you live, take the moral route and just not distribute copies, thus not essentially stealing, or do what you jolly well please.