Old 05-20-2011, 05:30 PM   #6
UpSpin
Device: Sony PRS-650
Worklog, from a physical book to a finished eReader optimized PDF

5. Worklog, from a physical book to a finished eReader optimized PDF

So here we go: a short explanation of how to digitize a physical book so that it is perfectly readable on the Sony eReader.

Scan the book
I use a special scanner for books, the Plustek OpticBook 3600 Plus. It’s an ordinary flatbed scanner with the advantage that it has a very narrow border on one edge, … Because that makes it special, you pay a premium for it. But you buy a scanner only once, and if you plan to work with digitized books, it’s worth considering.
You can also use a normal flatbed scanner; just make sure that you press down on the spine of the book so there’s no distortion.
I don’t recommend the super-flat LED scanners. They work fine as long as the object lies flat on the glass, which isn’t the case with a book. As soon as the object is even a small distance away from the glass, they no longer work.
If you use a camera, it is more difficult to get good results, because the postprocessing needs additional tools to remove the distortion.
Just select 300 DPI and start scanning. I prefer to scan each page to its own image file first. That way I can postprocess the pages with whatever program I want, in whatever way I want.
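
To get a feeling for what 300 DPI means per page, here is a rough sketch of the arithmetic. The page size (170 mm x 240 mm) and the 8-bit greyscale mode are only assumptions for illustration, not values from my own scans.

```python
def scan_size(width_mm, height_mm, dpi=300, channels=1):
    """Return (width_px, height_px, uncompressed_bytes) for one scanned page.

    channels=1 assumes 8-bit greyscale; use 3 for 24-bit colour.
    """
    mm_per_inch = 25.4
    width_px = round(width_mm / mm_per_inch * dpi)
    height_px = round(height_mm / mm_per_inch * dpi)
    return width_px, height_px, width_px * height_px * channels


# Hypothetical example page: 170 mm x 240 mm, greyscale, 300 DPI.
w, h, size = scan_size(170, 240)
print(w, h, round(size / 1e6, 1), "MB uncompressed")
# prints: 2008 2835 5.7 MB uncompressed
```

So a few hundred pages quickly add up to more than a gigabyte of raw image data, which is fine at this stage because the files only need to exist until the final PDF is built.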

Postprocessing the pages
I use Adobe Photoshop, but free tools should be sufficient too. We just need to adjust the white and black points and maybe increase brightness and contrast, all done in a batch process. This is a very important step, because only that way do you get perfect OCR results and true blacks and whites on the eReader.
As an additional step you can already crop the pages here (I usually do), but this can also be done later in the PDF.
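
For those who want to see what the white and black point adjustment does to the pixels, here is a minimal sketch in plain Python. It is not what Photoshop does internally, just the basic linear stretch; the function name and the default points (40 and 215) are made up for illustration and have to be chosen per book.

```python
def stretch_levels(pixels, black=40, white=215):
    """Map grey values linearly so that `black` becomes 0 and `white` becomes 255.

    Values darker than `black` or brighter than `white` are clipped.
    """
    if white <= black:
        raise ValueError("white point must be above black point")
    scale = 255 / (white - black)
    return [min(255, max(0, round((p - black) * scale))) for p in pixels]


# One row of 8-bit grey values: dark ink, mid grey, greyish paper.
row = [30, 40, 128, 215, 240]
print(stretch_levels(row))
# prints: [0, 0, 128, 255, 255]
```

The greyish paper (215 and above) ends up pure white and the faded ink (40 and below) pure black, which is exactly what the OCR and the eReader screen benefit from.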

Creating a PDF file
Just create a PDF file out of the images. I recommend the best quality setting, to keep the 300 DPI and keep compression artifacts to a minimum. We will OCR the file later, so file size doesn’t matter yet.

Crop the PDF / optimize margin
Remove unneeded white or black space, if possible down to a size that best fits the eReader screen. If you want to get rid of the useless grey margin, add margin by changing the content size of the PDF page.
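
The cropping idea can be sketched like this: find the box around everything that is darker than the paper, add a small margin, and convert pixels to PDF points (1 point = 1/72 inch). This is only an illustration in plain Python on a tiny fake page. The function names, the threshold, and the margin are assumptions, and it only handles white borders, not black scanner edges.

```python
def content_box(page, threshold=128):
    """Return (left, top, right, bottom) in pixels around all pixels darker
    than `threshold`, or None if the page is blank. `page` is a list of rows."""
    rows = [y for y, row in enumerate(page) if any(p < threshold for p in row)]
    if not rows:
        return None
    cols = [x for x in range(len(page[0]))
            if any(row[x] < threshold for row in page)]
    return cols[0], rows[0], cols[-1] + 1, rows[-1] + 1


def add_margin(box, margin_px, width, height):
    """Grow the box by margin_px on every side without leaving the page."""
    left, top, right, bottom = box
    return (max(0, left - margin_px), max(0, top - margin_px),
            min(width, right + margin_px), min(height, bottom + margin_px))


def px_to_points(px, dpi=300):
    """Convert a length in scan pixels to PDF points."""
    return px * 72 / dpi


# Tiny fake page, 6 x 5 pixels: 255 = paper, 0 = ink.
page = [[255] * 6 for _ in range(5)]
page[1][2] = 0
page[3][3] = 0

box = content_box(page)                  # (2, 1, 4, 4)
box = add_margin(box, 1, width=6, height=5)
print(box)                               # prints: (1, 0, 5, 5)
print(px_to_points(300))                 # prints: 72.0
```

On a real scan you would run this on the pixel data of each page and use the resulting box, converted to points, as the crop area in the PDF.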

OCR
In Adobe Acrobat, select ClearScan and set the image resolution to 300 DPI. Afterwards the PDF file is tiny, the pages are searchable, and the text has been replaced with a vector-based font.

Structure
Now we are almost done. You can add an index so that navigation is easier on the eReader.

eReader
Copy the file to the eReader, open it, change the view settings to Custom with Brightness -40 and Contrast +40, select landscape mode, and enjoy your PDF.

Summary
You may think that this is a lot of work, but keep in mind that it’s not intended for ordinary books. It's intended for science books used by students or other people who work with such a book for several weeks, months, or years. For a normal book, which you read for a few days and then ‘throw away’, it’s not worth the trouble, and for those you can usually buy an official eBook anyway.

Last edited by UpSpin; 05-20-2011 at 06:00 PM.