|02-10-2010, 03:25 AM||#1|
Join Date: Jan 2010
How to scan a book and make a PDF book?
I just bought a Canon LiDE 200 scanner. Before I bought it, I thought it was the right scanner for me, as it had many good reviews. However, when I tried to scan my first book, I noticed that the image files were very large: about 1 MB a page! When I compressed or resized them, the images became so blurry that I couldn't read them. I also found that scanned images look very faint on the Kindle DX display, which many people have complained about. Can someone help me solve these problems?
1. How can I scan a book in good quality while keeping the file size acceptable? (Does anyone use the same scanner as me?)
2. How do I cut the scanned image? If I scan two pages at the same time, I would like to cut the image down the middle. Is there any software that can split the image in two?
3. How do I make the image clear on the DX display?
|02-10-2010, 06:11 AM||#3|
Join Date: Oct 2009
Device: Amazon Kindle 2 Intl
I think you want to run OCR on the scanned book to convert the page images to text.
|02-10-2010, 12:11 PM||#4|
Join Date: Aug 2007
Device: Sony PRS-500
I have tried Canon LiDE class scanners and found them too slow for book scanning.
If you can cut your books, you should check out the Fujitsu ScanSnap S510. It has an automatic document feeder that holds up to 50 pages. It scans 18 pages (36 images) per minute in duplex mode. You have the option to create a multi-page PDF file from the scan. It costs around $450.
If cutting is not an option, then you should look for a flatbed scanner, one that has a large scanning area so you can scan two pages at the same time. This will cut your scanning time in half. I currently use a Canon ImageRunner 8070, which costs more than $10,000. It has book-scanning features, but I don't really use them. All I need is a fast flatbed scanner, and it does the job fast with excellent quality.
If budget is not an issue, you might want to check out the Fujitsu FI-6240. It scans 60 pages (120 images) per minute in duplex mode. It has a flatbed size of 8.5" x 11.69". It costs about $1,886.03 on Amazon.
To answer your questions:
1) Black-and-white scanning is sufficient for an e-reader. 300 dpi is more than adequate. To reduce the file size, you can lower the scanning resolution. Each scanner has its own resolution steps. A high-end one like mine has 300, 400, and 600 dpi. Using any more than 300 dpi is a waste of space.
2) Once you have scanned the pages, you should use ABBYY FineReader to crop, split the images into two pages, run OCR, and convert them to PDF, HTML, or LIT. This will reduce the size of the book from 200-300 MB to maybe a 5 MB PDF. Crop cleans up the extra space around the book. Split page splits the image along the middle line. FineReader can apply these steps across all the files, so the whole process is fully automated. ABBYY FineReader is included with the Fujitsu ScanSnap S510.
3) By running OCR, you convert these images into text. This is the best way to make the text clear on the DX. If you prefer to use the images as they are, without text conversion, then scanning at a higher resolution would only help a bit. FineReader has the option to create a PDF with the "exact format" of the original. With the Kindle DX screen size, I would assume you would get the benefit of a "text" PDF and keep the same layout as the original scan.
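To make the "split page" step in answer 2 concrete, here is a minimal sketch of the idea in plain Python. It is not how FineReader does it internally; it only shows the geometry of cutting a two-page scan down the middle. The image is represented as a plain list of pixel rows, and the `gutter` parameter is a made-up illustration for trimming the dark band near the spine.

```python
def split_spread(rows, gutter=0):
    """Split a two-page scan into (left_page, right_page).

    rows   -- the image as a list of pixel rows (each row a list of values)
    gutter -- hypothetical parameter: pixels to discard on each side of the
              centre line, e.g. to drop the shadow near the spine
    """
    if not rows:
        return [], []
    width = len(rows[0])
    if any(len(row) != width for row in rows):
        raise ValueError("all rows must have the same width")
    mid = width // 2
    if gutter < 0 or gutter > mid:
        raise ValueError("gutter must be between 0 and half the width")
    # Everything left of the centre line (minus the gutter) is the left page,
    # everything right of it (minus the gutter) is the right page.
    left = [row[:mid - gutter] for row in rows]
    right = [row[mid + gutter:] for row in rows]
    return left, right


# Tiny 2x6 "scan": values 1-3 belong to the left page, 4-6 to the right page.
spread = [[1, 2, 3, 4, 5, 6],
          [1, 2, 3, 4, 5, 6]]
left, right = split_spread(spread, gutter=1)
print(left)   # [[1, 2], [1, 2]]
print(right)  # [[5, 6], [5, 6]]
```

A real tool also has to find where the spine actually is, since a book is rarely centred perfectly on the glass; this sketch simply assumes the spine sits at the exact middle of the image.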
|04-29-2010, 04:45 AM||#5|
Join Date: Jun 2009
Device: Nook touch, iPad, Xoom
My workflow is:
We have a Minolta (I think) photocopier that has a scanning option too. It scans as fast as it photocopies a sheet of paper. So I scan the whole book as a PDF file at 600 dpi. It usually creates a huge file. I take that PDF into Acrobat and export it as 600 dpi TIFF files. Each page is converted to a single TIFF file, so I usually end up with a few hundred TIFF files. I store all these TIFF files in a folder, import the folder into Scan Tailor, and that's about it. Scan Tailor takes care of all the things you mentioned. Once the processing is done, I export the processed images as 600 dpi TIFFs and assemble them into a PDF file.
So far this works fine, but I usually end up with quite a big PDF file. I understand that it doesn't have to be 600 dpi, but I just don't know what's best. Should I go with TIFF images, or would PNG also work? I don't know. I have tried running OCR on these huge PDFs and downsampling them to 150 dpi, but not all the PDFs turn out well, so I don't know if there is any yardstick for it.
I have not looked into the PDF file structure and how best to manipulate it so that the image quality is good enough at a very small file size. I am looking into these options and may modify my workflow accordingly.
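As a rough yardstick for the 600 dpi versus 150 dpi question, it can help to work out how much raw pixel data each setting produces before any compression. The sketch below is only arithmetic: the 6 x 9 inch page size is an assumption picked for illustration, and real TIFF, PNG, or PDF files will be much smaller once compressed.

```python
def raw_page_bytes(width_in, height_in, dpi, bits_per_pixel):
    """Uncompressed size in bytes of one scanned page."""
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    return width_px * height_px * bits_per_pixel // 8


PAGE = (6.0, 9.0)  # assumed page size in inches, for illustration only
MODES = (("1-bit B&W", 1), ("8-bit gray", 8), ("24-bit color", 24))

for dpi in (150, 300, 600):
    for label, bits in MODES:
        size = raw_page_bytes(*PAGE, dpi, bits)
        print(f"{dpi} dpi, {label}: {size / 1_000_000:.2f} MB")
```

The point to take away is that doubling the resolution quadruples the pixel count, so a 600 dpi page carries four times the raw data of a 300 dpi page, and color carries 24 times the raw data of 1-bit black and white at the same resolution.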
|05-06-2010, 08:40 AM||#6|
Join Date: Sep 2007
Device: iphone 4s, ipad 3, kindle touch,kindle PW, kobo mini, kobo aura
Hi, I know this is a couple of months old, but here is my two pennies' worth. Up until yesterday I used an Epson u7000, which I know isn't the same one as the OP's, but it's along the same lines.
I have a copy of OmniPage 15, which I got from eBay. I scan straight into OmniPage, double pages, then go through the pages and mark up what I want to OCR, then run OCR on the file and save to .rtf. I open the file in Word, do a quick look over, and check the formatting. Then it goes into Calibre to make the book. It might not be perfect, but it is readable. I would also like to point out that this is a learning curve: you work out better ways to do this as you go along. The books I am producing now are miles better than the first books I scanned.
Just thought I'd give you another option to ABBYY FineReader. I haven't used it, and I am quite happy with OmniPage 15. I've also just bought an OpticBook 3600; great scanner so far.
|05-26-2010, 07:04 AM||#7|
Join Date: Jul 2009
Device: u820, asus t91, eee pc
Thought I would chime in with the method I have come up with...
Scan the pages on a flatbed scanner to JPEG, at 150 dpi or 300 dpi depending on the book.
Odd pages first, then even pages, if I cannot scan them two at a time... keep them in separate folders.
Use Advanced JPEG Compressor to batch process all the files on the reduce-file-size settings... this cuts the file size down big time for me without noticeable loss of quality.
Then I use pdfmerger to merge the odd pages with the even ones (GREAT LITTLE PROGRAM).
Use Adobe Acrobat Professional to OCR the file (save as a separate file... I just add OCR at the end of the filename).
Then use the Adobe Acrobat Professional reduce file size option (I prefer this over the optimize function because it seems to keep quality higher with a similar file size reduction). Same thing: add REDUCED at the end of the filename, so now it says... blah ocr reduced.pdf.
Then I compare all the files by quality, size, etc...
original merged scan... the biggest file size
ocr plus reduced...
Sometimes there is little difference between the OCR and reduced files... so I may keep the OCR file if I am only saving a few MB... but most of the time I get enough file size reduction to keep the OCR reduced file...
I've only been doing this for a little while... but I've found this works well for me and for what I have for PDF ebooks...
I am looking into making my own ebook scanner setup, with a digital camera on a tripod and a glass cover, instead of using the flatbed scanner (saw a few tutorials on making one and it looks promising)... seems like you can get great quality in less time...
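For anyone scanning odd and even pages into separate folders as described above, the merge step boils down to interleaving two ordered lists. This is a minimal sketch of that idea in Python, not what pdfmerger actually does; the file names are hypothetical, and it assumes both lists are already sorted in reading order.

```python
from itertools import zip_longest


def interleave_pages(odd_pages, even_pages):
    """Return pages in reading order: odd[0], even[0], odd[1], even[1], ...

    The odd list may be one longer than the even list (a book whose last
    scanned page is an odd one); any other mismatch is treated as an error.
    """
    if not 0 <= len(odd_pages) - len(even_pages) <= 1:
        raise ValueError("odd and even page counts do not match up")
    merged = []
    for odd, even in zip_longest(odd_pages, even_pages):
        merged.append(odd)
        if even is not None:
            merged.append(even)
    return merged


# Hypothetical file names, one folder per scanning pass.
odd = ["odd/001.jpg", "odd/002.jpg", "odd/003.jpg"]
even = ["even/001.jpg", "even/002.jpg"]
print(interleave_pages(odd, even))
```

If the even pages were scanned back to front (for example by flipping the whole stack over), the even list would need to be reversed before interleaving.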
|10-28-2011, 06:35 PM||#8|
Join Date: Oct 2011
Location: North Wales, UK
Device: iPod Touch, Kindle
I am considering trying the whole scanning thing. Some sound advice here for a total boob like me.
|03-24-2013, 03:06 PM||#10|
Join Date: Sep 2010
Device: prs-t1, phone/Cool Reader, tablet/BlueFire, Nook Simple
I get the impression that using a half-decent camera is quite popular for book scanning. The Internet Archive seems to use cameras quite a lot. I have experimented a bit myself; the main problem is getting the book page flat enough that the lines don't curve.
For images, it's simpler to use a scanner, but for text a camera should do, and it is certainly much quicker.