MobileRead Forums > E-Book Formats > Workshop
04-26-2010, 12:28 PM   #1
NVash
Wandering Vagabond
Posts: 282
Karma: 350000
Join Date: Apr 2010
Device: iPod Touch
Please HELP! Trying To Scan A Book V2!

I'm new here and very uneducated about scanners and everything else. My area of expertise is video games, so I'm a fish out of water. Nonetheless, I do a lot of reading on my iTouch and Droid. I have a few older books that I'd like to scan and have tried to, but had no luck. I use a MacBook Pro and have an HP scanner/printer. I'm not quite sure of the make and model, but it's about two or three years old. On my MacBook the apps I can open are HP Photosmart 3100 Series, HP Photosmart Studio and HP Device Manager. Usually I try to scan through Photosmart Studio; I don't know if that matters or not. But as I've said many times, using an HP scanner on a MacBook could be my first problem.

See, the Droid uses a program called Aldiko, and Aldiko only uses EPUB. I use Calibre to convert PDF to EPUB. Stanza for the iTouch does LIT or EPUB. I prefer to have the same book on both devices because I don't always carry my iTouch but I always carry my cell phone. I just remember what chapter I'm on and continue; that's the way I'd prefer to do it.

Frankly, if at all possible, I'd prefer to scan it in LIT format. I don't know if that is possible; that's Microsoft and I use a MacBook Pro. When I scan, though, I try to do it in PDF, since then I can immediately put it through Calibre, convert it and put it on both devices. It never works. I have a few problems and some questions.
1) The pages come out ridiculously huge and slightly yellow, with a very large file size. I may end up with a file that's a GB or so for a book that's only 200 pages.
2) As I'm scanning I'd also like to add JPEGs but have no idea how. I have a friend who's an author; she recently put out a book but doesn't know how to scan. I'd like to make her an ebook version to sell and add all sorts of bonus content, such as pics from the movie, the musical and even her new book if at all possible, but I can't find the option to add pics to the file.
3) I actually tried to scan a full script once. With the book I only got to page 3 before I quit. With the script I made it to page 50, then I got an error message, the program shut down and none of it saved, so I had to start all over again. That happened three times and then I quit. I think I had gone through HP Photosmart Studio then as well.

Now my questions...
1) How do I scan these books?
2) Do the books need to be taken apart, or can I actually keep them in good condition?
3) Exactly how long might it take per book?
4) What format should I scan them in so that they can be converted to LIT and EPUB easily?
5) I have no idea where the original software that came with my printer is; I have since moved and may have lost it. Would that be a problem?
6) I've read about OCR. What's that?

Can someone please help? I can't even begin to say how frustrating this is. I was directed here after posting this same thread over in the iPod forum, and was also pointed to this thread...
http://www.mobileread.com/forums/showthread.php?t=9780
... which I'm currently reading, but any more help would be greatly appreciated.
04-26-2010, 04:12 PM   #2
wayrad
Fanatic
Posts: 547
Karma: 1121392
Join Date: May 2008
Location: USA
Device: Galaxy Nexus
It sounds like you need a basic understanding of the process before you worry too much about details of equipment and the final destination filetype.

OK, what you need to do, in brief, is 1) get pictures of your book pages - this is where the scanner comes in, and is probably 5% or less of the work, 2) "recognize" the letters and words in the image (OCR) and convert them to a text format that can be edited, searched, "read" by other programs, and converted to other text formats, 3) clean up, spellcheck, and proofread the file, 4) convert the file to your final format of choice, and 5) go back and fix all the problems that you missed before or that just popped up (this is practically inevitable).

You're at Step 1, and giving that file to Calibre (Step 4) is sort of like trying to feed a man a picture of a sandwich.

As far as specific tools, only a sheetfeed scanner requires removal of the binding. Your other options are a flatbed scanner (there is a specialized flatbed for books called the Opticbook, but ordinary flatbeds work), or digital photography. You can scan with whatever software your scanner came with and save page images in any format your OCR software will accept. You may have gotten an OCR package with your scanner - some even come with a basic Abbyy Finereader version called Sprint. If not, you'll need to buy an OCR package (I recommend Finereader). Once it has "recognized" the text, Finereader can save it to numerous formats, but quite a few of us like to save to Word because of its excellent search-and-replace capabilities; you can also use its spellchecker instead of FineReader's if you prefer. For final conversion, you've already discovered Calibre, and there are other specialized tools out there too.
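
As a rough illustration of the search-and-replace cleanup mentioned above, here is a minimal Python sketch of a few mechanical fixes that are often applied to raw OCR text before proofreading. The function name `clean_ocr_text` and the sample text are made up for this example; it is not a description of what Finereader or Word do internally.

```python
import re


def clean_ocr_text(text: str) -> str:
    """Apply a few mechanical fixes to raw OCR output (hypothetical helper)."""
    # Rejoin words split by a hyphen at a line break: "stor-\nmy" -> "stormy".
    text = re.sub(r"(\w)-\n(\w)", r"\1\2", text)
    # Turn single line breaks inside a paragraph into spaces, but leave
    # blank lines (paragraph breaks) alone.
    text = re.sub(r"(?<!\n)\n(?!\n)", " ", text)
    # Collapse runs of spaces or tabs left over from the page layout.
    text = re.sub(r"[ \t]{2,}", " ", text)
    return text.strip()


# Made-up sample of what OCR output from a printed page can look like.
raw = "It was a dark and stor-\nmy night.  The rain\nfell.\n\nChapter 2"
print(clean_ocr_text(raw))
```

Blind de-hyphenation will also join words that really are hyphenated (a "well-" at the end of one line followed by "known" on the next becomes "wellknown"), so a pass like this only reduces the proofreading load; it does not replace it.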

Exact details of formats, software packages, and workflow details vary greatly from one person to another, so this is a very rough guide. It is important to remember that scanning is the easiest part of the job - some equipment is faster than others, and/or may give fewer OCR errors due to superior image quality, but there will be errors, and they will require painstaking, nitpicking, laborious proofreading. I can produce a book a week if I spend all my spare time on it, and even then it's not good enough to show anyone else, even if copyright law permitted.

Hope this helps.

P.S. One thing that may be causing confusion is that there are such things as "searchable PDFs", which contain information about the actual letters and words represented. That's usually because the PDF was made from a file that already had the information. With a nonsearchable PDF, your computer has no way of knowing whether the image represents War and Peace or a snapshot of your cat.

Last edited by wayrad; 04-26-2010 at 06:39 PM.
04-26-2010, 05:51 PM   #3
NVash
Okay, what format do I tell it to scan in? I start with some raw scans, right? I usually choose PDF, but they told me I was wrong.

http://finereader.abbyy.com/
Checked out Finereader. Wow, $100?
04-26-2010, 05:55 PM   #4
Patricia
Reader
Posts: 11,520
Karma: 2199070
Join Date: May 2007
Location: South Wales, UK
Device: Sony PRS-500, PRS-505, Asus EEEpc 4G
I usually scan to PDF then use ABBYY to get an editable PDF file. Then I use ABBYY to convert to a doc. The doc usually requires a lot of proof-reading.
04-26-2010, 06:24 PM   #5
NVash
Okay, I'm wandering the board too much.
http://www.blueleaf-book-scanning.com/
http://www.diybookscanner.org/
I know this because I just snatched up these links. I don't think I'd like to ship my stuff away, but would I need something complicated like the setups on the DIY site? I just pulled my printer out and got a look at it.
http://www.amazon.com/Photosmart-C31...2320591&sr=8-2
This is what I have. Can it do the job I'm trying to accomplish here, or do I need to purchase something else? I know for a fact I don't have any sort of auto feeder, so I'm wondering if I even have the right tools to start with.
04-26-2010, 07:02 PM   #6
wayrad
The first thing Finereader does is convert the page images to .tiff anyway, so what file format you do the initial save to is a matter of personal preference. I use .jpg myself; others like different formats. PDF works too - Finereader will accept it and perform OCR just fine. Using PDF isn't "wrong"; it's just that you'll have to OCR it before going further, same as with any other format you use to save page images. It has nothing to do with your choice of a final format for the book (whether PDF will work for that depends largely on your choice of reading device).

Any scanner will work, and all have advantages and disadvantages. Sheetfeed scanners are fast and you don't have to worry about page curvature messing up your image and making recognition difficult. (However, Finereader can process the image and correct for this sort of distortion, to a degree.) Flatbed scanners are slower (although the Opticbook will do 5-6 pages per minute), will have some distortion due to page curvature, and may require pressure on the spine, especially if "gutters" are narrow, but you don't have to cut up the book. Digital cameras are reported to work well, often in conjunction with some kind of home-built frame to hold the camera and the book. Most or all of the rigs you will encounter will belong to one of those three types.

Speed is nice, but not as important as you may think - the amount of time saved by a fast scanner is only a tiny fraction of what it takes to make an ebook. More important is that the machine can scan at 300 dpi or better - that is often enough, but 900 dpi capability may sometimes be handy. You don't need to spend huge amounts of money to get this capability (actually, anything you buy these days will have much better resolution than that). If you buy a new scanner, make sure it comes bundled with a highly rated OCR program. That may be enough for your OCR needs and, if you do outgrow it, will qualify you for upgrade pricing on Finereader Professional Edition.
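
To put a rough number on how small the scanning step is, here is a short Python sketch using the 5-6 pages per minute quoted above for the Opticbook and a 200-page book (the page count mentioned in the first post). The function name is made up for this example, and it treats the quoted rate as sustained for the whole book, which may be optimistic.

```python
def scan_minutes(pages: int, pages_per_minute: float) -> float:
    """Minutes needed to scan a book at a steady rate (hypothetical helper)."""
    return pages / pages_per_minute


# 200 pages is the book length from the first post; 5 and 6 pages per
# minute are the two ends of the rate quoted for the Opticbook.
for rate in (5, 6):
    print(f"{rate} pages/min: about {scan_minutes(200, rate):.0f} minutes")
```

Even at the slower rate the scanning comes to well under an hour, which is small next to the week of spare time per book that proofreading can take.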

P.S. I am partial to the Opticbook 3600 because it's optimized for nondestructive book scanning, and provides what (for me anyway) is a good balance of speed, quality, and price, even though it's not the best in any one of those categories. It does have disadvantages - non-book scanning performance is suboptimal, and customer service and tech support are reported to be disappointing, to put it mildly.

Last edited by wayrad; 04-26-2010 at 08:29 PM.
04-29-2010, 12:00 AM   #7
NVash
Thanks. Do I have to worry so much about the pages coming out crooked or miscolored? As I said, they come out yellow for some reason, and I'm sure a few may be crooked since I'm not tearing the book apart. Would that be a problem, or will ABBYY fix that as well?
04-29-2010, 07:18 AM   #8
wayrad
Finereader (at least the full version, I don't remember about Sprint) will straighten lines as part of the OCR process (it can even deal with an upside-down page!). Setting your scan to grayscale (some use black & white; see which gives you the fewest OCR errors) will reduce file sizes; color probably doesn't hurt but is a waste of resources unless you need it for an illustration.
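
To show why the color mode matters so much for file size, here is a small Python sketch that estimates the uncompressed size of one scanned page. The numbers are assumptions made for illustration: a full Letter-size scan area (8.5 x 11 inches) at 300 dpi, with 24 bits per pixel for color, 8 for grayscale and 1 for black & white. Real files are normally compressed and will be smaller, so treat this as an upper bound rather than what your scanner software will actually write.

```python
# Assumed bit depths for the three common scan modes.
BITS_PER_PIXEL = {"color": 24, "grayscale": 8, "black_and_white": 1}


def raw_page_bytes(width_in: float, height_in: float, dpi: int, mode: str) -> int:
    """Uncompressed size in bytes of one scanned page (hypothetical helper)."""
    pixels = round(width_in * dpi) * round(height_in * dpi)
    return pixels * BITS_PER_PIXEL[mode] // 8


# Assumed scan area: a full Letter-size flatbed at 300 dpi, 200 pages.
for mode in BITS_PER_PIXEL:
    size = raw_page_bytes(8.5, 11, 300, mode)
    print(f"{mode}: {size / 1e6:.1f} MB per page, "
          f"{200 * size / 1e9:.2f} GB for 200 pages")
```

Under these assumptions a 200-page book is roughly 5 GB of raw color data but well under 2 GB in grayscale, which is consistent with uncropped color scans producing the gigabyte-sized files described in the first post.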

If you don't have book-optimized presets in your scanner software, you'll likely need to do some playing with the scan settings to get the best OCR results, so I wouldn't advise accumulating a huge backlog of scans before acquiring the OCR software.

Last edited by wayrad; 04-29-2010 at 07:43 AM.
08-31-2010, 11:11 PM   #9
NVash
It's been a long time and I've lost my scanner. I found ABBYY and see that it also works with JPEG. This may be a silly question (if so, please forgive me), but is it possible to just take pictures of the pages, then put them into ABBYY, and kind of use that as a cheap version of a scanner?

Last edited by NVash; 09-01-2010 at 02:37 AM.
09-01-2010, 04:17 AM   #10
Iain
Enthusiast
Posts: 49
Karma: 14
Join Date: Jul 2010
Location: Harrogate, England
Device: iPad
Abbyy and photos

Abbyy says it is (IIRC) - it has a tool which will automatically pre-process photos.

I've not used this, and the concern I would have is that the resolution may be lower than with a scanner. 300 dpi amounts to 1500 x 2100 pixels for a paperback (more or less), so you would need a 3-megapixel camera for a paperback and around a 10-megapixel camera for A4.
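
The arithmetic behind those figures can be sketched in a few lines of Python. The 5 x 7 inch paperback page is inferred from the 1500 x 2100 pixel figure above, A4 is taken as 210 x 297 mm, and the function name is made up for this example.

```python
MM_PER_INCH = 25.4


def pixels_needed(width_in: float, height_in: float, dpi: int = 300):
    """Pixel dimensions and megapixels for a page at a given dpi (hypothetical helper)."""
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    return width_px, height_px, width_px * height_px / 1e6


# Page sizes in inches: the paperback is an inferred 5 x 7, A4 is 210 x 297 mm.
pages = {
    "paperback": (5, 7),
    "A4": (210 / MM_PER_INCH, 297 / MM_PER_INCH),
}
for name, (w, h) in pages.items():
    w_px, h_px, mp = pixels_needed(w, h)
    print(f"{name}: {w_px} x {h_px} pixels, {mp:.2f} megapixels")
```

This assumes the page fills the whole camera frame. In practice there is usually some border around the page, so a bit of headroom over the computed figure helps, which fits the "around 10" estimate for A4.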

Iain
09-01-2010, 12:44 PM   #11
NVash
A 3-megapixel camera is no problem; even my cell phone camera is better than that. That'd be for paperbacks, which is just about all I intended to scan. Things are looking good so far.

What's an A4? A hardcover?
09-01-2010, 05:03 PM   #12
Lady Fitzgerald
Wizard
Posts: 2,013
Karma: 251649
Join Date: Apr 2010
Location: Tempe, AZ, USA, Earth
Device: JetBook Lite (away from home) + 1 spare, 32" TV (at home)
Quote:
Originally Posted by NVash
A 3-megapixel camera is no problem; even my cell phone camera is better than that. That'd be for paperbacks, which is just about all I intended to scan. Things are looking good so far.

What's an A4? A hardcover?
A4 is a standard paper size (210 x 297 mm) usually used in countries that use only the metric system. It is around 8 1/4" x 11 5/8", roughly the same as letter size (8 1/2" x 11").
09-12-2010, 03:28 PM   #13
NVash
Thanks, I'll have to try this and see how it works out.
All times are GMT -4.