Old 11-03-2009, 03:34 PM   #1
Jim Thompson
Member
Jim Thompson began at the beginning.
 
Posts: 16
Karma: 10
Join Date: Nov 2009
Device: iPhone
Question Please suggest scanner/software for 50 novels

I'd like to scan about 50 novels into searchable text. More may follow. I'm thinking a budget of about $1,000 will probably get me the scanner/software I should use, but I haven't been able to figure out what will be easy for the task. I plan to feed the pages through an ADF.

I almost settled on the HP ScanJet N8420, but then learned (I think) that I would have to handle each page individually to get the scanner to crop out the standard header and page number found on most pages. It's hard to figure out how the pieces work together by looking at literature on the Internet.

Can you suggest a scanner and software?

I'm hoping the software will also help to remove optional hyphens. I don't need formatting. I plan to run another program against created text files to analyze word and expression usage within the novels.

Thanks for your attention. Please advise if you can.
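
The header, page-number, and hyphen cleanup described above can also be done after OCR, on the plain text, rather than by cropping in the scanner. Below is a minimal Python sketch of that idea. The function names, the page-number pattern, and the idea of passing in a list of known running headers are all assumptions for illustration, not features of any scanner or OCR package mentioned in this thread.

```python
import re

# Assumption: a page number is a line holding nothing but 1-4 digits.
PAGE_NUMBER = re.compile(r"^\s*\d{1,4}\s*$")
# A hyphen (or soft hyphen, U+00AD) at a line end, between two word characters.
LINE_END_HYPHEN = re.compile(r"(?<=\w)[-\u00ad]\n(?=\w)")


def strip_page_furniture(page_text, running_headers):
    """Drop bare page-number lines and known running-header lines from one page."""
    headers = {h.strip().upper() for h in running_headers}
    kept = []
    for line in page_text.splitlines():
        if PAGE_NUMBER.match(line):
            continue  # page number on a line by itself
        if line.strip().upper() in headers:
            continue  # running header such as the book title
        kept.append(line)
    return "\n".join(kept)


def join_hyphenated(text):
    """Rejoin words split across a line break, then remove leftover soft hyphens."""
    text = LINE_END_HYPHEN.sub("", text)
    return text.replace("\u00ad", "")


if __name__ == "__main__":
    page = "THE LONG NOVEL\nIt was an exam-\nple of a split word.\n42\n"
    print(join_hyphenated(strip_page_furniture(page, ["The Long Novel"])))
```

One known limitation: a genuine compound that happens to break at a line end ("well-" / "known") gets joined into one word too, so a dictionary check would be needed to tell those cases apart.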
Old 11-03-2009, 03:43 PM   #2
AnemicOak
Bookaholic
AnemicOak ought to be getting tired of karma fortunes by now.
Posts: 14,391
Karma: 54969924
Join Date: Oct 2007
Location: Minnesota
Device: iPad Mini 4, AuraHD, iPhone XR +
The OpticBook 3600 seems to be pretty popular as a scanner. For software, most people use ABBYY FineReader.


If I was doing that many books I'd probably look into building one of the holding frame/digital camera setups mentioned in other threads (I'll see if I can find the thread).



EDIT: Here's the stuff I was thinking of...
https://www.mobileread.com/forums/showthread.php?t=13848
http://bkrpr.org/doku.php

Last edited by AnemicOak; 11-03-2009 at 03:46 PM.
Old 11-03-2009, 06:04 PM   #3
Jim Thompson
Member
Jim Thompson began at the beginning.
 
Posts: 16
Karma: 10
Join Date: Nov 2009
Device: iPhone
Thanks Brian; though since I'm blessed with not needing to preserve the book binding, I want to make this as automatic as possible. Anyone have experience with feeding lots of book-like pages through an ADF?

Someone told me Fujitsu and Kodak probably have the best reputations for scanners, but I'm new to all this and have heard that only once, from a salesperson whose motives I can't be sure of.
Old 11-03-2009, 06:21 PM   #4
Rootman
Groupie
Rootman has a complete set of Star Wars action figures.
Posts: 181
Karma: 478
Join Date: Oct 2009
Device: Android & FBReader
Not to be nosy, but wouldn't buying the 50 novels as etexts cost less than the expense, aggravation and time of scanning the physical books?

Have you searched for the texts online?
Old 11-03-2009, 06:41 PM   #5
Elfwreck
Grand Sorcerer
Elfwreck ought to be getting tired of karma fortunes by now.
Posts: 5,185
Karma: 25133758
Join Date: Nov 2008
Location: SF Bay Area, California, USA
Device: Pocketbook Touch HD3 (Past: Kobo Mini, PEZ, PRS-505, Clié)
Fujitsu makes some great ADF scanners, but most of them are professional-production level, which means "insanely expensive." (I think the ones we use at my job retail for ~$5,000.) Kodak has the same problem--the scanners that are top-of-the-line are industrial, not intended for personal use. (They work great for personal use. They're just pricey.)

I'm considering getting a Canon DR-2050C; Ebay regularly has them for under $300. Was otherwise thinking of the 2010C.

If I had a few hundred dollars to put towards a book-conversion lab (and I want to), those would be my first choice--something that scans duplex, in color, up to 600 dpi. I don't want it attached to a printer or fax machine, and I don't need it to scan flatbed as well--I have a small flatbed scanner (Canon LIDE-30) for that.

400 dpi is good for OCR. Ability to *not* scan in color is important. Ability to scan to multipage Group IV tiff is important to me, but some people use different methods and won't care about that.

ABBYY FineReader is *the* software to use; whether you'd be happy with FR 6 (which comes with some scanners) or would want one of the later versions, depends on how comfortable you are learning complex software, and whether you care to convert anything more than novels. For just novels, almost any version will work. For textbooks with images, captions, graphs and so on, you may want more control & options than the cheaper software offers.

I'd love to tell you what keywords to look for for the right kind of scanner, but I haven't found any. Drum scanner, sheet fed scanner, ADF scanner... none of them work consistently.
Old 11-03-2009, 06:49 PM   #6
Jim Thompson
Member
Jim Thompson began at the beginning.
 
Posts: 16
Karma: 10
Join Date: Nov 2009
Device: iPhone
Quote:
Originally Posted by Rootman View Post
Have you searched for the texts online?
Good thought, but there are many books I need that I cannot find electronically in a form that other software can search and analyze. It's not enough to be able to read or search the book. I need to have software analyze its text.
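
The kind of analysis described above (word and expression usage) needs nothing more than plain text. As a rough illustration, here is a small Python sketch that counts single words and short phrases; the tokenizing rule and the function names are assumptions, and a real analysis program may well define "word" and "expression" differently.

```python
import re
from collections import Counter

# Assumption: a word is a run of ASCII letters, optionally with one apostrophe part.
WORD = re.compile(r"[a-z]+(?:'[a-z]+)?")


def tokenize(text):
    """Lowercase the text and return its words in order."""
    return WORD.findall(text.lower())


def count_phrases(tokens, n):
    """Count every run of n consecutive words (n=2 gives two-word expressions)."""
    return Counter(" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


if __name__ == "__main__":
    tokens = tokenize("The cat sat. The cat ran; the dog didn't.")
    print(Counter(tokens).most_common(3))
    print(count_phrases(tokens, 2).most_common(3))
```

Typographic (curly) apostrophes and accented letters are not handled by this pattern, which is one reason clean OCR output matters more here than preserved formatting.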
Old 11-03-2009, 07:53 PM   #7
Rootman
Groupie
Rootman has a complete set of Star Wars action figures.
Posts: 181
Karma: 478
Join Date: Oct 2009
Device: Android & FBReader
Quote:
Originally Posted by Jim Thompson View Post
Good thought, but there are many books I need that I cannot find electronically in a form that other software can search and analyze. It's not enough to be able to read or search the book. I need to have software analyze its text.
Calibre ( http://calibre.kovidgoyal.net/ ) can convert MANY formats to MANY OTHER formats including basic text - which should be what you need.
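
For a whole batch of books, Calibre's command-line tool `ebook-convert` (input file first, output file second) can be driven from a script instead of the GUI. The Python sketch below shows one way this might look; the helper names and the folder layout are hypothetical, and it assumes Calibre is installed with `ebook-convert` on the PATH. Files with DRM will not convert this way.

```python
import shutil
import subprocess
from pathlib import Path


def build_convert_command(source, out_dir):
    """Return the ebook-convert command that turns one e-book into a .txt file."""
    source = Path(source)
    target = Path(out_dir) / (source.stem + ".txt")
    return ["ebook-convert", str(source), str(target)]


def convert_to_text(source, out_dir):
    """Run the conversion and return the path of the text file it should produce."""
    if shutil.which("ebook-convert") is None:
        raise RuntimeError("ebook-convert not found; is Calibre installed and on PATH?")
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    cmd = build_convert_command(source, out_dir)
    subprocess.run(cmd, check=True)  # raises CalledProcessError if Calibre fails
    return Path(cmd[2])


if __name__ == "__main__":
    # Hypothetical layout: e-books in ./books, text output in ./text
    for book in sorted(Path("books").glob("*.epub")):
        print(convert_to_text(book, "text"))
```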
Old 11-03-2009, 08:52 PM   #8
CharlieBird
¿Huh?
CharlieBird ought to be getting tired of karma fortunes by now.
Posts: 349
Karma: 1004526
Join Date: Jun 2007
Location: rural Jalisco
Device: HiSense A7 CC, Fire HD6, Kobo Libra2
Jim, Here is another recent thread you may want to check out:
https://www.mobileread.com/forums/showthread.php?t=58014

d
Old 11-03-2009, 09:28 PM   #9
Jim Thompson
Member
Jim Thompson began at the beginning.
 
Posts: 16
Karma: 10
Join Date: Nov 2009
Device: iPhone
Quote:
Originally Posted by Rootman View Post
Calibre ( http://calibre.kovidgoyal.net/ ) can convert MANY formats to MANY OTHER formats including basic text - which should be what you need.
Thanks. I'll try that and report back. The Calibre site indicates that lots of book types can be converted to text, though I have seen PDFs that were created in a way that prevents saving as text and I'm guessing Calibre doesn't convert those to text. In addition to PDF, eBooks.com sells ePub, MobiPocket (mobi) and Microsoft Reader (lit).

I'm guessing not all of these will be sold in a manner that prevents conversion to text. I know most Kindle files (azw) won't let you convert to text. I don't blame the publishers for wanting to protect their copyright. I just want my software to analyze word and expression usage in the book and then erase it.

If I can get some of these to work, it will be much easier than scanning, so any further advice on where to shop or what format to purchase would be appreciated.

Meanwhile, my original scanning question still stands for the many books I can't get in ebook format.
Old 11-03-2009, 09:30 PM   #10
Jim Thompson
Member
Jim Thompson began at the beginning.
 
Posts: 16
Karma: 10
Join Date: Nov 2009
Device: iPhone
Quote:
Originally Posted by CharlieBird View Post
Jim, Here is another recent thread you may want to check out:
https://www.mobileread.com/forums/showthread.php?t=58014

Thanks. Checked it out, but I'm still hoping someone will read this thread and say, "Done that, and this works well..."
Old 11-03-2009, 09:34 PM   #11
AnemicOak
Bookaholic
AnemicOak ought to be getting tired of karma fortunes by now.
Posts: 14,391
Karma: 54969924
Join Date: Oct 2007
Location: Minnesota
Device: iPad Mini 4, AuraHD, iPhone XR +
Quote:
Originally Posted by Jim Thompson View Post
In addition to PDF, eBooks.com sells ePub, MobiPocket (mobi) and Microsoft Reader (lit).

I'm guessing not all of these will be sold in a manner that prevents conversion to text. I know most Kindle files (azw) won't let you convert to text. I don't blame the publishers for wanting to protect their copyright. I just want my software to analyze word and expression usage in the book and then erase it.

If I can get some of these to work, it will be much easier than scanning, so any further advice on where to shop or what format to purchase would be appreciated.
FWIW, ebooks.com is one of the most expensive places to shop. As for the books that they and others like Books On Board, CyberRead, etc. sell, most will have DRM protection, which would need to be removed before converting them to TXT. Adobe (ePub and PDF), Mobipocket (and Kindle AZW), MS Reader and eReader have all had their DRM methods broken, so the purchaser can remove the DRM.
Old 11-03-2009, 10:04 PM   #12
Rootman
Groupie
Rootman has a complete set of Star Wars action figures.
Posts: 181
Karma: 478
Join Date: Oct 2009
Device: Android & FBReader
I hesitate to add that converting a physical book to etext may also violate the publisher's and author's copyrights. I do not believe that the "fair use" doctrine applies to text scanned from a copyrighted source.

Not knowing what country you are in or what texts you want, of course, YMMV.
Old 11-03-2009, 10:55 PM   #13
guyanonymous
Guru
guyanonymous has much to be proud of
 
Posts: 692
Karma: 27532
Join Date: Dec 2007
Device: Ebookwise 1150 / 1200
If you don't mind destroying the books (that is to say, separating each page from its binding), you might consider the Fujitsu ScanSnap.

The S1500 model is the one I have and enjoy. I've tried it on a lot of different sizes of paper with much success. Though I don't think I've tried the paper type found in a pocketbook, I've done other paper of similar size just fine.

It scans in b/w and colour (decides) and one or both sides of the page at about 20ppm. It also decides which combo of the above to use as it goes, though I do think it defaults to b/w or colour based on the first page.

You are limited to 50 sheets in the feeder at one time, but it's easy to add more as you go. I've successfully scanned 350 pages in one go by adding them in 30-40 page chunks as the number of pages remaining in the feeder dwindles.

The scanner also takes up only a little more space than a loaf of bread when closed up.

I've got nothing but good things to say about mine.
Old 11-04-2009, 06:17 AM   #14
kacir
Wizard
kacir ought to be getting tired of karma fortunes by now.
Posts: 3,450
Karma: 10484861
Join Date: May 2006
Device: PocketBook 360, before it was Sony Reader, cassiopeia A-20
Quote:
Originally Posted by Jim Thompson View Post
... I have seen PDFs that were created in a way that prevents saving as text and I'm guessing Calibre doesn't convert those to text.
For converting PDF files to text, use OCR software.
Quite a few of them will take a PDF file as input. Works great.
I personally use Readiris Pro, which came bundled with a cheap HP scanner/copier/printer/fax combo.
Old 11-04-2009, 10:22 AM   #15
Jim Thompson
Member
Jim Thompson began at the beginning.
 
Posts: 16
Karma: 10
Join Date: Nov 2009
Device: iPhone
Quote:
Originally Posted by Rootman View Post
Calibre ( http://calibre.kovidgoyal.net/ ) can convert MANY formats to MANY OTHER formats including basic text - which should be what you need.
Been encountering frustrations and am sharing them for the sake of anyone else as uninformed as me:

I tried purchasing a lit file (Microsoft Reader) because Calibre's documentation indicates that it's their easiest file to translate. After I paid for the lit file, I was informed that I would need to install Microsoft Publisher on my computer before it would download. Halfway through the Microsoft Publisher install, I was informed that I needed to create a Microsoft Passport account to activate the software. After doing all of that (and approving installation of ActiveX controls I don't understand or want), the lit file still would not download, and perusing the help screens indicates that Microsoft Publisher can only work on a hand-held device, i.e., not on a desktop computer. I contacted tech support at the eBook publisher to ask if I would have more luck with MOBI or EPUB. That's where things stand currently. I'll update again when I learn more.
Tags
adf, crop, dehypenization, ocr, scanner


Forum Jump

Similar Threads
Thread Thread Starter Forum Replies Last Post
Suggest First contact SF novels and more Verner Vinge like authors please rollercoaster Reading Recommendations 51 08-27-2010 01:13 PM
What would you suggest for HTML->epub? radius Workshop 9 07-25-2010 06:48 AM
"Online Novels" - FREE, legal novels available on the Internet Dr. Drib Deals and Resources (No Self-Promotion or Affiliate Links) 8 05-22-2009 09:32 PM
Suggest a Story (Round 1) Moejoe Writers' Corner 110 05-17-2009 10:18 PM
Suggest your own eBook Reader dj_modus_ponens News 27 12-03-2007 03:58 AM


All times are GMT -4. The time now is 03:12 PM.


MobileRead.com is a privately owned, operated and funded community.