#1 |
Junior Member
Posts: 2
Karma: 10
Join Date: Nov 2011
Device: PRS-T1
|
best HW + SW to convert books to epub
Hi,
What's currently the best hardware and software for scanning books and converting them to epub with little effort? The books have no fancy layout and no images, just plain text, and they may be taken apart.
#2 |
eBook Enthusiast
Posts: 85,544
Karma: 93383099
Join Date: Nov 2006
Location: UK
Device: Kindle Oasis 2, iPad Pro 10.5", iPhone 6
|
Moved to the "Workshop" forum.
#3 |
eBook Enthusiast
Posts: 85,544
Karma: 93383099
Join Date: Nov 2006
Location: UK
Device: Kindle Oasis 2, iPad Pro 10.5", iPhone 6
|
If you really mean "best" as in "price is not a consideration", then you should be looking at a device such as this:
http://www.imageaccess.de/?page=Scan...V2Professional
This is the type of device that professional scanning bureaux use.
If price is a consideration, and you can destroy the books, then a scanner with an automated sheet feeder is probably what you want to be looking at. Something like this:
http://www.amazon.co.uk/Fujitsu-Scan.../dp/B001VGJ7JM
Last edited by HarryT; 02-01-2015 at 08:18 AM.
#4 |
Wizard
Posts: 4,520
Karma: 121692313
Join Date: Oct 2009
Location: Heemskerk, NL
Device: PRS-T1, Kobo Touch, Kobo Aura
|
|
#5 |
eBook Enthusiast
Posts: 85,544
Karma: 93383099
Join Date: Nov 2006
Location: UK
Device: Kindle Oasis 2, iPad Pro 10.5", iPhone 6
|
A decent OCR program like ABBYY FineReader will give you pretty good results (as in perhaps an error per page), and that may be acceptable for casual reading purposes, but you are of course right in saying that proof-reading is essential if you want an error-free book.
#6 |
Junior Member
Posts: 2
Karma: 10
Join Date: Nov 2011
Device: PRS-T1
|
That's amazing: even for simple layouts and plain text there is a remaining error rate of one error per page? What kind of errors are these? Can they be corrected with the spell checker of a decent office program?
#7 | |
eBook Enthusiast
Posts: 85,544
Karma: 93383099
Join Date: Nov 2006
Location: UK
Device: Kindle Oasis 2, iPad Pro 10.5", iPhone 6
|
Quote:
A decent OCR program has an accuracy rate of better than 99.9%, but a typical page has around 2000 characters on it, so even at 99.9% that still means about 2 character errors per page. Some of these the OCR program's spell-checker will fix for you, but some it will get wrong.
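To put numbers on that, here is a quick back-of-the-envelope sketch in Python. The 99.9% accuracy and the 2000 characters per page are the figures from the post above; the function name `expected_errors_per_page` and the 300-page book length are only assumptions made for illustration.

```python
def expected_errors_per_page(accuracy: float, chars_per_page: int) -> float:
    """Expected number of wrongly recognised characters on one page."""
    return (1.0 - accuracy) * chars_per_page


per_page = expected_errors_per_page(0.999, 2000)
per_book = per_page * 300  # hypothetical 300-page book

print(f"{per_page:.1f} errors per page, about {per_book:.0f} per book")
```

So even a very small per-character error rate adds up to hundreds of corrections over a whole book, which is why the proof-reading step matters.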
#8 |
Wizard
Posts: 4,520
Karma: 121692313
Join Date: Oct 2009
Location: Heemskerk, NL
Device: PRS-T1, Kobo Touch, Kobo Aura
|
It really depends on the source. Some errors can be found by a simple spell checker, but of course some cannot. That is one of the reasons I started to create my own tools to remove a lot of OCR errors. It can be spelling that goes wrong, but also punctuation, not to mention styling and other things. A lot cannot be found with the standard tools.
The better the source (and the scan), the better the results of the OCR program. ABBYY does a good job if the scan is good.
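As a rough illustration of the kind of punctuation clean-up a script can do on OCR text, here is a minimal Python sketch. It is not the tools mentioned above, just a few conservative regular-expression fixes; the function name `tidy_punctuation` and the choice of rules are assumptions made for this example.

```python
import re


def tidy_punctuation(text: str) -> str:
    """Apply a few conservative fixes for common OCR punctuation slips."""
    # Rejoin words that were hyphenated across a line break ("won-\nderful").
    # Caution: this also joins genuine hyphenated compounds split at a line end.
    text = re.sub(r"(\w)-\n(\w)", r"\1\2", text)
    # Drop stray spaces before closing punctuation ("word ," -> "word,"),
    # but only when the mark is followed by whitespace or the end of the text.
    text = re.sub(r"[ \t]+([,.;:!?])(?=\s|$)", r"\1", text)
    # Collapse runs of spaces or tabs into a single space.
    text = re.sub(r"[ \t]{2,}", " ", text)
    return text


print(tidy_punctuation("Hello , world .  It was a won-\nderful day !"))
```

Rules like these only catch mechanical slips. Anything that needs judgement, such as a speck read as a comma in the middle of a sentence, still has to be found by proof-reading.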
#9 | |
Wizard
Posts: 1,613
Karma: 6718541
Join Date: Dec 2004
Location: Paradise (Key West, FL)
Device: Current:Surface Go & Kindle 3 - Retired: DellV8p, Clie UX50, ...
|
Quote:
Also, the accuracy of OCR software drops significantly when working from scans of books that are in poor condition (e.g. foxing, stains, yellowing, ...), poorly printed, and/or printed on poor-quality paper.
#10 |
Surfin the alpha waves ~~
Posts: 26,281
Karma: 459765791
Join Date: Dec 2010
Location: New Jersey
Device: Jetbook Lite & Mini, Nook STR, Kobo, Hanvon N516, Kindle 2, Androids
|
Also, slight imperfections in the paper -- dark spots, a stray fiber, etc. -- can be mistaken for punctuation marks like periods and commas.
#11 | ||
Bookmaker & Cat Slave
Posts: 11,503
Karma: 158448243
Join Date: Apr 2010
Location: Phoenix, AZ
Device: K2, iPad, KFire, PPW, Voyage, NookColor. 2 Droid, Oasis, Boox Note2
|
Quote:
This part, from rumpumple1: Quote:
Hitch |
#12 |
Obsessively Dedicated...
Posts: 3,221
Karma: 35037583
Join Date: May 2011
Location: PA {back in the usa!}
Device: Sony PRS-T2, ADE on PC
|
Unless, of course, you forgo OCR entirely and simply go with images of the printed pages. That way, though, you lose reflow, search, annotation, and all the other things you can do with text that are not possible with images. A mighty gloomy result, I think. (There are quite a number of epubs out in the wild that are made like this, recognizable before reading by their HUGE file size. They are really like a PDF in disguise.)
Last edited by GrannyGrump; 02-06-2015 at 03:12 AM.
#13 | |
Fanatic
Posts: 563
Karma: 403106
Join Date: Aug 2014
Device: PRS-T1
|
Quote:
Since people are different, they have different needs and perceptions, and therefore there are zillions of TVs, computers, cars, etc., because every single human wants one feature more than another (like the colour red for cars).
#14 | |
Obsessively Dedicated...
Posts: 3,221
Karma: 35037583
Join Date: May 2011
Location: PA {back in the usa!}
Device: Sony PRS-T2, ADE on PC
|
rumpumpel1 queried:
Quote:
A *grammar-checker*, such as the one Microsoft Word provides, can help to some extent when it recognizes that a word is blatantly wrong for the containing sentence. Unfortunately, it is still far from perfect and mostly concentrates on punctuation errors. To get good results, you will have to physically proof-read the scan results, comparing them against the printed page.
Last edited by GrannyGrump; 02-06-2015 at 04:15 AM.
#15 |
eBook Enthusiast
Posts: 85,544
Karma: 93383099
Join Date: Nov 2006
Location: UK
Device: Kindle Oasis 2, iPad Pro 10.5", iPhone 6
|
An example of this is that I'm currently proof-reading the "J G Reeder" detective stories of Edgar Wallace for the MR library, and a speech affectation of his is to use the word "um" a lot to indicate pauses in his speech. On at least half the occasions in the PG text I'm proofing from, "um" is spelt "urn".
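A plain spell checker will not catch this one, because "urn" is a real word. One way to at least get a list of places to look at is to flag every word containing "rn" that would also be a real word with "m" in its place. The following is a minimal Python sketch of that idea; the function name `rn_m_suspects`, the tiny `KNOWN_WORDS` list, and the sample sentence are all made up for illustration, and a real run would need a full dictionary file.

```python
import re

# Tiny stand-in word list (an assumption); a real run would load a dictionary.
KNOWN_WORDS = {"um", "urn", "modern", "modem", "corner", "comer", "burn", "bum"}


def rn_m_suspects(text: str, known_words: set[str]) -> list[tuple[str, str]]:
    """Return (word_as_scanned, alternative) pairs worth a human look."""
    suspects = []
    for word in re.findall(r"[A-Za-z]+", text):
        lower = word.lower()
        if "rn" in lower:
            # Candidate reading if the scanner split an "m" into "r" + "n".
            alternative = lower.replace("rn", "m")
            if alternative in known_words:
                suspects.append((word, alternative))
    return suspects


sample = 'He coughed. "Urn, I rather think the modern urn was stolen."'
print(rn_m_suspects(sample, KNOWN_WORDS))
# prints [('Urn', 'um'), ('modern', 'modem'), ('urn', 'um')]
```

Note that the script cannot tell which reading is right: in the sample, the first "Urn" should be "um", while "modern" and the second "urn" are correct as scanned. It only narrows down where to look, so the final decision still needs a human proof-reader.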