#1 |
Junior Member
Posts: 3
Karma: 10
Join Date: Jul 2015
Device: Kindle, and others
|
Hi,
I'm a newbie with a varied print library, but for now I'm mostly interested in scanning 20- to 60-year-old softcover fiction books that may not last forever. My home equipment is, and will remain, inadequate for some time, so I'm looking into commercial services. From what I see, the two "lowest cost" scan services mentioned here repeatedly, 1dollarscan.com and bookscan.us, both offer a basic scan to PDF with an OCR text overlay (before adding more costly prep and conversion options). I've noted threads on problems with further conversion to eBook formats, though perhaps my forum search terms have been inadequate, but I found no comment on simply mailing such an OCR'd PDF to a Send-to-Kindle e-mail address for conversion. Does that conversion not work as well (or as poorly) as other options? Thanks.
#2 |
Wizard
Posts: 4,520
Karma: 121692313
Join Date: Oct 2009
Location: Heemskerk, NL
Device: PRS-T1, Kobo Touch, Kobo Aura
|
Scanning is easy. OCR is easy. The part that comes next is hard and takes time: fixing the OCR errors and creating a good book out of it. Scanning and OCR can be largely automated and can be offered at low cost. The rest is very much 'you get what you pay for', so good quality means bigger bills. The end format is actually not that important, ePUB or Kindle. For a good Kindle book you also need a good ePUB (or so I am told).
PDF is absolutely the worst format to create an ePUB or Kindle book from. Most of these 'services' use a variant of Calibre to create their conversion, most without any form of post-processing. Personally I am not too fond of the Calibre conversions. Mind you, that is my personal opinion. Calibre is a great product in itself.
#3 |
Junior Member
Posts: 3
Karma: 10
Join Date: Jul 2015
Device: Kindle, and others
|
Appreciate the comments. As a Noob I'm still absorbing it all.
Yes, I was wondering why the package of PDF with OCR text overlay was "standard" rather than an image file and a text file. OCR/PDF conversion issues do help explain the "experience" of reading an out-of-print, out-of-copyright Kindle book that an unknown seller offers for $0.99. My minuscule past OCR experience has been single pages on a flatbed scanner with either OmniPage back in the 20th century, or Acrobat Pro later. I guess I was hoping that OCR software had gotten much smarter. If I were a touch typist, retyping would always have been faster than locating and correcting all the OCR errors, every time. Now that I think of it, I wouldn't be surprised if manual retyping were cost-effective for the big guys taking advantage of exchange rates and labor costs in other countries.
Last edited by scanewbie; 07-19-2015 at 08:39 PM.
#4 |
Wizard
Posts: 4,520
Karma: 121692313
Join Date: Oct 2009
Location: Heemskerk, NL
Device: PRS-T1, Kobo Touch, Kobo Aura
|
Oh, OCR software has gotten a whole lot smarter since you worked with it. The error rate is way down, but there will always be typical OCR errors. Also, GIGO (garbage in, garbage out) plays a big role here: the better the source, the better the results. The main OCR player nowadays is ABBYY FineReader.
Retyping is not cost-effective. It introduces new errors of its own, which will also be spotted only by proofreading. It is not without reason that I made my Word add-in. It is designed to take the output from the OCR process and either fix errors automatically or give you the tools to fix them. It saves me an enormous amount of time in digitizing a text. The PDF with OCR text overlay is useful. I use it as well. If I find some strange text where I think there is an error but I am not quite sure what it should be, I use the PDF: it lets me search quickly to the right spot and then see the original page.
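To make the "fix automatically or give you the tools" idea concrete, here is a minimal Python sketch of that kind of two-pass cleanup. It is not the Word add-in mentioned above and does not reflect how that add-in works. The rules in `AUTO_FIXES`, the `SUSPICIOUS` pattern, and both function names are hypothetical, chosen only for illustration.

```python
import re

# Hypothetical auto-fix rules, for illustration only.
AUTO_FIXES = [
    # Rejoin words split by a hyphen at a line break ("exam-\nple" -> "example").
    # Limitation: this also joins real compounds such as "well-\nknown".
    (re.compile(r"(\w)-\n(\w)"), r"\1\2"),
    # Remove stray spaces before punctuation ("wool ." -> "wool.").
    (re.compile(r"[ \t]+([,.;:!?])"), r"\1"),
]

# Words that mix letters and digits ("c0at") are a common sign of OCR confusion.
# This also flags legitimate tokens such as "1st", so it only flags, never fixes.
SUSPICIOUS = re.compile(r"\b(?=\w*[A-Za-z])(?=\w*\d)\w+\b")


def auto_fix(text: str) -> str:
    """Apply the safe, mechanical fixes in order and return the cleaned text."""
    for pattern, replacement in AUTO_FIXES:
        text = pattern.sub(replacement, text)
    return text


def flag_suspicious(text: str) -> list[tuple[int, str]]:
    """Return (line number, token) pairs a human should check against the scan."""
    flagged = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for match in SUSPICIOUS.finditer(line):
            flagged.append((line_no, match.group()))
    return flagged


if __name__ == "__main__":
    raw = "The c0at was an exam-\nple of fine wool ."
    cleaned = auto_fix(raw)
    print(cleaned)
    for line_no, token in flag_suspicious(cleaned):
        print(f"line {line_no}: check '{token}' against the page image")
```

The flagged line numbers are where a PDF with OCR text overlay earns its keep: search for the neighbouring words and compare against the page image.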
#5 |
Junior Member
Posts: 3
Karma: 10
Join Date: Jul 2015
Device: Kindle, and others
|
Thank you again for sharing your expertise. I've taken a quick look through the links to your website, and definitely will spend more time there to understand the capabilities of the useful tools you've created.