I have a question that seems difficult to answer, at least with my ample googling skills: I have a book, as in a bundle of pages made out of dead trees with letters printed on them. What is the best way to make an ebook (EPUB, naturally) out of it?
I will be using a SkyPix TSN410 (a handheld scanner that saves the images to an SD card), as it was a cheap and seemingly good alternative for book scanning. Its main purpose, besides ebook making, will be to scan a few pages here and there at the campus library. I believe contrast ratios etc. will be quite OK. (I have not received it yet.)
But apart from the scanning, I would, naturally, want the job to be as easy as possible, without compromising on formatting and quality.
The formatting part seems especially difficult. The free OCR alternatives out there seem to mainly rip the text out without worrying about formatting. It'd be nice to keep as much of the formatting as possible intact, as long as it doesn't decrease the quality of the EPUB.
Most of what I find, however, concerns text that has already been OCR'd and how to format it properly. I'd like to keep the footnotes, chapter headings, etc. as close to the original as possible.
Are there any OCR apps out there that are good at this? I'd guess it narrows the alternatives down a bit that it should preferably run under Linux and be free... But as the .jpgs from the scanner are portable, I *could* (but would rather not have to) reboot my desktop PC into Windows.
Apart from that, if anyone would like to point to a beginner's guide they think is especially good, one that also covers the process after OCR, that is of course most welcome.
From what I have read, people seem to recommend a free web service that lets you convert only 30 pages a day if you want to keep the formatting from image files. But I hope someone here has a better idea.