I made this book scanner years ago out of scrap wood. I have had a variety of lights and cameras on it, including the somewhat ridiculous-looking LED floodlight and old video camera in the picture. But it does the job... I can comfortably scan a page about every 10 seconds.
The V-tray and the glass on top of the book keep it nice and flat... no need to correct for curl or keystoning or whatever. Resolution is totally up to how I set the camera. 300 dpi is usually fine for Tesseract OCR.
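To put a number on the 300 dpi figure, here is a rough sketch of the arithmetic in Python. The 6 x 9 inch page and the 4000-pixel frame width are made-up examples, and the calculation assumes the page fills the whole camera frame, which it rarely does exactly.

```python
import math


def pixels_needed(width_in, height_in, dpi=300):
    """Pixel dimensions needed to capture a page of the given size at `dpi`."""
    return math.ceil(width_in * dpi), math.ceil(height_in * dpi)


def effective_dpi(image_width_px, page_width_in):
    """Resolution actually achieved if the page spans the full image width."""
    return image_width_px / page_width_in


# Hypothetical 6 x 9 inch page at 300 dpi.
w, h = pixels_needed(6, 9)
print(w, h, round(w * h / 1e6, 2))  # 1800 2700 4.86

# Hypothetical 4000-pixel-wide frame covering an 8 inch wide page.
print(effective_dpi(4000, 8))  # 500.0
```

So under those assumptions a camera of about 5 megapixels already covers a typical paperback page at 300 dpi, with the usual caveat that any margin around the page eats into that.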
OCRFeeder is the Tesseract front-end I use. I always OCR page by page to handle things like double or triple columns, advertisements, "continued on page 107" and so on. Also, if there is a real scan/OCR problem, I discover it ON THAT PAGE, not later, buried somewhere in 100,000 words.
This setup gives me JPG images directly, no need to mess with PDF nonsense. I do use Scan Tailor sometimes if the original physical book is horrible. Then I OCR the images, put the text into Writer for proofing and styling, and go straight to EPUB with Sigil or Calibre.
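For anyone who prefers a script to a GUI, the page-by-page idea can be roughed out in Python. This is not the OCRFeeder workflow described above, just a sketch that calls the `tesseract` command line once per JPG and flags any page that fails or comes back empty, so the problem shows up on that page. The folder name, the function names and the `eng` language default are all assumptions for illustration, and it does nothing clever about columns or advertisements.

```python
import subprocess
from pathlib import Path


def tesseract_command(image, lang="eng"):
    """Build the command for one page: tesseract <image> <output base> -l <lang>."""
    image = Path(image)
    out_base = image.with_suffix("")  # tesseract adds ".txt" to this itself
    return ["tesseract", str(image), str(out_base), "-l", lang]


def ocr_pages(folder, lang="eng", run=subprocess.run):
    """OCR every .jpg in `folder`, one page at a time.

    Returns the names of pages that failed or produced no text.
    `run` is injectable so the loop can be exercised without tesseract installed.
    """
    failed = []
    for image in sorted(Path(folder).glob("*.jpg")):
        result = run(tesseract_command(image, lang), capture_output=True, text=True)
        text_file = image.with_suffix(".txt")
        has_text = text_file.exists() and text_file.read_text(encoding="utf-8").strip()
        if result.returncode != 0 or not has_text:
            failed.append(image.name)
            print(f"Check this page: {image.name}")
    return failed


# Example call, assuming a hypothetical folder of scans named "scans":
# problem_pages = ocr_pages("scans")
```

Each page ends up with its own `.txt` file next to the image, which keeps the one-page-at-a-time proofing habit intact before everything is pasted together for styling.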