Old 02-20-2013, 03:33 PM   #1
Pumpkin Soup
Junior Member
Posts: 3
Karma: 10
Join Date: Aug 2012
Device: iPad
OCR to EPUB Best Workflow

Hello! I've never done OCR before, and I'd like to hear people's thoughts on the best workflow. Below is the workflow I have in mind, with some questions along the way. Please let me know if you have any ideas for improving the process, or correct me if I got something horribly wrong or missed a step. Assuming cost is not an issue, do you...

Either
1a) Build a DIY book scanner where you can flip through the book and snap photos of the pages with a camera (DSLR, output as TIFF).
or
1b) Physically remove the spine of the book (ideally with a stack paper cutter for the most accurate cut) and scan the cut-out pages in a stack scanner. What should the DPI be? Is it problematic to scan the pages into a combined PDF, or are individual TIFF pages preferable?

2) Take the scanned image/PDF files and run them through ABBYY FineReader (is any version of the software better featured than the others?). If you have a complicated book (with graphic elements you would like to remove) that you scanned into PDF, would ABBYY PDF Transformer be a worthwhile investment? (ABBYY products have been recommended to me the most, but please let me know if you prefer something else.)

Time to export... do you...

3a) Export to EPUB and start working directly inside the ABBYY-created EPUB
3b) Export as HTML, then manually build an EPUB from that HTML
3c) Export as a Word document or PDF (assuming you have scripts to make this process easier), and take those files into InDesign to begin making an EPUB, then export a built EPUB from InDesign and continue editing from there.
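
To make option 3b concrete: as far as I understand it, an EPUB is a ZIP archive with a fixed layout (a `mimetype` entry stored first and uncompressed, a `META-INF/container.xml` pointing at a package file, then the content). Here is a rough Python sketch of the manual route under that assumption. The function name `build_epub`, the file names, the title, and the identifier are placeholders I made up, and I have not run the result through a validator.

```python
import zipfile

# Points the reading system at the package (OPF) file.
CONTAINER_XML = """<?xml version="1.0" encoding="UTF-8"?>
<container version="1.0" xmlns="urn:oasis:names:tc:opendocument:xmlns:container">
  <rootfiles>
    <rootfile full-path="OEBPS/content.opf" media-type="application/oebps-package+xml"/>
  </rootfiles>
</container>
"""

# Package file: metadata, manifest (every file), spine (reading order).
# Title, identifier, and date are placeholders.
CONTENT_OPF = """<?xml version="1.0" encoding="UTF-8"?>
<package xmlns="http://www.idpf.org/2007/opf" version="3.0" unique-identifier="book-id">
  <metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
    <dc:identifier id="book-id">urn:uuid:00000000-0000-0000-0000-000000000000</dc:identifier>
    <dc:title>Placeholder Title</dc:title>
    <dc:language>en</dc:language>
    <meta property="dcterms:modified">2013-02-20T00:00:00Z</meta>
  </metadata>
  <manifest>
    <item id="nav" href="nav.xhtml" media-type="application/xhtml+xml" properties="nav"/>
    <item id="chapter1" href="chapter1.xhtml" media-type="application/xhtml+xml"/>
  </manifest>
  <spine>
    <itemref idref="chapter1"/>
  </spine>
</package>
"""

# Navigation document (table of contents).
NAV_XHTML = """<?xml version="1.0" encoding="UTF-8"?>
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:epub="http://www.idpf.org/2007/ops">
<head><title>Contents</title></head>
<body>
<nav epub:type="toc"><ol><li><a href="chapter1.xhtml">Chapter 1</a></li></ol></nav>
</body>
</html>
"""

# Wrapper for the cleaned-up HTML exported from the OCR software.
CHAPTER_TEMPLATE = """<?xml version="1.0" encoding="UTF-8"?>
<html xmlns="http://www.w3.org/1999/xhtml">
<head><title>Chapter 1</title></head>
<body>
{body}
</body>
</html>
"""


def build_epub(out_path, chapter_body):
    """Write a one-chapter EPUB to out_path from an XHTML body fragment."""
    with zipfile.ZipFile(out_path, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        # The mimetype entry must come first and must not be compressed.
        zf.writestr("mimetype", "application/epub+zip",
                    compress_type=zipfile.ZIP_STORED)
        zf.writestr("META-INF/container.xml", CONTAINER_XML)
        zf.writestr("OEBPS/content.opf", CONTENT_OPF)
        zf.writestr("OEBPS/nav.xhtml", NAV_XHTML)
        zf.writestr("OEBPS/chapter1.xhtml",
                    CHAPTER_TEMPLATE.format(body=chapter_body))
```

Calling `build_epub("book.epub", "<p>First paragraph of the OCR output.</p>")` would write a single-chapter file. A real book would need one manifest and spine entry per chapter, plus the stylesheet and any images.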

Then, finished EPUB!

What are your thoughts? What is your preferred route? Thanks.