MobileRead Forums - View Single Post - Any suggestions for workflow improvement?

crackhammer · 03-09-2012, 03:11 PM

Hello folks,

My current workflow includes the following steps:
- Scan on an Epson Perfection V300 Photo (as 300 dpi color TIFF images)
- ScanTailor to trim and perfect the scanned images (exported as 300 dpi TIFF images)
- Acrobat to assemble the pages into a PDF and OCR it (as a searchable image)

I understand that Acrobat OCR is not the best but my primary goal is to be able to highlight the text. I have hardly ever copied and pasted scanned book text anywhere for any purpose so Acrobat does the job for me.

The issue I sometimes face is the large size of the output file. Does anyone have recommendations for changing the workflow so that I get a reasonable file size with good image quality? I searched up and down the Acrobat forums and tweaked options, but that didn't yield much, so I thought maybe I should play with the input files instead. I don't know much about images, so I am asking the question here.
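As a rough illustration of where the size comes from, the sketch below estimates the uncompressed size of a single 300 dpi page at different bit depths. The 6 x 9 inch page size and the function name are assumptions for illustration only, not taken from the actual scans, and real TIFF and PDF sizes will be smaller depending on the compression used.

```python
def uncompressed_bytes(width_in, height_in, dpi, bits_per_pixel):
    """Raw pixel data size in bytes for one page, before any compression."""
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    return width_px * height_px * bits_per_pixel // 8


# Hypothetical page size in inches (an assumption, not from the post).
PAGE = (6.0, 9.0)
DPI = 300

color = uncompressed_bytes(*PAGE, DPI, 24)  # 24-bit color
gray = uncompressed_bytes(*PAGE, DPI, 8)    # 8-bit grayscale
bw = uncompressed_bytes(*PAGE, DPI, 1)      # 1-bit black and white

for name, size in [("color", color), ("grayscale", gray), ("black/white", bw)]:
    print(f"{name:12s} {size / 1_000_000:6.2f} MB per page")
```

Under these assumptions a color page carries 24 times as much raw data as a black-and-white one, so for text-only pages the bit depth of the images going into Acrobat is likely the biggest lever on the final file size.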

(P.S. I installed tesseract-ocr from Google Code but couldn't figure out how to use it. Any ideas? I hope it doesn't need knowledge of programming.)
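For reference, Tesseract is a command-line program: its basic form is `tesseract <image> <output base> -l <language>`, which writes the recognized text to `<output base>.txt`. A minimal sketch of driving it from a script follows, assuming the `tesseract` executable is on the PATH; the helper names and the file name `page_001.tif` are hypothetical.

```python
import shutil
import subprocess


def tesseract_command(image_path, output_base, lang="eng"):
    """Build the basic Tesseract command line for one image."""
    return ["tesseract", image_path, output_base, "-l", lang]


def run_ocr(image_path, output_base, lang="eng"):
    """Run Tesseract on one image and return the path of the text file it writes."""
    if shutil.which("tesseract") is None:
        raise RuntimeError("tesseract executable not found on PATH")
    subprocess.run(tesseract_command(image_path, output_base, lang), check=True)
    return output_base + ".txt"


# Hypothetical file name; the equivalent typed command would be:
#   tesseract page_001.tif page_001 -l eng
cmd = tesseract_command("page_001.tif", "page_001")
print(" ".join(cmd))
```

No programming is strictly needed: the same command can be typed directly into a terminal or command prompt, one image at a time.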
