Old 01-10-2012, 07:00 AM   #1
Cedric48
Junior Member
Posts: 1
Karma: 10
Join Date: Jan 2012
Device: none
ABBYY Finereader & Epson Scanner Problems

I am having problems which I believe Epson should assist me with, but they just blame Finereader and won't help me at all.

My Epson V500 scanner came with Finereader v6. I am scanning several books of 80-year-old carbon copies of correspondence, using the options offered by the Epson software to save in PDF and "create searchable text" (the latter is the default for save to PDF). I scan 10 pages, which apparently are held in temporary image files, then choose save. That triggers an OCR process, and the result is a PDF that displays the original page images with a text layer underlying them. No dialogue options are presented during this OCR & save process - all the dialogue occurs prior to commencing the scan, apart from the preview and edit options during scanning. Considering that these are old typewriter carbon copies with variable clarity of characters, the first 100 pages of this process gave surprisingly good results - most text was recognised.

But now, increasingly, several pages in a document of 10 pages are saved in landscape orientation, despite the orientation being set to portrait. Of course the OCR process fails to recognise characters that are rotated 90°.

Epson claim it is Finereader that is doing this page rotation, hence they won't offer support. Does anyone know if it is the Epson scanner software that is rotating the image, or is it Finereader?

And does anyone know how to avoid this problem please?

My scan settings are grayscale, 300dpi, page size: actual image size; Orientation: portrait; Page number: save file with all pages; compression level: standard; Text setting: Yes; and I nominate a prefix, a starting number and directory to save it to. I am attaching a successful scan.
Attached Files
File Type: pdf Example.pdf (820.7 KB, 567 views)
Old 01-10-2012, 08:58 AM   #2
DSpider
Evangelist
Posts: 450
Karma: 343115
Join Date: Nov 2009
Location: Romania
Device: PW2 2014
FineReader 6 is very old software. FineReader 10 came out in 2009, and version 11 is the latest right now. I just want you to know that "one button" solutions and exporting straight to PDF right from the scanner are a bad idea. Instead, here's what I'd suggest, using Scan Tailor, FineReader 11 and Adobe Acrobat X (even though the scans will initially take up a lot of space):
  1. scan as TIFF or PNG, 300 dpi, grayscale
  2. run them through Scan Tailor with the output mode set to Color/Grayscale + White margins + Equalize illumination (or see the note at the bottom if search accuracy isn't terribly important), and sort the pages by width and height to pick out the odd ones, so that all pages match each other well. Try to get the chapter titles aligned too, so that they don't start from the very top (the default)
  3. first change the settings in FineReader to use the original images instead of applying compression - because we'll be applying compression soon and compressing an ALREADY compressed image would make the artefacts from the first compression pop even more (not to mention if the original scans were JPGs instead of TIFF... then FineReader would apply compression, Acrobat too on top of that... it'd be like a triple kick in the groin)
  4. drag the Scan Tailor-processed images in FineReader 11 and press the Read button (by default it does this automatically)
  5. export as PDF
  6. change the Adobe Acrobat X settings to compress the images to your liking (here you need to have some general knowledge of how image compression works) and save as Reduced PDF, then Optimized PDF
This is the quick and dirty method. The quality method would be to proofread in FineReader, export as .docx (as "Formatted Text"), track down the fonts, spend time vectorizing the covers and any other graphics (figures, graphs, charts, cartoon-ish drawings, etc), do the layout in Word 2010, and proofread the final product. This takes significantly more time, but the output is usually of very high quality. It's a pleasure to read such a book.
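
For anyone who wants to do the "sort by width and height" check from step 2 outside Scan Tailor, here is a rough Python sketch of the idea. It is not how Scan Tailor or FineReader do it; the function names, the 5% tolerance and the sample pixel sizes are all made up for illustration. It only works on a list of (width, height) values, so you would still need to read the real dimensions from your TIFF/PNG files with whatever image tool you prefer.

```python
from statistics import median

def find_odd_pages(sizes, tolerance=0.05):
    """Return indices of pages whose width or height deviates from the
    median page by more than `tolerance` (a relative fraction).

    sizes: list of (width, height) in pixels, one entry per scanned page.
    The 5% default is an arbitrary assumption; tune it for your scans.
    """
    med_w = median(w for w, _ in sizes)
    med_h = median(h for _, h in sizes)
    odd = []
    for i, (w, h) in enumerate(sizes):
        off_w = abs(w - med_w) / med_w > tolerance
        off_h = abs(h - med_h) / med_h > tolerance
        if off_w or off_h:
            odd.append(i)
    return odd

def is_landscape(size):
    """A page is treated as landscape when it is wider than it is tall."""
    w, h = size
    return w > h

# Hypothetical pixel sizes for five scanned pages (not real data).
pages = [(2550, 3300), (2548, 3302), (3300, 2550), (2551, 3299), (2400, 3300)]

odd = find_odd_pages(pages)
rotated = [i for i in odd if is_landscape(pages[i])]

print("Pages to inspect:", odd)
print("Pages that look rotated:", rotated)
```

The same check would also flag the pages that come out in landscape in the original question, since a page rotated by 90° has its width and height swapped.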


Note: Using the "Black and White" output mode in Scan Tailor will produce much smaller files at the possible expense of OCR accuracy (FineReader already has its own filtering method, meaning that too much post-processing could interfere with the recognition process). Depending on how much you want or need the document to be searchable, you could very well go with Black and White. I'd suggest using the settings from #2 and proofreading the document in FineReader to have the best of both worlds (cleaner fonts and accurate searches).
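
To give a feel for why Black and White output is so much smaller, here is a back-of-the-envelope Python sketch comparing raw (uncompressed) page sizes at 8 bits per pixel (grayscale) and 1 bit per pixel (black and white). The US Letter page size and the function name are assumptions for illustration only; real TIFF/PNG/PDF files are compressed, so actual sizes on disk will differ, but the roughly 8:1 ratio in raw pixel data is the starting point.

```python
import math

def raw_page_bytes(width_in, height_in, dpi, bits_per_pixel):
    """Estimate the uncompressed size of one scanned page in bytes.

    Each pixel row is rounded up to a whole number of bytes, which is
    how 1-bit image rows are commonly stored.
    """
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    bytes_per_row = math.ceil(width_px * bits_per_pixel / 8)
    return bytes_per_row * height_px

# Assumed page: US Letter (8.5 x 11 in) at 300 dpi, as in step 1.
gray = raw_page_bytes(8.5, 11, 300, 8)   # grayscale, 8 bits per pixel
bw = raw_page_bytes(8.5, 11, 300, 1)     # black and white, 1 bit per pixel

print(f"Grayscale page:       {gray:,} bytes")
print(f"Black and white page: {bw:,} bytes")
print(f"Ratio: about {gray / bw:.1f} to 1")
```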

Last edited by DSpider; 01-10-2012 at 09:06 AM.
