Old 01-26-2012, 03:50 PM   #1
1611mac
eWanderer
Posts: 520
Karma: 1441998
Join Date: Jul 2010
Location: NC, USA
Device: iMac,iPad3,iPhone5-Kindle Fire,Touch,PaperWhite
Report on Abbyy FineReader OCR Software w/ Canon Lide 60

SubTitle: In dealing with Books, it's not the scanner that's important, it's the OCR SOFTWARE!

I have experience creating ebooks (PDF, MOBI, EPUB), but to date my source has been clean text files supplied to me by clients. The other day I got the urge to try creating suitable text by scanning a pBook with my old Canon Lide 60 scanner, so...

I obtained a copy of ABBYY FineReader Express for Mac and installed it on my iMac (Lion, OS X 10.7.2).

I fired up ABBYY FineReader and within minutes I was scanning. After less than 20 minutes of "learning time" I was scanning multiple pages (cut from the book). The software is intuitive enough that I did not have to consult the instructions or help.

With the Lide 60 it takes about 11-12 seconds to scan a 5.5x8 page. After the initial setup to adjust the scan, you can just hit "scan" for every page, and the scanned image of each page appears in a pane (see attached pic).
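To put the 11-12 seconds per page in perspective, here is a rough back-of-the-envelope sketch. The 300-page book length is only an assumed example, and the function name is made up for illustration. It counts scanner time only and ignores the time spent swapping pages on the glass.

```python
def scan_time_minutes(pages: int, seconds_per_page: float) -> float:
    """Return pure scanner time in minutes, ignoring page handling."""
    return pages * seconds_per_page / 60


# 11-12 seconds per page is the figure observed on the Lide 60.
# The 300-page book is an assumed example length.
low = scan_time_minutes(300, 11)
high = scan_time_minutes(300, 12)
print(f"{low:.0f}-{high:.0f} minutes")  # prints "55-60 minutes"
```

So at this speed a book of that size would need roughly an hour of scanner time, plus handling.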

Scan as many pages as you like, then hit the "convert" button in FineReader Express and the scan is OCR'd to text. The output from the OCR conversion is a single RTF file. My six-page scan came to 2066 words. I then opened the RTF file in Pages (Apple's Word-compatible word processor). I have yet to find one error, and I've looked at the entire text.

The text even had proper paragraph indents and proper BlockQuote indents (left and right margins/padding)!

Perfect multipage scans loaded into my favorite text editor with almost no "learning curve" at all! Lesson: a good OCR program is worth the cost!

Attached is a screen shot of FineReader (Mac version) with six pages scanned. (Pic is reduced 40%)

Last edited by 1611mac; 01-26-2012 at 03:52 PM.
Old 01-27-2012, 05:00 AM   #2
Iain
Enthusiast
Posts: 49
Karma: 14
Join Date: Jul 2010
Location: Harrogate, England
Device: iPad
Fine Reader

Fine Reader is indeed a good piece of software.

However, the results depend greatly on the source material. My scans (around 500 books) have been of normal format paperbacks and the quality ranges from near perfect to near unreadable.

There are a number of errors which happen frequently, especially with small fonts: 'I' being replaced by '1', 'tl' being replaced by 'd', and exclamation marks being replaced by various glyphs.
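Substitutions like 'I' to '1' can at least be flagged for manual review after the OCR pass. The following is a minimal sketch, not anything FineReader itself provides: the function name and the regex are assumptions for illustration, and it only covers the digit-for-letter case. Catching 'tl' read as 'd' would need a dictionary check, which is not shown.

```python
import re
from typing import List

# Heuristic: a digit 1 touching a letter (e.g. "th1nk", "1t") is suspicious.
# It will also flag legitimate tokens such as "1st", so review the results by hand.
SUSPECT_DIGIT_ONE = re.compile(r"(?<=[A-Za-z])1|1(?=[A-Za-z])")


def flag_suspect_words(text: str) -> List[str]:
    """Return whitespace-separated words that mix letters with the digit 1."""
    return [word for word in text.split() if SUSPECT_DIGIT_ONE.search(word)]


print(flag_suspect_words("1 th1nk 1t was 1912"))  # prints ['th1nk', '1t']
```

A standalone "1" and a plain number like "1912" are left alone because no letter is adjacent to the digit.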

Additionally, if you have accented characters (and select languages in addition to English), you find rather more accented characters in the OCR output than in the original!

This is not a criticism - most books are more than readable, simply an acknowledgement that the product is not perfect!
Old 01-27-2012, 12:13 PM   #3
1611mac
Quote:
Originally Posted by Iain View Post
This is not a criticism - most books are more than readable, simply an acknowledgement that the product is not perfect!
I did not intend to suggest that the product could do the impossible by making perfect text out of bad scanning procedures, poor original text, unusual input, etc.

What I intended to convey was that the product is CAPABLE of reproducing original copy, that with good clean original copy good results are possible (fast workflow, intuitive interface, etc.), and that paid software can sometimes be worth it (as opposed to freeware/shareware).

For the record, my original text was printed a bit heavy (ink) and was blurry; it was far from being a "perfect" page to scan.

Last edited by 1611mac; 01-27-2012 at 12:16 PM.
Old 01-27-2012, 12:23 PM   #4
1611mac
Quote:
Originally Posted by Iain View Post
Additionally, if you have accented characters (and select languages in addition to English), you find rather more accented characters in the OCR output than in the original!
Yes. I read a lot of 18th-century material, and I think my scans of it won't turn out too well (v's for u's, f's for s's, etc.).
Old 01-27-2012, 04:46 PM   #5
adv_dp_fan
Connoisseur
Posts: 96
Karma: 57138
Join Date: May 2010
Device: Sony 505, iPad 1 & 3, Galaxy Note 8.1
11-12 seconds to scan a page? Ouch! I'd never get a book done.
Old 01-27-2012, 06:30 PM   #6
1611mac
Quote:
Originally Posted by adv_dp_fan View Post
11-12 seconds to scan a page? Ouch! I'd never get a book done.
I used what I had on hand. 11-12 seconds a page is better than having no ebook at all. I have not built my two-camera auto system yet...
Old 01-27-2012, 07:05 PM   #7
DDHarriman
Guru
Posts: 854
Karma: 1200
Join Date: Feb 2008
Location: Almada, Portugal
Device: Cybook Gen3, Sony PRS 505, Kindle DXG and Samsung Galaxy Note
Hello

Agree, one plays with the cards one has…