09-13-2014, 02:34 AM   #26
Tex2002ans
Quote:
Originally Posted by shevirsy
But if you say "we got the message already" would you PLEASE answer the ones I asked, before you answer the ones I haven't asked?
I already linked to a Wikipedia article comparing many different OCR programs in Post #13 of this topic:

https://www.mobileread.com/forums/sho...2&postcount=13

Here is the Wikipedia link again:

https://en.wikipedia.org/wiki/Compar...ition_software

Most likely the only free OCR engine of note is Tesseract, and most of the free OCR programs out there use a (most likely outdated) version of Tesseract as their backend.

I already explained many of the disadvantages of the free solutions above, although you are free to read the Tesseract documentation and do the training/tweaking needed yourself.
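If you do go the Tesseract route, the first tweak is usually just telling it which language pack to use. Here is a rough sketch of that, assuming Tesseract and the matching language data (for example `fra` for French) are already installed. The helper names and the file names are made up for illustration, and this is only the basic command-line call, not any kind of training setup:

```python
import shutil
import subprocess


def build_tesseract_command(image_path, output_base, lang="eng"):
    """Return the argument list for a basic Tesseract run.

    Hypothetical helper for illustration; it is not part of Tesseract.
    """
    # Basic CLI form: tesseract <image> <output base> -l <language code>
    return ["tesseract", image_path, output_base, "-l", lang]


def run_ocr(image_path, output_base, lang="eng"):
    """Run Tesseract if it is installed; return True when it exits cleanly."""
    if shutil.which("tesseract") is None:
        return False  # Tesseract is not on PATH, so nothing was run
    cmd = build_tesseract_command(image_path, output_base, lang)
    return subprocess.run(cmd).returncode == 0


if __name__ == "__main__":
    # "page.png" and "page" are placeholder names (assumptions).
    print(build_tesseract_command("page.png", "page", lang="fra"))
```

With the right language data in place, the recognized text ends up in `page.txt`. Expect to do plenty of cleanup on it afterwards.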

I personally would just err on the side of the paid OCR programs, ESPECIALLY when dealing with non-English works or works with lots of accented characters. While the proprietary OCR programs are not zero dollars up front, they will save you A TON of time in all of your post-OCR processing steps (which is where you WILL spend most of your time). The more accurate/clean you can get the OCR output, the less time (MUCH less) you will have to spend cleaning it up and getting the document into a readable state.

Besides that, you can use GIMP/Inkscape/ImageMagick to manipulate the images just fine. I prefer using free software over proprietary whenever I can, but sadly, OCR is one area where the free solutions can't hold a candle to the paid ones.

Last edited by Tex2002ans; 09-13-2014 at 02:37 AM.