Old 07-11-2016, 06:43 AM   #23
Doitsu
Grand Sorcerer
Quote:
Originally Posted by GrannyGrump
OCR VILLAINS:
I had a look at the documentation for the Hunspell library, which appears to have been written by a programmer who does his taxes in binary, and found out that it's possible to add custom letter replacements to get better spelling suggestions.

Replacements need to be defined in the affix file (e.g. en_US.aff for US English), a plain text file that can be edited with a programmer's editor such as Notepad++.

The format is as follows:

Code:
REP {number of following entries}
REP {OLD} {NEW}
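
If you want to check what a given .aff file already contains, the format above is simple enough to read with a few lines of Python. This is only a rough sketch: the function name parse_rep_table is made up for illustration, and it assumes the count line is the first REP line in the file and that every later REP line is an entry.

```python
def parse_rep_table(aff_text):
    """Return (declared_count, [(old, new), ...]) from the text of an .aff file."""
    declared = None
    pairs = []
    for line in aff_text.splitlines():
        fields = line.split()
        if not fields or fields[0] != "REP":
            continue  # not part of the replacement table
        if declared is None and len(fields) == 2 and fields[1].isdigit():
            declared = int(fields[1])             # header line: REP {count}
        elif len(fields) >= 3:
            pairs.append((fields[1], fields[2]))  # entry line: REP {OLD} {NEW}
    return declared, pairs


sample = "REP 3\nREP nt n't\nREP shun tion\nREP shun sion\n"
declared, pairs = parse_rep_table(sample)
print(declared, len(pairs))  # the two numbers should agree
```

Comparing the declared count with the number of entries actually found is a quick sanity check after hand-editing the file.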
For example, the original replacement section in en_US.aff looks like this:

Code:
REP 94
REP nt n't
...
...
REP shun tion
REP shun sion
REP shun cion
Based on your OCR villains list, I've created a custom list, added it after the last entry and updated the replacement count to REP 127 (94 existing entries + 33 new ones):

Spoiler:
Code:
REP e c
REP c e
REP h b
REP b h
REP H ll
REP H li
REP h li
REP hn lm
REP rn m
REP m rn
REP ri n
REP n ri
REP r f
REP m in
REP in m
REP im un
REP un im
REP n u
REP ii u
REP B R
REP R B
REP F P
REP P F
REP ih th
REP di th
REP tii th
REP tli th
REP Tm "I'm
REP U ll
REP T li
REP T il
REP vv w
REP y v
REP v y
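
If you'd rather not count entries by hand, the "append after the last entry and update the count" step can be scripted. The following is a minimal sketch, not part of Hunspell or Sigil: add_rep_entries is a hypothetical helper that works on the file's text, assumes the first REP line is the count line, and recomputes the count from the entries it actually finds.

```python
def add_rep_entries(aff_text, new_pairs):
    """Append REP entries after the last existing one and fix the count line."""
    lines = aff_text.splitlines()
    rep_idx = [i for i, line in enumerate(lines) if line.split()[:1] == ["REP"]]
    if not rep_idx:
        raise ValueError("no REP section found")
    header, last = rep_idx[0], rep_idx[-1]
    existing = len(rep_idx) - 1  # every REP line except the count line
    new_lines = [f"REP {old} {new}" for old, new in new_pairs]
    lines[last + 1:last + 1] = new_lines  # insert right after the last entry
    lines[header] = f"REP {existing + len(new_pairs)}"
    return "\n".join(lines) + "\n"


sample = "REP 2\nREP nt n't\nREP shun tion\n"
print(add_rep_entries(sample, [("hn", "lm"), ("rn", "m")]))
```

When reading and writing the real file, open it with the encoding named on its SET line; which encoding that is depends on the dictionary, so check before saving.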


With this change in place, the first suggestion for "ahnost" is no longer "stenost" but "almost", and the first suggestion for "hke" is "like" instead of "hike".
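
To see why these two entries help, here is a simplified model of what a replacement table does: try each OLD → NEW substitution at each position and keep the results that are real words. This is an illustration only, not Hunspell's actual suggestion algorithm; rep_candidates and the tiny known_words set are stand-ins I made up for the example.

```python
def rep_candidates(word, pairs):
    """Yield every string produced by one replacement at one position."""
    for old, new in pairs:
        start = word.find(old)
        while start != -1:
            yield word[:start] + new + word[start + len(old):]
            start = word.find(old, start + 1)


pairs = [("hn", "lm"), ("h", "li")]   # two entries from the custom list
known_words = {"almost", "like"}      # stand-in for a real dictionary

for typo in ("ahnost", "hke"):
    hits = [c for c in rep_candidates(typo, pairs) if c in known_words]
    print(typo, hits)
# ahnost ['almost']
# hke ['like']
```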

If you want to test my modified file:

1. Go to C:\Program Files\Sigil\hunspell_dictionaries
2. Create a backup copy of en_US.aff.
3. Overwrite en_US.aff with the attached version. (You'll need to confirm a system warning.)
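
The backup-and-overwrite steps can also be done from a script. A minimal sketch, assuming Python is available and the script runs with enough rights to write to the Sigil folder; install_aff and the .bak suffix are my own choices, not anything Sigil requires.

```python
import shutil
from pathlib import Path


def install_aff(dict_dir, new_aff, name="en_US.aff"):
    """Back up the existing affix file once, then overwrite it with new_aff."""
    target = Path(dict_dir) / name
    backup = target.with_name(target.name + ".bak")
    if not backup.exists():          # never clobber an earlier backup
        shutil.copy2(target, backup)
    shutil.copy2(new_aff, target)
    return backup
```

Called as install_aff(r"C:\Program Files\Sigil\hunspell_dictionaries", "en_US.aff") with the unzipped attachment in the current folder, it leaves the original as en_US.aff.bak so you can restore it later.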
Attached Files
File Type: zip en_US.aff.zip (14.6 KB, 364 views)

Last edited by Doitsu; 07-22-2016 at 03:17 AM.