10-05-2015, 07:24 AM   #19
CalibUser
I looked at the pulps.ini file in the previous post and was surprised to see that many of the errors it lists also turn up when I OCR old publications. In my view this reflects poorly on the quality of some OCR software.

A few comments about pulps.ini:

1. It includes some words that could be correct, e.g. modem (although old pulp publications are unlikely to include this word). I think it is easier to look for these words manually than to have them corrected automatically and then search through the corrected text for places where they were changed incorrectly and need changing back; however, this is a matter of preference.

2. It does not allow corrections where the error is next to a punctuation mark, e.g. it will correct " bn " to " on " but will not correct " bn." (the same error followed by a full stop).

3. It does not allow for replacing apostrophes with the appropriate symbol (' or ’).
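
To illustrate point 2, here is a minimal Python sketch of matching on word boundaries instead of surrounding spaces, so that an error followed by punctuation is still caught. It is not how pulps.ini or my plugin actually works; the CORRECTIONS table and the function names are made up for the example.

```python
import re

# Hypothetical correction table (OCR error -> intended word), for illustration only.
CORRECTIONS = {"bn": "on", "tbe": "the"}

def build_pattern(corrections):
    # \b matches at a word boundary, so "bn." and "bn," match as well as " bn ".
    words = sorted(corrections, key=len, reverse=True)
    return re.compile(r"\b(" + "|".join(re.escape(w) for w in words) + r")\b")

def correct(text, corrections=CORRECTIONS):
    pattern = build_pattern(corrections)
    # Replace each matched error with its correction; everything else is untouched.
    return pattern.sub(lambda m: corrections[m.group(1)], text)

print(correct("He sat bn tbe chair. The lamp was bn."))
```

Because the match is on whole words, "bn" inside a longer word is left alone. The match is also case-sensitive, which matters for point 1: a word that could be correct can simply be left out of the table and checked by hand.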

My plugin at https://www.mobileread.com/forums/sho...d.php?t=264378 overcomes the second problem and, to some extent, the third. At the moment, words with apostrophes have to be put into the plugin itself so that the correct apostrophe can be inserted in the ePub; I will work on a way to let apostrophes in the external word file be inserted in the text in the correct form. Currently my plugin will use the correct apostrophe when it corrects the errors: Td, Tve, Til, Fve, Fm, Tm, Vm, IVe and Fd.
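
As a rough sketch only (this is not the plugin's code), the Python below shows one way such corrections could insert a chosen apostrophe. The target contractions in APOSTROPHE_FIXES are my assumed readings of the OCR errors listed above, and fix_contractions and its apostrophe parameter are hypothetical names.

```python
import re

# Assumed readings of the OCR errors; written with a plain ' as a placeholder.
APOSTROPHE_FIXES = {
    "Td": "I'd", "Fd": "I'd",
    "Tve": "I've", "Fve": "I've", "IVe": "I've",
    "Til": "I'll",
    "Fm": "I'm", "Tm": "I'm", "Vm": "I'm",
}

_PATTERN = re.compile(
    r"\b(" + "|".join(re.escape(w) for w in APOSTROPHE_FIXES) + r")\b"
)

def fix_contractions(text, apostrophe="\u2019"):
    # Swap the placeholder ' for the requested apostrophe (default: typographic ’).
    def replace(match):
        return APOSTROPHE_FIXES[match.group(1)].replace("'", apostrophe)
    return _PATTERN.sub(replace, text)

print(fix_contractions("Tve seen it, and Fm sure Til go."))
```

Passing apostrophe="'" gives the straight form instead. The match is case-sensitive and on whole words, so "Till" and "FM" are not changed.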

I have taken the liberty of copying many of the incorrectly spelt words and their corrections from your ini file into a file that can be read by my plugin. I have included some words with apostrophes from your list; for these, my plugin currently uses whatever apostrophe the list provides rather than the most appropriate form.