Old 11-08-2015, 07:59 AM   #116
CalibUser
Quote:
Originally Posted by Steadyhands View Post
Thanks for a great plug in.
You're welcome. I'm glad you find it useful.

Quote:
Originally Posted by Steadyhands View Post
For years now I've been slowly building a set of saved searches to tidy up epubs. I think there are a few in my total list that could be added to ePubTidyTool. I've got searches for Joining Paragraphs, Split Names (Mr., Mrs., etc.), Broken or Split Speech, and Common OCR Spelling Mistakes.

If you are interested I can send my sigil_searches.ini?
Thanks for offering to contribute your own searches; I would be very interested in incorporating them into the code for the plugin.

The plugin already contains code for some of the things you mentioned (e.g. Joining Paragraphs and Split Names such as Mr., Mrs., etc.). However, if your searches improve on that code, or if they can, for example, join paragraphs that the plugin does not currently handle, then I would be very keen to include them.

Split speech has been problematic; I have not had time to develop code that can cope with it. At present I use a few manual search-and-replace regular expressions for this (not yet included in the plugin), but I would like to automate the process. I would like to adapt your expressions for fixing split speech, particularly if they allow it to be automated.

I have looked at your sample contractions. Many of these could go in the file that holds the customised list of words to be corrected automatically. The exceptions are the contractions that use the pipe (|) character, because the pipe separates the incorrect word from the correct word in that list. I need to consider an alternative separator for this file so that the pipe can be used in expressions. Can anybody see a problem with using the character ¬ as the separator (other suggestions welcome)?
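
To illustrate the separator problem, here is a minimal Python sketch. It is not the plugin's actual code: the function name parse_word_list, the one-entry-per-line layout and the sample entries are assumptions made for the example.

```python
def parse_word_list(lines, separator="|"):
    """Return (incorrect, correct) pairs from word-list lines.

    Hypothetical helper: assumes one 'incorrect<separator>correct'
    entry per line and skips blank lines.
    """
    pairs = []
    for line in lines:
        line = line.rstrip("\n")
        if not line.strip():
            continue
        # Split on the first occurrence of the separator only.
        incorrect, found, correct = line.partition(separator)
        if not found:
            raise ValueError(f"no separator {separator!r} in line: {line!r}")
        pairs.append((incorrect, correct))
    return pairs


# A plain word pair works with the pipe separator.
print(parse_word_list(["did'nt|didn't"]))

# An expression that itself contains a pipe is cut in the wrong place.
print(parse_word_list([r"(ca|wo)n;t|\1n't"]))

# With a different separator the same expression survives intact.
print(parse_word_list([r"(ca|wo)n;t¬\1n't"], separator="¬"))
```

One thing to check if ¬ is chosen: it is not an ASCII character, so the word list would have to be saved and read with the same encoding (for example UTF-8) on every platform.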

Before I add any more features I would like to rewrite the plugin so that it uses the facilities provided by Sigil 0.9; I will not be able to start on this before next weekend! Meanwhile, could you post a file in the format described in the section 'Using a customised list of words that are corrected automatically' of the plugin's manual? It should contain (1) common OCR spelling mistakes from your searches and (2) corrections to contractions (and anything else) that do not use the pipe character. I can then append it to the file IncorrectWords.txt in the first post for other users to use.