When cleaning and parsing the text before passing it to the spell checker, it should be easy to filter out entities like &shy; and its numeric equivalents (&#173; and &#xAD;). But really, soft-hyphenating words is probably best left to the very last step, after all other changes, including spellchecking, are complete.
So I recommend removing all soft hyphens from the document with search and replace, and keeping them out until the text and epub are in an "as desired" state. Then use a hyphenation library to add soft hyphens back in if, and only if, you are producing an epub for readers that support them.
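
If you would rather script the removal than do it by hand, here is a minimal sketch in Python. It assumes you have already read each XHTML file of the epub into a string; the function name strip_soft_hyphens is just an illustration, not part of any tool. It removes the raw U+00AD character as well as the named and numeric entity forms.

```python
import re

# Matches the raw soft hyphen character (U+00AD), the named entity &shy;,
# and the decimal (&#173;) and hexadecimal (&#xAD;) numeric references,
# allowing for leading zeros in the numeric forms.
SOFT_HYPHEN_RE = re.compile(r"\u00ad|&shy;|&#0*173;|&#[xX]0*[aA][dD];")


def strip_soft_hyphens(markup: str) -> str:
    """Return markup with every soft hyphen removed (hypothetical helper)."""
    return SOFT_HYPHEN_RE.sub("", markup)


if __name__ == "__main__":
    sample = "<p>hy&shy;phen&#173;ate&#xAD;d te\u00adxt</p>"
    print(strip_soft_hyphens(sample))
```

Other entities such as &amp; are left alone, so the same pass should be safe to run before spellchecking and again just before the final hyphenation step.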