Suppressing <br /> tags only in "body text" style.
Could there be a way to remove the <br /> tags only when they are inside a "body text" paragraph?
Rationale:
After using a new (and not perfect) OCR program, I found that my recognized text was interspersed with a lot of <br /> tags (soft hyphens?). I usually insert the HTML file into OpenOffice and clear all formatting to begin with. Even so, these resilient tags survived.
It is not that bad. Some poems or songs are thus nicely transcribed. On the other hand, I have to clean these tags for many standard paragraphs of text.
Sigil provides a simple way out. The user can choose either to remove every one of them, good and bad, or to suppress the useless tags selectively and patiently...
There could be a better one.
Give your songs or poems their own style, keep standard text in its "body text" class and then launch the following Regex...
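The same idea can also be sketched outside Sigil. The Python below is only an illustration under assumptions, not the regex referred to above: the class name "body-text" and the function name strip_br_in_body are hypothetical, and it assumes each paragraph is a <p> element with a double-quoted class attribute. It replaces every <br /> inside "body text" paragraphs with a single space and leaves all other paragraphs (poems, songs) untouched.

```python
import re

# Hypothetical class name; use whatever your stylesheet calls the body style.
BODY_CLASS = "body-text"

# A whole <p ...>...</p> element; group 1 captures the class attribute value.
PARAGRAPH = re.compile(
    r'<p\b[^>]*\bclass="([^"]*)"[^>]*>.*?</p>',
    re.DOTALL | re.IGNORECASE,
)

# <br>, <br/> or <br />, together with any whitespace around it.
LINE_BREAK = re.compile(r'\s*<br\s*/?>\s*', re.IGNORECASE)


def strip_br_in_body(html, body_class=BODY_CLASS):
    """Replace <br /> tags with a space, but only in paragraphs of body_class."""

    def clean(match):
        classes = match.group(1).split()
        if body_class not in classes:
            # Poems, songs and other styled paragraphs keep their breaks.
            return match.group(0)
        return LINE_BREAK.sub(" ", match.group(0))

    return PARAGRAPH.sub(clean, html)


if __name__ == "__main__":
    sample = (
        '<p class="body-text">one<br />two</p>\n'
        '<p class="poem">rose<br />red</p>'
    )
    print(strip_br_in_body(sample))
```

Each break is joined with a space, so a word that the OCR split with a hyphen at the end of a line would still need a separate pass. Try it on a copy of the file first.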