This may be of assistance - it's not very elegant (in fact it's totally lacking in elegance), but it works, eventually, for me:
In your favoured text editor (mine is NoteTab Lite) perform the following steps (note that the $ character is being used here to denote a space, and ^P is NoteTab's code for a line break):
- replace all instances of ^P^P^P with ^P^P (triple para to double para)
- replace all instances of ^P^P with ||
- replace all instances of ^P with $
- replace all instances of || with </p>^P^P<p>
- add a leading <p> at start of document and a </p> at the end
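For anyone who would rather script the paragraph steps above than run them by hand, here is a rough Python sketch. It is only an illustration under a few assumptions: the function name `wrap_paragraphs` is made up, line breaks are treated as `\n` (after normalising `\r\n`), any run of three or more breaks is collapsed rather than just triples, and the text is split on blank lines instead of using the || placeholder.

```python
import re


def wrap_paragraphs(text: str) -> str:
    """Join hard-wrapped lines and wrap each paragraph in <p>...</p>."""
    # Normalise Windows and old Mac line endings to a single "\n".
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    # Collapse runs of three or more line breaks to a double break.
    text = re.sub(r"\n{3,}", "\n\n", text)
    # Drop leading/trailing breaks so no empty paragraph is produced.
    text = text.strip("\n")
    # A double break separates paragraphs; a single break becomes a space.
    paragraphs = [p.replace("\n", " ") for p in text.split("\n\n")]
    # Re-join with </p>, blank line, <p>, plus the leading and trailing tags.
    return "<p>" + "</p>\n\n<p>".join(paragraphs) + "</p>"
```

Calling `wrap_paragraphs(raw_text)` on the raw text returns the tagged version; lines that end in a trailing space will come out with a double space, which would still need a separate clean-up pass.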
You should now have a text that is neatly broken into paragraphs with zero hard line endings. Then:
- replace <p>' with <p>“
- replace '</p> with ”</p>
- replace .'$ with .”$
- ditto for comma, colon, semi-colon, query and bang followed by '$
- replace $' (or $" depending on text) with $‘
- in the same way, replace '$ (or "$) with ’$
The previous two steps should clean up most quoted phrases (eg 'Marie Celeste') and plural possessives (eg survivors').
- replace .$' with .$“
- ditto for comma, colon, semi-colon, query and bang followed by $'
- should be fairly safe now to replace all remaining instances of ' with ’
This is not bullet-proof, but does clean up the text reasonably well. I generally reckon on 1 to 2 hours doing the above plus other odds and sods before moving the text into Sigil, where the HTML entities will be translated into 'proper' text.
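The quote replacements can be sketched in Python along the same lines. Again this is only an illustration with assumptions: `curl_quotes` is a made-up name, it handles straight single quotes only (not the " variant), and it runs the "punctuation, space, quote" rule before the general "space, quote" rule, because otherwise the general rule would catch those cases first. It is no more bullet-proof than the manual steps.

```python
def curl_quotes(text: str) -> str:
    """Replace straight single quotes with curly quotes, rule by rule."""
    open_d, close_d = "\u201c", "\u201d"  # curly double quotes
    open_s, close_s = "\u2018", "\u2019"  # curly single quotes
    marks = ".,:;?!"

    # Quotes at the very start and end of a paragraph are dialogue quotes.
    text = text.replace("<p>'", "<p>" + open_d)
    text = text.replace("'</p>", close_d + "</p>")
    # Punctuation, quote, space: the close of a piece of dialogue.
    for mark in marks:
        text = text.replace(mark + "' ", mark + close_d + " ")
    # Punctuation, space, quote: the start of a piece of dialogue.
    for mark in marks:
        text = text.replace(mark + " '", mark + " " + open_d)
    # Space, quote and quote, space: quoted phrases and plural possessives.
    text = text.replace(" '", " " + open_s)
    text = text.replace("' ", close_s + " ")
    # Whatever is left is treated as an apostrophe.
    return text.replace("'", close_s)
```

Run it on the output of the paragraph step, since the first two rules rely on the `<p>` and `</p>` tags being in place, and check the result by eye afterwards.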