I'm incompetent at regex, so I have a fairly laborious procedure, which I run in Notepad++ after any necessary scanning/OCR and cleaning up of line breaks. (I prefer double quotes for speech, and single quotes for abbreviations, apostrophes etc.) The steps are:
- insert <p> at start of first line;
- change all carriage-return/new-lines to </p>\r\n\r\n<p>;
- insert </p> at end of last line;
- change all <p>" to <p>“;
- change all "</p> to ”</p>;
- change all ^"space to ^”space (where ^ stands for a stop, comma, query or bang);
- change all ^space" to ^space“ (where ^ stands for a stop, comma, query, bang, colon or semi-colon);
- by now the instances of space-quote and quote-space should be few enough to search/replace individually with double or single quotes as required (several passes may be needed);
- run through, tracking down the last few instances of quotes, then do a mega replace of the remaining single quotes with ’ for the abbreviations;
- sort out the en dashes and ellipses.
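For anyone who would rather script it, the mechanical steps above might be sketched in Python roughly as follows. This is only a sketch under assumptions: it expects one paragraph per line (i.e. the line breaks are already cleaned up), the function names are made up for illustration, and it skips the manual passes and the dash/ellipsis step, which still need a human eye.

```python
import re

# Curly quote characters, written as escapes so the script survives any encoding.
LDQUO, RDQUO, RSQUO = "\u201c", "\u201d", "\u2019"

def wrap_paragraphs(text: str) -> str:
    """<p> at the start, </p> at the end, and every line break
    turned into </p>, a blank line, then <p>."""
    body = re.sub(r"\r?\n", "</p>\r\n\r\n<p>", text.strip())
    return "<p>" + body + "</p>"

def curl_double_quotes(html: str) -> str:
    """The four bulk replacements for double quotes."""
    html = html.replace('<p>"', "<p>" + LDQUO)      # quote opening a paragraph
    html = html.replace('"</p>', RDQUO + "</p>")    # quote closing a paragraph
    # stop/comma/query/bang, then quote, then space: a closing quote
    html = re.sub(r'([.,?!])" ', lambda m: m.group(1) + RDQUO + " ", html)
    # punctuation (now including colon and semi-colon), space, quote: an opening quote
    html = re.sub(r'([.,?!:;]) "', lambda m: m.group(1) + " " + LDQUO, html)
    return html

def curl_apostrophes(html: str) -> str:
    """The final 'mega replace'; only safe once no single-quoted speech remains."""
    return html.replace("'", RSQUO)

sample = 'He said, "Hello." She nodded.\n"Don\'t go," he said.'
html = curl_double_quotes(wrap_paragraphs(sample))
print(html.count('"'))  # straight double quotes still to fix by hand: 0 for this sample
print(curl_apostrophes(html))
```

The count of leftover straight quotes tells you how much hand work remains. Cases such as a closing quote immediately followed by a new opening quote are not caught by these rules, just as they are not caught by the manual replacements.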
Tedious, but it gets me there in the end. A typical SF book of 8 signatures takes me 3 to 6 hours to read, correct and edit, i.e. from OCR-produced text file through to Sigil-ready HTML.
I find it pays to use named entities - it's particularly helpful when converting a text that uses single quotes for direct speech into one that uses double quotes. I suspect there are various magic formulae in regex which could do the job as well. If I can find a few spare brain cells one day, I may try going down that route.
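If the file already contains the curly characters, swapping them for named entities is a simple lookup. A minimal sketch, assuming the standard HTML entity names and a made-up function name; whether your target format accepts named entities is something to check for yourself.

```python
# Map each typographic character to its standard HTML named entity.
ENTITIES = {
    "\u201c": "&ldquo;",   # left double quote
    "\u201d": "&rdquo;",   # right double quote
    "\u2018": "&lsquo;",   # left single quote
    "\u2019": "&rsquo;",   # right single quote / apostrophe
    "\u2013": "&ndash;",   # en dash
    "\u2026": "&hellip;",  # ellipsis
}
_TABLE = str.maketrans(ENTITIES)

def to_named_entities(html: str) -> str:
    """Replace every character listed in ENTITIES with its named entity."""
    return html.translate(_TABLE)

print(to_named_entities("\u201cDon\u2019t\u2026\u201d"))
```

With entities in place, each kind of quote is spelled out in the source, so it is easy to search for one kind without confusing it with another in the editor.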
Bottom line - no easy solution.