I have a whole lot of 'Join' saved searches that I run for cleanup.
I only run the first 2 as 'Replace All'. I also adjust the selector as needed.
The tricky ones are honorifics. That search isn't perfect, but it gets 90+%.
(copied from my sigil_searches.ini)
Code:
63\Name=Cleanup/Joins/Join to upper
63\Find="([[:alpha:],][\"\x201d\xe2\x80\x9d]*)</p>\\s*<p\\b[^>]*>([A-Z\xe2\x80\x9c\"])"
63\Replace=\\1 \\2
64\Name=Cleanup/Joins/To Lower
64\Find="\\s*([a-z],*)</p>\\s+<p class=\"calibre1\">([a-z])"
64\Replace=\\1 \\2
65\Name=Cleanup/Joins/Join span Paras
65\Find="(?sm)([[:alpha:],])</span></p>\\s*<p class=\"MsoNormal1\"><span class=\"calibre5\">([a-z])"
65\Replace=\\1 \\2
66\Name=Cleanup/Joins/Upper-Upper
66\Find="([A-Z,][\"\x201d\xe2\x80\x9d]*)</p>\\s*<p\\b[^>]*>([A-Z\xe2\x80\x9c\"])"
66\Replace=\\1 \\2
67\Name=Cleanup/Joins/Trailing lower
67\Find="([a-z\\,])</p>\n\n <p class=\"calibre\\d+\">"
67\Replace="\\1 "
68\Name=Cleanup/Joins/Initials
68\Find=([A-Z]\\.)</p>\\s*<p\\b[^>]*>([\"\xe2\x80\x9c]*[A-Z])
68\Replace=\\1 \\2
69\Name=Cleanup/Joins/RTGlwrUPR
69\Find=([a-z])([A-Z])
69\Replace=
70\Name=Cleanup/Joins/Join lower dehyphen
70\Find="([[:alpha:],]\x9d*)-</p>\\s*<p\\b[^>]*>([a-z\x201c\x80\x9c])"
70\Replace=\\1\\2
71\Name=Cleanup/Joins/unsplit w hyphen
71\Find="([[:alpha:],]\xe2\x80\x9d*)-</p>\\s*<p\\b[^>]*>([a-z\xe2\x80\x9c])"
71\Replace=\\1-\\2
72\Name=Cleanup/Joins/LC join P
72\Find="</p>\\s+<p class=\"calibre\\d+\">((<i class=\"calibre\\d+\">)*[a-z])"
72\Replace=" \\1"
73\Name=Cleanup/Joins/Join P rem Heyphen
73\Find=([[:alpha:]])-</p>\\s*<p\\b[^>]*>
73\Replace=\\1
74\Name=Cleanup/Joins/Honorifics
74\Find="(Mr|Mrs|Ms|Dr|Prof)\\.</p>\\s+<p class=\"calibre\\d+\">([A-Z])"
74\Replace=\\1. \\2
75\Name=Cleanup/Joins/de BR punct
75\Find="([[:punct:]])<br class=\"calibre4\" />\\s+(\"*[A-Za-z\xe2\x80\x9c])"
75\Replace="\\1</p><p class=\"calibre3\">\\2"
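
If you want to try a pattern outside Sigil before running it on a book, here is a rough Python sketch of searches 74 (Honorifics) and 73 (join and drop the hyphen). The .ini escaping is undone by hand (\\ becomes \ and \" becomes "). The function names and sample HTML are made up for illustration, and since Python's re module has no [[:alpha:]] class, [A-Za-z] stands in for it, so accented letters won't match here the way they would in Sigil.

```python
import re

# Search 74 (Honorifics) with the .ini escaping removed.
HONORIFICS = re.compile(
    r'(Mr|Mrs|Ms|Dr|Prof)\.</p>\s+<p class="calibre\d+">([A-Z])'
)

# Search 73 (join paragraphs and remove the hyphen).
# Assumption: [A-Za-z] replaces [[:alpha:]], which Python's re lacks.
DEHYPHEN = re.compile(r'([A-Za-z])-</p>\s*<p\b[^>]*>')


def join_honorifics(html: str) -> str:
    """Rejoin a paragraph that was split right after Mr., Dr., etc."""
    return HONORIFICS.sub(r'\1. \2', html)


def join_dehyphen(html: str) -> str:
    """Rejoin a word hyphenated across a paragraph break, dropping the hyphen."""
    return DEHYPHEN.sub(r'\1', html)


# Hypothetical sample input, not taken from a real book.
sample = (
    '<p class="calibre1">She asked Dr.</p>\n'
    '<p class="calibre1">Watson to wait.</p>'
)
print(join_honorifics(sample))
# <p class="calibre1">She asked Dr. Watson to wait.</p>
```

The honorifics pattern will also join a paragraph that genuinely ends in "Dr." when the next one starts with a capital, which is why it is worth stepping through matches rather than using Replace All.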