This is what I have now, and it works, but just looks ugly.
It replaces the em, en, and dashes in Hx's, even multiples, and then shrinks multiple spaces to a single space (up to 10)
Is there a way to make it a little more elegant (and maintainable)?
Find:
<([Hh][1-6])>(.*?)</\1>
Code:
def replace(match, number, file_name, metadata, dictionaries, data, functions, *args, **kwargs):
return match.group().replace("-"," ").replace("–"," ").replace("—"," ").replace (" "," ").replace (" "," ").replace (" "," ").replace (" "," ").replace (" "," ").replace (" "," ").replace (" "," ").replace (" "," ").replace (" "," ").replace (" "," ").replace (" "," ")