Many thanks. My problem was not the regexp needed to remove previous artifacts from the text file, but the fact that I couldn't get pyglossary to work anymore, probably due to upgrades and mix-ups in my Python installation.
Markismus has also provided a version, with some specific tweaks for KOReader, btw.