Those are the perils of PDF.
There is no perfect conversion.
You are lucky if there is even a close conversion.
May your regex-fu grow stronger, because that is what you need (using the Editor or Sigil).
Do pass after pass of carefully thought-out searches. (If you do them in the wrong order, you make later pattern matches more difficult.)
First I would remove the standalone page number lines.
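As a rough sketch of that first pass, here it is in Python. The pattern is my assumption about what a "standalone page number line" looks like (1 to 4 digits with nothing else on the line); your conversion may wrap them in paragraph tags or add other debris, so adjust to what you actually see. The function name is just for illustration.

```python
import re

# Assumption: a page number line holds only 1-4 digits plus optional
# spaces/tabs. A real line that is nothing but a number would also match.
PAGE_NUMBER_LINE = re.compile(r"^[ \t]*\d{1,4}[ \t]*$\n?", re.MULTILINE)

def strip_page_number_lines(text: str) -> str:
    """Delete lines that contain nothing but a short number."""
    return PAGE_NUMBER_LINE.sub("", text)

sample = "The end of page one.\n12\nThe start of page two.\nIn 1984 he left.\n"
cleaned = strip_page_number_lines(sample)
print(cleaned)
```

Numbers inside a sentence (like the 1984 above) are left alone because the pattern is anchored to the whole line. The same expression should work in a regex search box with an empty replacement, but test it on a copy first.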
Then I would remove as much junk as can be caught with a single pattern.
BACKUP before each new cleaning pass, in case you get it WRONG (then you can discard the bad edit and go back to the backup).
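If you script your passes instead of doing them in the editor, the backup habit could look something like this. It is only a sketch: run_pass, undo_pass, the .passN.bak naming and book.txt are all made up for the example, and it assumes a single UTF-8 text file.

```python
import re
import shutil
import tempfile
from pathlib import Path

def run_pass(path: Path, pattern: str, replacement: str, pass_no: int) -> Path:
    """Copy the file aside, then apply one regex pass in place."""
    backup = path.with_name(f"{path.name}.pass{pass_no}.bak")
    shutil.copy2(path, backup)  # backup BEFORE touching anything
    text = path.read_text(encoding="utf-8")
    cleaned = re.sub(pattern, replacement, text, flags=re.MULTILINE)
    path.write_text(cleaned, encoding="utf-8")
    return backup

def undo_pass(path: Path, backup: Path) -> None:
    """Discard the bad edit by restoring the backup over the file."""
    shutil.copy2(backup, path)

# Demo on a throwaway file (hypothetical name and contents).
with tempfile.TemporaryDirectory() as tmp:
    book = Path(tmp) / "book.txt"
    book.write_text("Chapter 1\n12\nText\n", encoding="utf-8")
    backup = run_pass(book, r"^\d+\n", "", pass_no=1)
    after_pass = book.read_text(encoding="utf-8")
    undo_pass(book, backup)  # pretend the pass went wrong
    after_undo = book.read_text(encoding="utf-8")
```

One backup per pass means a wrong pattern costs you only that pass, not everything before it.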