KevinH, thanks for all the info. No, the files were not created on a Mac. They were built on Linux and tested on Linux and Windows.
Actually, it turns out that they seem to have no EOL characters at all. The "tr" command didn't change anything in the file. I had guessed Mac format only because that's what Notepad++ guessed.
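For anyone who wants to check this directly instead of relying on an editor's guess, here is a minimal Python sketch that counts the line-ending bytes in a file. The function name count_eol and the file name in the usage comment are made up for illustration; this is not what I actually ran.

```python
def count_eol(data: bytes) -> dict:
    """Count Windows (CRLF), old Mac (lone CR), and Unix (lone LF) line endings."""
    crlf = data.count(b"\r\n")
    return {
        "CRLF": crlf,
        # Every CRLF pair contains one CR and one LF, so subtract the pairs
        # to get the CRs and LFs that stand alone.
        "CR": data.count(b"\r") - crlf,
        "LF": data.count(b"\n") - crlf,
    }

# Hypothetical usage, reading the file as raw bytes so nothing is translated:
# with open("chapter1.xhtml", "rb") as f:
#     print(count_eol(f.read()))
```

If all three counts come back as zero, the file is a single long line, which would explain why "tr" had nothing to convert.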
In the end I used Perl to add line breaks between all adjacent tags (e.g. "s/></>\n</g"). That turns out to be overkill, but at least the file is readable and editable.
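For reference, the same substitution can be sketched in Python. This is only an equivalent of the Perl expression above, assuming the whole file fits in memory; the function name break_between_tags is hypothetical.

```python
def break_between_tags(text: str) -> str:
    """Insert a newline wherever one tag ends and the next begins.

    Mirrors the Perl substitution s/></>\n</g: every literal "><"
    becomes ">", newline, "<".
    """
    return text.replace("><", ">\n<")

sample = "<p><b>x</b></p>"
print(break_between_tags(sample))
```

One reason this is overkill: it also splits adjacent inline tags (for example a closing bold tag directly followed by an opening italic tag), and the added newline can show up as extra whitespace when the text is rendered.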
The clean-up tools you linked to work very well indeed.