Use a custom column or a tag to clearly distinguish between the books that have already been 'cleaned' and the rest.
Proceed step by step, looking for books that have, e.g., no publication date, no author, or multiple formats. You can search for them, use the tag browser, or sort the table view by various criteria (you can install the View Manager plugin to switch between your regular view and your 'edit' view).
Be patient... really patient.
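To make the step-by-step idea concrete, here is a rough Python sketch of the same logic. It does not use calibre's own API: the `Book` class, the `cleaned` flag (standing in for the custom column or tag), and the function names are all made up for illustration, and treating an author of "Unknown" as missing is an assumption on my part.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Book:
    """Hypothetical stand-in for one library entry (not calibre's data model)."""
    title: str
    authors: List[str] = field(default_factory=list)
    pubdate: Optional[str] = None
    formats: List[str] = field(default_factory=list)
    cleaned: bool = False  # plays the role of the 'cleaned' custom column or tag


def problems(book: Book) -> List[str]:
    """Return the list of metadata issues found for a single book."""
    issues = []
    # Assumption: a missing author shows up as an empty list or as "Unknown".
    if not book.authors or book.authors == ["Unknown"]:
        issues.append("no author")
    if book.pubdate is None:
        issues.append("no publication date")
    if len(book.formats) > 1:
        issues.append("multiple formats")
    return issues


def worklist(books: List[Book]) -> Dict[str, List[str]]:
    """Map title -> issues for every book that is not yet marked as cleaned."""
    result = {}
    for book in books:
        if book.cleaned:
            continue  # already handled, skip it
        issues = problems(book)
        if issues:
            result[book.title] = issues
    return result
```

The point is only that you work on one kind of problem at a time and skip whatever is already marked as cleaned, so you never go over the same books twice.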
Personally, I wouldn't remove all the formats but epub, as it doesn't save that much space - but that's just me.