OK, I've tried Readability and it doesn't do what I need. However, I've done some reading and found "PackageThis" on CodePlex here.
This has source code included, which I sadly don't fully understand. I've done some more reading and managed to get the SOAP web service loading in Visual Studio, but it feels like slow going!
If anyone reads this and knows SOAP programming, I'd love to hear from you.

This access method will allow anyone to download all of either TechNet or MSDN as fully compliant XHTML, which is obviously ideal for ebooks.
Cheers
Dave