The server you are contacting is failing, probably because it needs some cookies set or something similar. Add this to your recipe to check:
Code:
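# Add this method inside your recipe class.
def preprocess_raw_html(self, html, url):
    # Dump the HTML exactly as received, before calibre processes it.
    with open('/t/raw.html', 'wb') as f:
        f.write(html.encode('utf-8'))
    return html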
Change the '/t/raw.html' above to some path on your computer, then open the resulting raw.html after the download to see what HTML the server is actually sending.
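If the saved file is large, a quick scan can be easier than reading it by eye. The following is only a rough sketch, not part of calibre: the function name summarize_raw_html and the keyword list are my own assumptions, so adjust them for the site you are fetching. It pulls out the page title and reports which suspicious words appear, which often gives away a cookie wall or login page.

```python
import re

def summarize_raw_html(raw_bytes):
    """Return (title, hints) for a saved page.

    hints lists which of the keywords below occur anywhere in the page.
    """
    text = raw_bytes.decode('utf-8', errors='replace')
    match = re.search(r'<title[^>]*>(.*?)</title>', text,
                      re.IGNORECASE | re.DOTALL)
    title = match.group(1).strip() if match else None
    # Hypothetical keyword list (an assumption, not a calibre feature).
    keywords = ('cookie', 'javascript', 'captcha', 'log in', 'login', 'subscribe')
    lowered = text.lower()
    hints = [k for k in keywords if k in lowered]
    return title, hints
```

To use it, read the dumped file in binary mode (the same path you wrote in the recipe) and pass the bytes to the function. A title like "Access denied" or a hint such as 'cookie' suggests the server is sending a placeholder page rather than the article.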