Those functions look good for cleaning out bad characters, but I was also thinking about trying to detect and strip subtitles, since they can break AND searches. If you're amenable, I can try to add that.
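Here is a rough sketch of the kind of subtitle stripping I have in mind. The function name strip_subtitle and the choice of separators (a colon followed by a space, a spaced hyphen, or an opening bracket) are just my assumptions about what a subtitle usually looks like, not anything taken from the existing code:

```python
import re

# Hypothetical heuristic: treat everything from the first ": ", " - ",
# "(" or "[" onward as a subtitle. Hyphens without surrounding spaces
# (e.g. "Spider-Man") are left alone.
_SUBTITLE_RE = re.compile(r"\s*(?::\s|\s-\s|[(\[]).*$")


def strip_subtitle(title):
    """Return the title with a trailing subtitle removed, if one is detected."""
    stripped = _SUBTITLE_RE.sub("", title).strip()
    # If stripping would leave nothing, keep the original title.
    return stripped or title
```

This would turn "Dune: The Graphic Novel" into "Dune". It will also cut titles that legitimately contain a colon, such as "2001: A Space Odyssey", so it probably needs to be optional or used only as a fallback search.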
Separate issue: I was trying to get direct searches to Overdrive working. Their search engine uses JSON, and it requires a specific Content-Type header in order to return JSON results. Mechanize uses a generic hard-coded Content-Type header for HTTP POSTs, and br.addheaders doesn't seem to be able to override it. This is what I tried:
Code:
br.addheaders = [('Content-Type', 'application/json; charset=utf-8')]
Other headers get inserted with that technique, but not the one above. Is there an easy fix for this, or does it require changes in browser.py?
Based on this URL I suspect that a different type of browser handler needs to be defined:
http://wwwsearch.sourceforge.net/mec...-added-headers
I'm not entirely sure how to go about that, though.
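One thing that might sidestep the handler question is setting the header on the request object itself instead of on the browser. As far as I understand it, the default form-encoded Content-Type is only added when the request doesn't already carry one, and entries in addheaders are skipped for any header that is already set, which would explain what I'm seeing. The sketch below uses urllib.request.Request from the standard library so it runs on its own. The helper name build_json_request and the payload shape are made up for illustration, and I haven't tested this against Overdrive:

```python
import json
import urllib.request

JSON_CONTENT_TYPE = "application/json; charset=utf-8"


def build_json_request(url, payload, request_cls=urllib.request.Request):
    """Build a POST request whose Content-Type is set on the request itself.

    request_cls is assumed to accept (url, data=..., headers=...), which
    urllib.request.Request does; mechanize.Request should be usable the
    same way.
    """
    # Serialize the payload to JSON bytes; supplying data makes this a POST.
    body = json.dumps(payload).encode("utf-8")
    headers = {"Content-Type": JSON_CONTENT_TYPE}
    return request_cls(url, data=body, headers=headers)


# Assumed usage with mechanize (search_url and the payload are placeholders):
#   req = build_json_request(search_url, {"query": "dune"}, mechanize.Request)
#   response = br.open(req)
```

If that works, no changes to browser.py would be needed, since the header would travel with each request rather than with the browser.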