Old 03-01-2011, 09:31 PM   #30
ldolse
Wizard
ldolse is an accomplished Snipe hunter.
Posts: 1,337
Karma: 123455
Join Date: Apr 2009
Location: Malaysia
Device: PRS-650, iPhone
Those functions look good for cleaning out bad characters, but I was also thinking about attempting to detect and strip subtitles, since they can screw up AND searches. If you're amenable to it, I can try to add that.
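
To make the subtitle idea concrete, here is a rough sketch of the kind of heuristic I have in mind. The function name strip_subtitle and the separator rules are my own assumptions, not anything that exists in the plugin: it drops a trailing parenthesised or bracketed group, then cuts the title at the first ": " or spaced dash.

```python
import re

# Hypothetical heuristic, not part of the existing plugin code.
# A trailing "(...)" or "[...]" group is treated as series/subtitle noise.
_TRAILING_GROUP = re.compile(r"\s*[\(\[][^\)\]]*[\)\]]\s*$")
# A colon followed by whitespace, or a dash with whitespace on both sides,
# is treated as the boundary between main title and subtitle.
_SEPARATOR = re.compile(r":\s+|\s+[-\u2013\u2014]\s+")


def strip_subtitle(title):
    """Return the main title with a likely subtitle removed."""
    main = _TRAILING_GROUP.sub("", title)
    main = _SEPARATOR.split(main, maxsplit=1)[0].strip()
    # Fall back to the original title if the heuristic removed everything.
    return main if main else title.strip()
```

Hyphenated words such as "Catch-22" are left alone because the dash rule needs whitespace on both sides. It will also shorten titles where the colon is part of the real name, so it is probably best used only as a fallback when the full-title search returns nothing.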

Separate issue: I was trying to get direct searches to Overdrive working. Their search engine uses JSON, and it requires a specific Content-Type header before it will return JSON results. Mechanize uses a generic hard-coded Content-Type header for HTTP POSTs, and br.addheaders doesn't seem to be able to override it. This is what I tried:
Code:
br.addheaders = [('Content-Type', 'application/json; charset=utf-8')]
Other headers get inserted with that technique, but not the one above. Is there an easy fix for this, or does it require changes in browser.py?

Based on this URL, I suspect that a different type of browser handler needs to be defined:
http://wwwsearch.sourceforge.net/mec...-added-headers
I'm not entirely sure how to go about that, though.
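
One possibility that might avoid a custom handler is to set the header on the request object itself rather than on the browser. My understanding (an assumption, not verified against browser.py) is that the default form Content-Type is only filled in when the request does not already carry one, which would also explain why addheaders loses out. The sketch below uses the standard library's urllib.request.Request so it runs on its own; I am assuming mechanize.Request accepts the same url, data and headers arguments. build_json_request and the example URL are made-up names for illustration.

```python
import json
import urllib.request


def build_json_request(url, payload):
    """Build a POST request whose body and Content-Type are both JSON.

    Hypothetical helper: nothing is sent here, the request is only built.
    """
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,  # supplying data makes this a POST
        headers={"Content-Type": "application/json; charset=utf-8"},
    )


# Placeholder URL and payload, not Overdrive's real search endpoint.
req = build_json_request("http://example.invalid/search", {"q": "dune"})
print(req.get_method(), req.get_header("Content-type"))
```

If the assumption about mechanize.Request holds, the equivalent object could be passed to br.open() in place of a URL string, leaving br.addheaders for the headers that already work.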