Old 03-02-2011, 03:24 AM   #36
ldolse
That did the trick for the JSON query. On to the next, and hopefully final, major stumbling block.


Edit: I think the best way to fix the problem below may be to delete the last cookie in the cookiejar, br._ua_handlers['_cookies'].cookiejar. Printed as a string, the jar looks like this:
Code:
<cookielib.CookieJar[<Cookie ASP.NET_SessionId=jfvfj1554sbio555e3nrfwjd for search.overdrive.com/>, <Cookie expires=1298969952 for search.overdrive.com/>]>
I'm not sure how to actually do that, though, since the cookiejar is an instance and not a list object. I tried cookielib's clear() method, but it doesn't seem to work, probably because this cookie is malformed in the first place and doesn't have the structure mechanize/cookielib expects.
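
One possible way to delete a single cookie: a CookieJar can be iterated, and clear() accepts domain, path and name arguments to remove one specific entry. This is only a sketch. It assumes the bad cookie can be recognised by a non-string value (the printout above suggests it was stored under the name "expires" with a numeric value, but that is a guess), and it uses hand-built cookies rather than the real OverDrive responses. make_cookie and drop_malformed are hypothetical helper names.
Code:
```python
try:
    import cookielib                      # Python 2, as in the traceback
except ImportError:
    import http.cookiejar as cookielib    # the same module under its Python 3 name

STRING_TYPES = (str, type(u""))           # str and unicode on Python 2, str on Python 3


def make_cookie(name, value, domain="search.overdrive.com"):
    """Hypothetical helper: build a session cookie by hand for this demo."""
    return cookielib.Cookie(
        version=0, name=name, value=value,
        port=None, port_specified=False,
        domain=domain, domain_specified=False, domain_initial_dot=False,
        path="/", path_specified=True,
        secure=False, expires=None, discard=True,
        comment=None, comment_url=None, rest={})


def drop_malformed(jar):
    """Remove cookies whose value is set but is not a string; return them."""
    removed = []
    # Iterate over a snapshot, because clear() mutates the jar.
    for cookie in list(jar):
        if cookie.value is not None and not isinstance(cookie.value, STRING_TYPES):
            jar.clear(cookie.domain, cookie.path, cookie.name)
            removed.append(cookie)
    return removed


jar = cookielib.CookieJar()
jar.set_cookie(make_cookie("ASP.NET_SessionId", "jfvfj1554sbio555e3nrfwjd"))
jar.set_cookie(make_cookie("expires", 1298969952))   # assumed shape of the bad cookie
removed = drop_malformed(jar)
```
With the browser object from this post, the jar to pass in would presumably be br._ua_handlers['_cookies'].cookiejar, cleaned right after loading the page that sets the bad header and before the next request.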

The other option would be to create a separate copy of the cookiejar and use a separate browser object to load the bad page. But I'm struggling to figure out how to duplicate a cookiejar object as well. I've got the separate page loader working with urllib2.
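
For the copy approach, a sketch that builds a second jar by iterating over the first and copying each cookie into it. It assumes a shallow copy of each Cookie is good enough, and that the copy can live with a default cookie policy, since the original jar's policy is not carried over. Copying the jar wholesale with copy.deepcopy may not work because the jar holds a thread lock internally. clone_cookiejar is a hypothetical name.
Code:
```python
import copy

try:
    import cookielib                      # Python 2
except ImportError:
    import http.cookiejar as cookielib    # Python 3 name of the same module


def clone_cookiejar(jar):
    """Return a new CookieJar holding shallow copies of the cookies in jar."""
    clone = cookielib.CookieJar()
    for cookie in jar:
        clone.set_cookie(copy.copy(cookie))
    return clone


original = cookielib.CookieJar()
original.set_cookie(cookielib.Cookie(
    version=0, name="ASP.NET_SessionId", value="jfvfj1554sbio555e3nrfwjd",
    port=None, port_specified=False,
    domain="search.overdrive.com", domain_specified=False,
    domain_initial_dot=False,
    path="/", path_specified=True,
    secure=False, expires=None, discard=True,
    comment=None, comment_url=None, rest={}))

scratch = clone_cookiejar(original)
scratch.clear()   # emptying the clone leaves the original jar untouched
```
The clone could then be handed to the separate browser or opener that loads the bad page, so whatever that page does to its cookies never reaches the main jar.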


== original description ==

Weird problem, not sure how to fix it. Basically one of the pages I have to retrieve sets a cookie with no name:
Code:
Set-Cookie: ; expires=Tue, 01-Mar-2011 08:15:21 GMT; path=/
And this causes mechanize to barf when it moves on to the next request:
Code:
Traceback (most recent call last):
  File "/Users/ldolse/calibredev/heuristics/src/calibre/ebooks/metadata/overdrive.py", line 112, in to_ovrdrv_data
    ovrdrv_data = find_ovrdrv_data(br, title, author, isbn)
  File "/Users/ldolse/calibredev/heuristics/src/calibre/ebooks/metadata/overdrive.py", line 95, in find_ovrdrv_data
    return overdrive_search(br, q, title, author)
  File "/Users/ldolse/calibredev/heuristics/src/calibre/ebooks/metadata/overdrive.py", line 53, in overdrive_search
    raw = br.open_novisit(xreq).read()
  File "site-packages/mechanize/_mechanize.py", line 199, in open_novisit
  File "site-packages/mechanize/_mechanize.py", line 230, in _mech_open
  File "site-packages/mechanize/_opener.py", line 188, in open
  File "site-packages/mechanize/_urllib2_fork.py", line 1188, in http_request
  File "lib/python2.7/cookielib.py", line 1331, in add_cookie_header
  File "lib/python2.7/cookielib.py", line 1290, in _cookie_attrs
TypeError: expected string or buffer

At least that's my assumption: this is the only page that sets a cookie header like that, and it sets it for a regular browser as well, so it's not related to the plugin. Is there any way to get mechanize to ignore the garbage Set-Cookie header?
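
One possible way to ignore such a cookie at the source is a cookie policy that refuses it before it is ever stored. The sketch below subclasses cookielib's DefaultCookiePolicy and overrides set_ok(). It assumes the malformed cookie ends up with a non-string value, which has not been confirmed here, and it has not been tested against calibre's browser object. SkipMalformedPolicy is a hypothetical name.
Code:
```python
try:
    import cookielib                      # Python 2
    from urllib2 import Request
except ImportError:
    import http.cookiejar as cookielib    # Python 3 names of the same modules
    from urllib.request import Request

STRING_TYPES = (str, type(u""))


class SkipMalformedPolicy(cookielib.DefaultCookiePolicy):
    """Hypothetical policy: refuse cookies whose value is not a string."""

    def set_ok(self, cookie, request):
        if cookie.value is not None and not isinstance(cookie.value, STRING_TYPES):
            return False
        # Otherwise fall back to the standard checks.
        return cookielib.DefaultCookiePolicy.set_ok(self, cookie, request)


def make_cookie(name, value):
    """Hypothetical helper: hand-built cookie standing in for a real response."""
    return cookielib.Cookie(
        version=0, name=name, value=value,
        port=None, port_specified=False,
        domain="search.overdrive.com", domain_specified=False,
        domain_initial_dot=False,
        path="/", path_specified=False,
        secure=False, expires=None, discard=True,
        comment=None, comment_url=None, rest={})


policy = SkipMalformedPolicy()
jar = cookielib.CookieJar(policy=policy)   # jar that applies the policy on extraction
request = Request("http://search.overdrive.com/")

good_ok = policy.set_ok(make_cookie("ASP.NET_SessionId", "jfvfj1554sbio555e3nrfwjd"), request)
bad_ok = policy.set_ok(make_cookie("expires", 1298969952), request)
```
A jar built this way could be given to the browser with mechanize's set_cookiejar(), assuming the browser in use does not replace the jar or its policy later on.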

Last edited by ldolse; 03-02-2011 at 06:28 AM.