A few months ago, I needed to send a referrer in a recipe. Kovid advised that Calibre sends no referrer, but I could monkey with mechanize to force it. Based on his code, I came up with this:
Code:
def get_browser(self):
    br = BasicNewsRecipe.get_browser(self)
    orig_open_novisit = br.open_novisit
    def my_open_no_visit(url, **kwargs):
        req = mechanize.Request(
            url,
            headers = {
                'Referer':'http://www.gocomics.com/',
            })
        return orig_open_novisit(req)
    br.open_novisit = my_open_no_visit
    return br
(BTW, the header really is spelled "Referer".) This code works more than 99.9% of the time, but on big recipes that's not good enough: a single failed fetch aborts the whole recipe, so a recipe that fetches enough pages eventually fails. I've ignored it, since I don't need the large recipe to run very often. It's OK if it only succeeds every other time or so, and it does. As far as I can tell, the failures are inconsistent. Here is the error:
Code:
Python function terminated unexpectedly
(Error Code: 1)
Traceback (most recent call last):
  File "site.py", line 103, in main
  File "site.py", line 85, in run_entry_point
  File "C:\Util\Calibre2\src\src\calibre\utils\ipc\worker.py", line 99, in main
    result = func(*args, **kwargs)
  File "C:\Util\Calibre2\src\src\calibre\gui2\convert\gui_conversion.py", line 24, in gui_convert
    plumber.run()
  File "C:\Util\Calibre2\src\src\calibre\ebooks\conversion\plumber.py", line 815, in run
    accelerators, tdir)
  File "C:\Util\Calibre2\src\src\calibre\customize\conversion.py", line 211, in __call__
    log, accelerators)
  File "C:\Util\Calibre2\src\src\calibre\web\feeds\input.py", line 104, in convert
    ro.download()
  File "C:\Util\Calibre2\src\src\calibre\web\feeds\news.py", line 702, in download
    res = self.build_index()
  File "C:\Util\Calibre2\src\src\calibre\web\feeds\news.py", line 851, in build_index
    feeds = feeds_from_index(self.parse_index(), oldest_article=self.oldest_article,
  File "c:\users\appdata\local\temp\calibre_0.7.4_qzoqyx_recipes\recipe0.py", line 279, in parse_index
    articles = self.make_links(url)
  File "c:\users\appdata\local\temp\calibre_0.7.4_qzoqyx_recipes\recipe0.py", line 291, in make_links
    page_soup = self.index_to_soup(url)
  File "C:\Util\Calibre2\src\src\calibre\web\feeds\news.py", line 474, in index_to_soup
    with closing(open_func(url_or_raw)) as f:
  File "c:\users\appdata\local\temp\calibre_0.7.4_qzoqyx_recipes\recipe0.py", line 52, in my_open_no_visit
    return orig_open_novisit(req)
  File "site-packages\mechanize-0.1.11-py2.6.egg\mechanize\_mechanize.py", line 205, in open_novisit
  File "site-packages\mechanize-0.1.11-py2.6.egg\mechanize\_mechanize.py", line 261, in _mech_open
mechanize._response.httperror_seek_wrapper: HTTP Error 502: Proxy Error
Information on HTTP Error 502: Proxy Error (bad gateway) is sparse. Does anyone have an idea how to solve this or, failing that, how to isolate it so that one error does not fail the entire recipe? Breaking the recipe down into smaller recipes makes the problem show up less often.
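For what it's worth, here is a rough sketch of the kind of isolation I have in mind. It is untested against the real site and is not anything Kovid suggested: retry the fetch a few times when the error looks like a gateway problem, and give up on just that one page if it keeps failing. The helper name open_with_retry, the list of codes and the delay values are all my own invention. It assumes the mechanize error carries the HTTP status in a .code attribute, the way urllib2.HTTPError does.

```python
import time

# Assumption: these gateway-type statuses are temporary and worth retrying.
TRANSIENT_CODES = (502, 503, 504)


def open_with_retry(open_func, url, retries=3, delay=2.0, sleep=time.sleep):
    """Call open_func(url), retrying on transient HTTP errors.

    Returns whatever open_func returns, or None if every attempt failed
    with a transient error. Any other exception propagates unchanged.
    open_with_retry is a hypothetical helper, not part of calibre or mechanize.
    """
    for attempt in range(retries):
        try:
            return open_func(url)
        except Exception as err:
            # Assumed: the error exposes the HTTP status as .code.
            if getattr(err, 'code', None) not in TRANSIENT_CODES:
                raise
            if attempt < retries - 1:
                # Wait a little longer before each new attempt.
                sleep(delay * (attempt + 1))
    return None


# Possible use inside the recipe's own make_links (from the traceback above):
#
#     page_soup = open_with_retry(self.index_to_soup, url)
#     if page_soup is None:
#         return []   # skip this page instead of failing the whole recipe
```

Returning None rather than raising lets the caller decide what skipping a page means. In parse_index, that would probably mean leaving that one feed empty and carrying on with the rest.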