08-06-2019, 10:20 AM | #16 |
Big Poppa
Posts: 110
Karma: 10
Join Date: Jul 2010
Device: Nook
|
Tried that now
Code:
Python function terminated unexpectedly
HTTP Error 301: Moved Permanently (Error Code: 1)
Traceback (most recent call last):
  File "site.py", line 101, in main
  File "site.py", line 78, in run_entry_point
  File "site-packages\calibre\utils\ipc\worker.py", line 200, in main
  File "site-packages\calibre\gui2\convert\gui_conversion.py", line 35, in gui_convert_recipe
  File "site-packages\calibre\gui2\convert\gui_conversion.py", line 27, in gui_convert
  File "site-packages\calibre\ebooks\conversion\plumber.py", line 1110, in run
  File "site-packages\calibre\customize\conversion.py", line 246, in __call__
  File "site-packages\calibre\ebooks\conversion\plugins\recipe_input.py", line 138, in convert
  File "site-packages\calibre\web\feeds\news.py", line 898, in __init__
  File "<string>", line 55, in get_browser
  File "site-packages\mechanize\_mechanize.py", line 254, in open
  File "site-packages\mechanize\_mechanize.py", line 310, in _mech_open
mechanize._response.httperror_seek_wrapper: HTTP Error 301: Moved Permanently
|
08-06-2019, 10:41 AM | #18 |
creator of calibre
Posts: 43,860
Karma: 22666666
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
Catch the error; it should contain the information about the redirect.
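For illustration, a minimal sketch of catching the error and reading the redirect target from it. It assumes the exception mechanize raises behaves like the standard library's HTTPError (a code attribute plus the response headers). The names redirect_target and open_logging_redirect are hypothetical helpers, open_func stands in for br.open, and the code is written for Python 3.

```python
from urllib.error import HTTPError


def redirect_target(err):
    """Return the Location header of a 3xx HTTP error, or None."""
    if 300 <= err.code < 400:
        return err.headers.get('Location')
    return None


def open_logging_redirect(open_func, url, log=print):
    """Open url; on a 3xx error, log the target and retry there once.

    open_func is a stand-in for br.open (an assumption of this sketch),
    passed in so the example does not depend on mechanize.
    """
    try:
        return open_func(url)
    except HTTPError as err:
        target = redirect_target(err)
        if target is None:
            # Not a redirect: let the original error propagate.
            raise
        log('Redirected: %s -> %s' % (url, target))
        return open_func(target)
```

Logging the target first makes it easy to see where the site now wants the login to start, before hard-coding the new URL in the recipe.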
|
08-06-2019, 10:49 AM | #19 |
Big Poppa
Posts: 110
Karma: 10
Join Date: Jul 2010
Device: Nook
|
I borrowed the login code from the WSJ recipe and, with that hacked down, the recipe seems to work again. Submitting a PR now.
|
08-30-2019, 08:25 AM | #20 |
Big Poppa
Posts: 110
Karma: 10
Join Date: Jul 2010
Device: Nook
|
Kovid, it looks like they made further changes: https://account.thetimes.co.uk/login no longer works, because it now needs a state variable that comes from the redirect at login.thetimes.co.uk. This looks similar to the WSJ SSO format, but not quite. Do you recognize this style from other recipes, by any chance?
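To show what is meant by a state variable coming from the redirect, here is a small sketch that pulls query parameters out of the URL a redirect lands on. The URL and the parameter names are made up for illustration, query_params is a hypothetical helper, and the code uses Python 3's urllib.parse.

```python
from urllib.parse import parse_qs, urlparse


def query_params(url):
    """Return the query-string parameters of url as a flat dict."""
    query = parse_qs(urlparse(url).query)
    # parse_qs maps each name to a list of values; keep the first one.
    return {name: values[0] for name, values in query.items()}


# Hypothetical redirect target, not the real login URL.
sso_url = 'https://login.example.com/login?state=abc123&client=xyz'
params = query_params(sso_url)
print(params.get('state'))
```

The login form would then have to send that state value back, which is why opening the account login page directly no longer works.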
|
08-30-2019, 09:38 AM | #21 |
creator of calibre
Posts: 43,860
Karma: 22666666
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
I recently had to update the WSJ recipe; maybe you can try those changes.
|
08-31-2019, 12:42 PM | #22 |
Big Poppa
Posts: 110
Karma: 10
Join Date: Jul 2010
Device: Nook
|
The Times doesn't seem to use quite the same JavaScript. It's weird. Does it look familiar at all to you? It's an SSO-type login, but the variables it uses are not the same at all, for example.
|
08-31-2019, 01:11 PM | #23 |
Big Poppa
Posts: 110
Karma: 10
Join Date: Jul 2010
Device: Nook
|
I've tried updating all the variables to match, but it hangs on the initialization:
https://login.thetimes.co.uk/
send: 'GET / HTTP/1.1\r\nAccept-Encoding: identity\r\nHost: login.thetimes.co.uk\r\nConnection: close\r\nUser-Agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/55.0.2883.75 Safari/537.36\r\nAccept: */*\r\n\r\n'
|
08-31-2019, 01:14 PM | #24 |
Big Poppa
Posts: 110
Karma: 10
Join Date: Jul 2010
Device: Nook
|
It's odd: the website functions fine, albeit with a bunch of forwards, when you inspect it in Chrome.
|
08-31-2019, 02:11 PM | #25 |
Big Poppa
Posts: 110
Karma: 10
Join Date: Jul 2010
Device: Nook
|
Code:
if needs_subscription:
    def get_browser(self, *a, **kw):
        # To understand the login logic read app-min.js from
        # https://sso.accounts.dowjones.com/login
        itp = quote(self.INDEX, safe='')
        start_url = 'https://login.thetimes.co.uk/'
        kw['user_agent'] = random_user_agent(allow_ie=False)
        br = BasicNewsRecipe.get_browser(self, *a, **kw)
        br.set_debug_http(True)
        self.log('Starting login process...')
        self.log(start_url)
        res = br.open(start_url)
        sso_url = res.geturl()
        self.log('Get sso URL')
        self.log(sso_url)
        query = urlparse.parse_qs(urlparse.urlparse(sso_url).query)
|
08-31-2019, 07:59 PM | #26 |
creator of calibre
Posts: 43,860
Karma: 22666666
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
No, can't say I've seen it before. If the website is working with a bunch of forwards, I suggest you use the last URL in the chain to start the login process instead.
|
09-01-2019, 08:09 AM | #27 |
Big Poppa
Posts: 110
Karma: 10
Join Date: Jul 2010
Device: Nook
|
Looking at it, it seems to be using the same Auth0 as WSJ, but the initial URL forward is just not forwarding as expected. Forwarding isn't disabled, but there is a sequence of forwards that normally occurs which is too complex for me to comprehend, though seemingly similar to the WSJ. Why wouldn't the forwarding work, though? Isn't BeautifulSoup automatically forwarding?
|
09-01-2019, 08:16 AM | #28 |
creator of calibre
Posts: 43,860
Karma: 22666666
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
The relevant software is mechanize, not BeautifulSoup. In a browser, forwarding can be done in any number of ways, principally using JavaScript, in which case it would not work in mechanize, since it does not understand JavaScript.
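As a rough way to check which kind of forwarding a page uses, the sketch below scans the HTML that comes back for client-side redirects. It is a heuristic only: the regular expressions and the name client_side_redirect_kind are assumptions made for illustration, and real login pages may hide the redirect inside bundled or minified scripts that this will not catch.

```python
import re

# <meta http-equiv="refresh" ...>: a redirect declared in the HTML itself,
# not an HTTP 3xx response.
META_REFRESH = re.compile(
    r'<meta[^>]+http-equiv=["\']?refresh["\']?[^>]*>', re.IGNORECASE)

# Common JavaScript idioms: assigning to location / location.href,
# or calling location.replace() / location.assign().
JS_REDIRECT = re.compile(
    r'(?:window\.|document\.)?location(?:\.href)?\s*='
    r'|location\.(?:replace|assign)\s*\(')


def client_side_redirect_kind(html):
    """Return 'meta-refresh', 'javascript', or None for a page body."""
    if META_REFRESH.search(html):
        return 'meta-refresh'
    if JS_REDIRECT.search(html):
        return 'javascript'
    return None
```

If this reports 'javascript' for the page a request stalls on, the next URL has to be worked out by hand from the script, or the login has to be done in a real browser.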
|
09-01-2019, 08:17 AM | #29 |
creator of calibre
Posts: 43,860
Karma: 22666666
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
These JS-heavy logins are a huge pain, but once calibre 4 is released it should be easy to automate them using a headless Qt WebEngine instance: do the login in the browser, extract the cookies, and insert them into mechanize.
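The last step, getting cookies from a browser session into a cookie jar, could look roughly like the sketch below. It uses only Python 3's http.cookiejar; the input format (a list of dicts with name, value, domain and path) and the helper names make_cookie and jar_from_browser_cookies are assumptions for illustration, not what Qt WebEngine or calibre actually provides.

```python
from http.cookiejar import Cookie, CookieJar


def make_cookie(name, value, domain, path='/', secure=True):
    """Build a session cookie for the given domain and path."""
    return Cookie(
        version=0, name=name, value=value,
        port=None, port_specified=False,
        domain=domain, domain_specified=True,
        domain_initial_dot=domain.startswith('.'),
        path=path, path_specified=True,
        secure=secure, expires=None, discard=True,
        comment=None, comment_url=None, rest={},
    )


def jar_from_browser_cookies(cookies):
    """Turn a list of cookie dicts (hypothetical export format) into a jar."""
    jar = CookieJar()
    for c in cookies:
        jar.set_cookie(
            make_cookie(c['name'], c['value'], c['domain'], c.get('path', '/')))
    return jar
```

mechanize browsers expose a set_cookiejar method for attaching a jar; whether a standard-library jar can be passed in directly, rather than mechanize's own CookieJar class, is something to verify before relying on it.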
|
09-01-2019, 02:44 PM | #30 |
Big Poppa
Posts: 110
Karma: 10
Join Date: Jul 2010
Device: Nook
|
Can you point me to a recipe that does that? I just spent an hour trying to do this in Python and pulling my hair out. I think I should have just done this all in Node.js myself; mechanize is my kryptonite.
|