Old 07-15-2016, 06:50 AM   #1
leo738
Enthusiast
leo738 began at the beginning.
 
Posts: 39
Karma: 10
Join Date: Jul 2011
Device: Kindle 3
Irish Times - Problems Entering Subscription

Hello all,

I'm looking for help entering email & password details into the following page:

http://www.irishtimes.com/signin

I've been trying to use code from other recipes with subscription models but not having much success. So far I've come up with the following modified recipe:

Code:
__license__  = 'GPL v3'
__copyright__ = "2008, Derry FitzGerald. 2009 Modified by Ray Kinsella and David O'Callaghan, 2011 Modified by Phil Burns, 2013 Tom Scholl"
'''
irishtimes.com
'''
import urlparse, re

from calibre.web.feeds.news import BasicNewsRecipe
from calibre.ptempfile import PersistentTemporaryFile


class IrishTimes(BasicNewsRecipe):
    title          = u'The Irish Times'
    __author__    = "Derry FitzGerald, Ray Kinsella, David O'Callaghan and Phil Burns, Tom Scholl"
    description = 'Daily news from The Irish Times'
    needs_subscription = True

    def get_browser(self):
        br = BasicNewsRecipe.get_browser(self)
        if self.username is not None and self.password is not None:
            br.open('http://www.irishtimes.com/signin')
            br.form = br.forms().next()
            br['email'] = self.username
            br['password'] = self.password
            raw = br.submit().read()
            if 'Please try again' in raw:
                raise Exception('Your username and password are incorrect')
        return br

    language = 'en_IE'

    masthead_url = 'http://www.irishtimes.com/assets/images/generic/website/logo_theirishtimes.png'

    encoding = 'utf-8'
    oldest_article = 1.0
    max_articles_per_feed = 100
    remove_empty_feeds = True
    no_stylesheets = True
    temp_files = []
    articles_are_obfuscated = True

    feeds          = [
                      ('News', 'http://www.irishtimes.com/cmlink/the-irish-times-news-1.1319192'),
                      ('World', 'http://www.irishtimes.com/cmlink/irishtimesworldfeed-1.1321046'),
                      ('Politics', 'http://www.irishtimes.com/cmlink/irish-times-politics-rss-1.1315953'),
                      ('Business', 'http://www.irishtimes.com/cmlink/the-irish-times-business-1.1319195'),
                      ('Culture', 'http://www.irishtimes.com/cmlink/the-irish-times-culture-1.1319213'),
# Not interested in sport so commented out..                     
#		  ('Sport', 'http://www.irishtimes.com/cmlink/the-irish-times-sport-1.1319194'),
                      ('Debate', 'http://www.irishtimes.com/cmlink/debate-1.1319211'),
                      ('Life & Style', 'http://www.irishtimes.com/cmlink/the-irish-times-life-style-1.1319214'),
                    ]


    def get_obfuscated_article(self, url):
        # Insert a pic from the original url, but use content from the print url
        pic = None
        pics = self.index_to_soup(url)
        div = pics.find('div', {'class' : re.compile('image-carousel')})
        if div:
            pic = div.img
            if pic:
                try:
                    pic['src'] = urlparse.urljoin(url, pic['src'])
                    pic.extract()
                except:
                    pic = None

        content = self.index_to_soup(url + '?mode=print&ot=example.AjaxPageLayout.ot')
        if pic:
            content.p.insert(0, pic)

        self.temp_files.append(PersistentTemporaryFile('_fa.html'))
        self.temp_files[-1].write(content.prettify())
        self.temp_files[-1].close()
        return self.temp_files[-1].name
I've been deliberately entering the wrong password to verify that the login is being attempted, but without success. Perhaps the form or field names are incorrect.

Can anyone point me in the right direction?

Thanks,

Leo
Old 07-16-2016, 06:17 AM   #2
leo738
Some progress: I'm now getting a response from the website. However, it says that the username or password is invalid (even if the correct ones are used), probably because the fields aren't being filled in correctly.

Perhaps I'm not selecting the correct form (I think it is 'itPaywall').

Code:
__license__  = 'GPL v3'
__copyright__ = "2008, Derry FitzGerald. 2009 Modified by Ray Kinsella and David O'Callaghan, 2011 Modified by Phil Burns, 2013 Tom Scholl"
'''
irishtimes.com
'''
import urlparse, re

from calibre.web.feeds.news import BasicNewsRecipe
from calibre.ptempfile import PersistentTemporaryFile


class IrishTimes(BasicNewsRecipe):
    title          = u'The Irish Times'
    __author__    = "Derry FitzGerald, Ray Kinsella, David O'Callaghan and Phil Burns, Tom Scholl"
    description = 'Daily news from The Irish Times'
    needs_subscription = True

    language = 'en_IE'

    masthead_url = 'http://www.irishtimes.com/assets/images/generic/website/logo_theirishtimes.png'

    encoding = 'utf-8'
    oldest_article = 1.0
    max_articles_per_feed = 100
    remove_empty_feeds = True
    no_stylesheets = True
    temp_files = []
    articles_are_obfuscated = True

    feeds          = [
                      ('News', 'http://www.irishtimes.com/cmlink/the-irish-times-news-1.1319192'),
                      ('World', 'http://www.irishtimes.com/cmlink/irishtimesworldfeed-1.1321046'),
                      ('Politics', 'http://www.irishtimes.com/cmlink/irish-times-politics-rss-1.1315953'),
                      ('Business', 'http://www.irishtimes.com/cmlink/the-irish-times-business-1.1319195'),
                      ('Culture', 'http://www.irishtimes.com/cmlink/the-irish-times-culture-1.1319213'),
# Not interested in sport so commented out..                     
#		  ('Sport', 'http://www.irishtimes.com/cmlink/the-irish-times-sport-1.1319194'),
                      ('Debate', 'http://www.irishtimes.com/cmlink/debate-1.1319211'),
                      ('Life & Style', 'http://www.irishtimes.com/cmlink/the-irish-times-life-style-1.1319214'),
                    ]

    def get_browser(self):
        br = BasicNewsRecipe.get_browser(self)
        if self.username is not None and self.password is not None:
            br.open('http://www.irishtimes.com/signin')
            # is the correct form being selected below????
            br.form = br.forms().next()   
            br['email']   = self.username
            br['password'] = self.password
            raw = br.submit().read()
            # print raw
            if 'Invalid email or password' in raw:
                raise Exception('Your username and password are incorrect')
        return br


    def get_obfuscated_article(self, url):
        # Insert a pic from the original url, but use content from the print url
        pic = None
        pics = self.index_to_soup(url)
        div = pics.find('div', {'class' : re.compile('image-carousel')})
        if div:
            pic = div.img
            if pic:
                try:
                    pic['src'] = urlparse.urljoin(url, pic['src'])
                    pic.extract()
                except:
                    pic = None

        content = self.index_to_soup(url + '?mode=print&ot=example.AjaxPageLayout.ot')
        if pic:
            content.p.insert(0, pic)

        self.temp_files.append(PersistentTemporaryFile('_fa.html'))
        self.temp_files[-1].write(content.prettify())
        self.temp_files[-1].close()
        return self.temp_files[-1].name
Old 07-16-2016, 07:45 AM   #3
Aimylios
Member
Aimylios began at the beginning.
 
Posts: 16
Karma: 10
Join Date: Apr 2016
Device: Tolino Vision 3HD
Yes, I think that should select the right form (the first one). Although you could also try this command if you are in doubt:
Code:
            br.select_form(nr=0)
I just had a brief look at the source code of the page and didn't try it out, but I think the string "Invalid email or password" is always included (even if it is not shown). You should remove that to see what happens or find another way to check the login status.
Code:
            if 'Invalid email or password' in raw:
                raise Exception('Your username and password are incorrect')
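One way to check the login status, as a minimal sketch: rather than searching for an error string that is always embedded in the page, look for text that should only appear once you are signed in. The marker 'Sign Out' and the sample page fragments below are assumptions for illustration, not taken from the Irish Times site. The code is plain Python 3 and does not depend on calibre.

```python
# Hypothetical page fragments: the error text is present in both,
# merely hidden until the site's JavaScript decides to show it.
HIDDEN_ERROR = '<div class="error" style="display:none">Invalid email or password</div>'
SIGNIN_PAGE = '<form id="signin">' + HIDDEN_ERROR + '</form>'
SIGNED_IN_PAGE = '<a href="/signout">Sign Out</a>' + HIDDEN_ERROR


def looks_signed_in(html, marker='Sign Out'):
    """Return True if a marker that only signed-in users should see is present.

    The default marker is an assumption; pick a string from the real page.
    """
    return marker in html


def error_string_present(html, error='Invalid email or password'):
    """The unreliable check: this is True for both sample pages."""
    return error in html


if __name__ == '__main__':
    for name, page in (('sign-in page', SIGNIN_PAGE),
                       ('signed-in page', SIGNED_IN_PAGE)):
        print(name, looks_signed_in(page), error_string_present(page))
```

In a recipe, the same idea would mean fetching a page after the login request and raising an exception only if the chosen marker is missing.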
Old 07-16-2016, 03:35 PM   #4
leo738
Hello,

Many thanks for your reply!

You're correct that 'Invalid email or password' is present in the page before the submit button is pushed, so I edited that check out. Is there a simple way to verify the login properly?

As suggested, I added the snippet of code for the form and verified that the correct form was selected (by printing it to screen). The output was:

Code:
<POST https://www.irishtimes.com/signin# application/x-www-form-urlencoded
  <TextControl(email=)>
  <PasswordControl(password=)>
  <SubmitButtonControl(<None>=) (readonly)>>
However, it's still failing. Almost all the articles after the first five or so are full of prompts to sign in.

Do I need to handle reading the username and password from the command-line arguments? I haven't added anything in that regard. Is there anything else you can think of?

Thanks again for looking,

Leo
Old 07-18-2016, 03:09 PM   #5
leo738
It's not an issue with passing the username or password from the command line.
Old 07-18-2016, 03:35 PM   #6
leo738
After running with the -vv option, it looks like it may be a recipe issue rather than a login problem.
I'm seeing a lot of occurrences of:

Code:
13% Article download failed: UK’s Trident nuclear programme splits Labour three ways
Failed to download article: Nice attack: ‘No words describe hell of bringing one’s child to the cemetery’ from http://www.irishtimes.com/news/world...tery-1.2725455
Traceback (most recent call last):
  File "site-packages/calibre/utils/threadpool.py", line 95, in run
  File "site-packages/calibre/web/feeds/news.py", line 1125, in fetch_obfuscated_article
  File "<string>", line 89, in get_obfuscated_article
ValueError: I/O operation on closed file
&


Code:
Could not fetch image  file:///polopoly_fs/1.2723622.1468599710!/image/image.jpg_gen/derivatives/landscape_140/image.jpg
Traceback (most recent call last):
  File "site-packages/calibre/web/fetch/simple.py", line 377, in process_images
  File "site-packages/calibre/web/fetch/simple.py", line 229, in fetch_url
IOError: [Errno 2] No such file or directory: u'/polopoly_fs/1.2723622.1468599710!/image/image.jpg_gen/derivatives/landscape_140/image.jpg'

Fetching file:///assets/images/icons/apps/app-store.png
&

Code:
20% Article download failed: Half of Irish consumers using contactless payments
Failed to download article: EU re-introduces milk supply controls barely a year after quotas from http://www.irishtimes.com/business/a...otas-1.2726088
Traceback (most recent call last):
  File "site-packages/calibre/utils/threadpool.py", line 95, in run
  File "site-packages/calibre/web/feeds/news.py", line 1125, in fetch_obfuscated_article
  File "<string>", line 89, in get_obfuscated_article
ValueError: I/O operation on closed file
Any ideas?
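A possible cause of the 'I/O operation on closed file' errors, offered as a guess rather than a confirmed diagnosis: temp_files is a single list shared by the whole recipe, so if articles are fetched in several threads, temp_files[-1] in one thread can refer to a file that another thread has just appended and closed. Keeping the file in a local variable avoids that. The sketch below shows the pattern in plain Python 3, using the standard tempfile module in place of calibre's PersistentTemporaryFile; save_article and save_many are hypothetical helper names.

```python
import tempfile
import threading


def save_article(html):
    """Write html to its own temporary file and return the file name.

    The handle lives in a local variable, so no other thread can
    close it or write to it by accident.
    """
    with tempfile.NamedTemporaryFile(
            'w', suffix='_fa.html', delete=False, encoding='utf-8') as f:
        f.write(html)
        return f.name


def save_many(pages):
    """Save every page from its own thread; return {index: file name}."""
    names = {}

    def worker(i, html):
        names[i] = save_article(html)

    threads = [threading.Thread(target=worker, args=(i, html))
               for i, html in enumerate(pages)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return names
```

In the recipe, the equivalent change would be to assign the PersistentTemporaryFile to a local variable, write to it, close it and return its name, appending it to self.temp_files only to keep a reference.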
Old 12-03-2016, 04:14 PM   #7
leo738
Just getting back to this after a break. It looks like there is some issue around the submit button.

I've read up a little on the br.submit() command. Could it be that some JavaScript needs to be executed to verify the login details after the button press, which mechanize is unable to handle? Should I try to use a POST instead?

Any help appreciated.

Leo
Old 12-03-2016, 10:23 PM   #8
kovidgoyal
creator of calibre
kovidgoyal ought to be getting tired of karma fortunes by now.
Posts: 43,856
Karma: 22666666
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
Yes, generally when a plain submit() does not work, it means there is JavaScript behind the scenes. What you do then is use the developer tools in a regular browser to see the requests generated by the login page when you click submit, and clone them in the recipe. An example of doing that is in the WSJ recipe.
Old 12-05-2016, 04:37 PM   #9
leo738
Many thanks,

I managed to capture the request headers sent by the JavaScript:

Code:
Host: www.irishtimes.com
User-Agent: Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:50.0) Gecko/20100101 Firefox/50.0
Accept: */*
Accept-Language: en-US,en;q=0.5
Accept-Encoding: gzip, deflate, br
Content-Type: application/x-www-form-urlencoded; charset=UTF-8
X-Requested-With: XMLHttpRequest
Referer: https://www.irishtimes.com/signin
Content-Length: 106
Cookie: IT_cookiepopup=1; pw_meter_news=14815732..................8edbe; pw_cache=0....1480968432.IE.0.0...0xd12fffc3543.........6bb793bc2d38; IT_UUID=69164............b0758e
DNT: 1
Connection: keep-alive
Will try & work out the POST for it.
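The captured headers can be turned into a dict for the request with a few lines of Python 3. This is only a sketch: headers_from_capture and the SKIP set are hypothetical, and the choice of which headers to drop (ones the HTTP client or its cookie jar normally sets itself, plus Accept-Encoding so that no compression is advertised that the client may not decode) is an assumption. The cookie value in the sample is shortened.

```python
# A few of the header lines captured from the browser's developer tools.
CAPTURED = """\
Host: www.irishtimes.com
Accept: */*
Accept-Language: en-US,en;q=0.5
Accept-Encoding: gzip, deflate, br
Content-Type: application/x-www-form-urlencoded; charset=UTF-8
X-Requested-With: XMLHttpRequest
Referer: https://www.irishtimes.com/signin
Content-Length: 106
Cookie: IT_cookiepopup=1
Connection: keep-alive
"""

# Headers left to the HTTP client (assumption: it manages these itself).
SKIP = {'host', 'content-length', 'cookie', 'connection', 'accept-encoding'}


def headers_from_capture(text, skip=SKIP):
    """Parse 'Name: value' lines into a dict, dropping names in skip."""
    headers = {}
    for line in text.splitlines():
        # Split on the first colon only, so URLs in values stay intact.
        name, sep, value = line.partition(':')
        if not sep:
            continue  # not a header line
        name = name.strip()
        if name.lower() in skip:
            continue
        headers[name] = value.strip()
    return headers


if __name__ == '__main__':
    for name, value in headers_from_capture(CAPTURED).items():
        print('%s: %s' % (name, value))
```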

Leo

Last edited by leo738; 12-05-2016 at 04:48 PM.
Old 12-07-2016, 07:24 AM   #10
leo738
Managed to get something going:

Code:
__license__  = 'GPL v3'
__copyright__ = "2008, Derry FitzGerald. 2009 Modified by Ray Kinsella and David O'Callaghan, 2011 Modified by Phil Burns, 2013 Tom Scholl"
'''
irishtimes.com
'''
import urlparse, re
import json
from mechanize import Request

from calibre.web.feeds.news import BasicNewsRecipe
from calibre.ptempfile import PersistentTemporaryFile

USER_AGENT = 'Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Firefox/45.0'

class IrishTimes(BasicNewsRecipe):
    title          = u'The Irish Times'
    __author__    = "Derry FitzGerald, Ray Kinsella, David O'Callaghan and Phil Burns, Tom Scholl"
    description = 'Daily news from The Irish Times'
    needs_subscription = True

    language = 'en_IE'

    masthead_url = 'http://www.irishtimes.com/assets/images/generic/website/logo_theirishtimes.png'

    encoding = 'utf-8'
    oldest_article = 1.0
    max_articles_per_feed = 100
    simultaneous_downloads = 5
    remove_empty_feeds = True
    no_stylesheets = True
    temp_files = []
    articles_are_obfuscated = True

    feeds          = [
                      ('News', 'https://www.irishtimes.com/cmlink/the-irish-times-news-1.1319192'),
                      ('World', 'https://www.irishtimes.com/cmlink/irishtimesworldfeed-1.1321046'),
                      ('Politics', 'https://www.irishtimes.com/cmlink/irish-times-politics-rss-1.1315953'),
                      ('Business', 'https://www.irishtimes.com/cmlink/the-irish-times-business-1.1319195'),
                      ('Culture', 'https://www.irishtimes.com/cmlink/the-irish-times-culture-1.1319213'),
# Not interested in sport so commented out..                     
#		  ('Sport', 'https://www.irishtimes.com/cmlink/the-irish-times-sport-1.1319194'),
                      ('Debate', 'https://www.irishtimes.com/cmlink/debate-1.1319211'),
                      ('Life & Style', 'https://www.irishtimes.com/cmlink/the-irish-times-life-style-1.1319214'),
                    ]

    def get_browser(self):
        # To understand the signin logic read signin javascript from submit button from
        # https://www.irishtimes.com/signin

        br = BasicNewsRecipe.get_browser(self, user_agent=USER_AGENT)

        url = 'https://www.irishtimes.com/signin'
        br.set_debug_http(True)
        br.open(url).read()
        rurl = 'https://www.irishtimes.com/auth-rest-api/v1/paywall/login'
        rq = Request(rurl, headers={
            'Accept': '*/*',
            'Accept-Language': 'en-US,en;q=0.5',
            'Content-Type': 'application/x-www-form-urlencoded; charset=UTF-8',
            'Referer': url,
            'X-Requested-With': 'XMLHttpRequest',
        }, data=json.dumps({
            'username': self.username,
            'password': self.password,
            'deviceid': '53c835787f4d2406131985553c1842d0',
            'persistent': 'on',
        }))
        r = br.open(rq)
        if r.code != 200:
            raise ValueError('Failed to login, check username and password')
        data = json.loads(r.read())
        print(data)
        #if data.get('result') != 'success':
        #    raise ValueError(
        #        'Failed to login (XHR failed), check username and password')
        #br.set_cookie('m', data['username'], '.wsj.com')
        #r = br.open(data['url'])
        #self.wsj_itp_page = raw = r.read()
        #if b'>Sign Out<' not in raw:
        #    raise ValueError(
        #        'Failed to login (auth URL failed), check username and password')
        # open('/t/raw.html', 'w').write(raw)
        return br

    def get_obfuscated_article(self, url):
        # Insert a pic from the original url, but use content from the print url
        pic = None
        pics = self.index_to_soup(url)
        div = pics.find('div', {'class' : re.compile('image-carousel')})
        if div:
            pic = div.img
            if pic:
                try:
                    pic['src'] = urlparse.urljoin(url, pic['src'])
                    pic.extract()
                except:
                    pic = None

        content = self.index_to_soup(url + '?mode=print&ot=example.AjaxPageLayout.ot')
        if pic:
            content.p.insert(0, pic)

        self.temp_files.append(PersistentTemporaryFile('_fa.html'))
        self.temp_files[-1].write(content.prettify())
        self.temp_files[-1].close()
        return self.temp_files[-1].name
But the request data contains a 'deviceid' which I can't seem to find much information on.

Any pointers as to what it is?

Thanks,

Leo
Old 12-07-2016, 07:32 AM   #11
kovidgoyal
It's likely an ID that is generated using browser fingerprinting and helps track users. You can probably just use a random string for it, in the same format as the one you got for your browser.
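A minimal sketch of that suggestion: the deviceid captured earlier in the thread is 32 lowercase hexadecimal characters, so a random value of the same shape can be generated with Python 3's secrets module. Whether the site accepts an arbitrary value is an assumption that would need testing; random_device_id is a hypothetical helper name.

```python
import secrets


def random_device_id():
    """Return 32 random lowercase hex characters (16 random bytes)."""
    return secrets.token_hex(16)


if __name__ == '__main__':
    print(random_device_id())
```

Generating the value once and reusing it for later runs may be closer to how a real browser behaves, but that is also a guess.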
Old 12-09-2016, 07:30 AM   #12
leo738
Thanks for the reply, but I'm not getting very far with this.

On hitting the 'sign in' button, the following POST is sent to:

https://www.irishtimes.com/auth-rest-api/v1/paywall/login


Code:
Host: www.irishtimes.com
User-Agent: Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:50.0) Gecko/20100101 Firefox/50.0
Accept: */*
Accept-Language: en-US,en;q=0.5
Accept-Encoding: gzip, deflate, br
Content-Type: application/x-www-form-urlencoded; charset=UTF-8
X-Requested-With: XMLHttpRequest
Referer: https://www.irishtimes.com/signin
Content-Length: 106
Cookie: IT_UUID=1150a714-be0a-11e6-b6a8-005056b0758e; IT_cookiepopup=1
DNT: 1
Connection: keep-alive
with the request body:

Code:
username=ABCDEF%40gmail.com&password=123456&deviceid=53c835787f4d2406131985633c1942d0&persistent=on
So I came up with the following in the recipe (based on WSJ code & only including the login stuff):

Code:
def get_browser(self):
        # To understand the signin logic read signin javascript from submit button from
        # https://www.irishtimes.com/signin

        br = BasicNewsRecipe.get_browser(self, user_agent=USER_AGENT)

        url = 'https://www.irishtimes.com/signin'
        br.set_debug_http(True)
        br.open(url).read()
        rurl = 'https://www.irishtimes.com/auth-rest-api/v1/paywall/login'
        rq = Request(rurl, headers={
            'Accept': '*/*',
            'Accept-Language': 'en-US,en;q=0.5',
            'Content-Type': 'application/x-www-form-urlencoded; charset=UTF-8',
            'Referer': url,
            'X-Requested-With': 'XMLHttpRequest',
        }, data=json.dumps({
            'username': self.username,
            'password': self.password,
            'deviceid': '53c835787f4d2406131985633c1842d0',
            'persistent': 'on',
        }))
        r = br.open(rq)
        if r.code != 200:
            raise ValueError('Failed to login, check username and password')
        data = json.loads(r.read())
        print(data)
        #if data.get('result') != 'success':
        #    raise ValueError(
        #        'Failed to login (XHR failed), check username and password')
        #br.set_cookie('m', data['username'], '.wsj.com')
        #r = br.open(data['url'])
        #self.wsj_itp_page = raw = r.read()
        #if b'>Sign Out<' not in raw:
        #    raise ValueError(
        #        'Failed to login (auth URL failed), check username and password')
        # open('/t/raw.html', 'w').write(raw)
        return br
However the response I get is:

Code:
send: 'GET /signin HTTP/1.1\r\nAccept-Encoding: identity\r\nUser-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Firefox/45.0\r\nHost: www.irishtimes.com\r\nAccept: */*\r\nConnection: close\r\n\r\n'
reply: 'HTTP/1.1 200 OK\r\n'
header: Server: Apache-Coyote/1.1
header: Content-Type: text/html;charset=utf-8
header: Last-Modified: Fri, 09 Dec 2016 12:26:34 GMT
header: X-Cacheable: YES
header: Content-Length: 72338
header: Accept-Ranges: bytes
header: Date: Fri, 09 Dec 2016 12:27:43 GMT
header: Connection: keep-alive
header: X-Pw-Hits: 1
header: Set-Cookie: IT_UUID=e23fb6da-be0a-11e6-bd74-005056a02a54; domain=.irishtimes.com; expires=Thu, 01 Jan 2099 00:00:01 GMT; path=/;
header: Pragma: no-cache
header: Cache-Control: no-cache, no-store, must-revalidate
header: Expires: Thu, 1 Jan 1970 00:00:00 GMT
send: 'POST /auth-rest-api/v1/paywall/login HTTP/1.1\r\nAccept-Encoding: identity\r\nUser-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Firefox/45.0\r\nContent-Length: 126\r\nReferer: https://www.irishtimes.com/signin\r\nConnection: close\r\nX-Requested-With: XMLHttpRequest\r\nAccept: */*\r\nHost: www.irishtimes.com\r\nContent-Type: application/x-www-form-urlencoded; charset=UTF-8\r\nCookie: IT_UUID=e23fb6da-be0a-11e6-bd74-005056a02a54\r\nAccept-Language: en-US,en;q=0.5\r\n\r\n{"password": "123456", "deviceid": "53c835787f4d2406131955633c1842d0", "username": "ABCDEF@gmail.com", "persistent": "on"}'
reply: 'HTTP/1.1 200 OK\r\n'
header: Server: Apache/2.4.10 (Debian)
header: Cache-Control: max-age=300
header: Expires: Fri, 09 Dec 2016 12:32:43 GMT
header: Content-Type: application/json
header: Last-Modified: Fri, 09 Dec 2016 12:27:43 GMT
header: Content-Length: 51
header: Accept-Ranges: bytes
header: Date: Fri, 09 Dec 2016 12:27:43 GMT
header: Connection: keep-alive
header: X-Pw-Hits: 0
<response_seek_wrapper at 0x7f650b587f80 whose wrapped object = <closeable_response at 0x7f650b50c638 whose fp = <socket._fileobject object at 0x7f650e597cd0>>>
{u'error_number': u'1', u'error_message': u'Login failed'}
Any pointers??
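Note that the failed login above still came back with HTTP status 200, so checking r.code alone would not catch it. Here is a hedged sketch of checking the JSON body instead, in Python 3. The 'error_number' and 'error_message' keys are taken from the response shown above; the shape of a successful response is unknown, so any body without 'error_number' is simply treated as success here (an assumption). check_login_response is a hypothetical helper name.

```python
import json


def check_login_response(body):
    """Parse the login response and raise ValueError if it reports an error.

    Assumption: a body without an 'error_number' key means success.
    """
    data = json.loads(body)
    if 'error_number' in data:
        message = data.get('error_message', 'unknown error')
        raise ValueError('Failed to login: %s' % message)
    return data


if __name__ == '__main__':
    failed = '{"error_number": "1", "error_message": "Login failed"}'
    try:
        check_login_response(failed)
    except ValueError as err:
        print(err)
```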

Thanks,

Leo
Old 12-09-2016, 07:35 AM   #13
leo738
Just noticed that the POST from the Irish Times is using:

Content-Type: application/x-www-form-urlencoded; charset=UTF-8

Whereas the WSJ uses:

Content-Type: application/json

So it looks like I shouldn't be using the json stuff!

How do I send the data form-encoded instead?
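The difference between the two encodings can be seen with the standard library alone. This sketch uses Python 3, where urlencode lives in urllib.parse (in Python 2 it is urllib.urlencode); the field values are the placeholder ones from the captured request body earlier in the thread.

```python
import json
from urllib.parse import urlencode

# Placeholder credentials and device id, as shown in the captured request.
fields = {
    'username': 'ABCDEF@gmail.com',
    'password': '123456',
    'deviceid': '53c835787f4d2406131985633c1942d0',
    'persistent': 'on',
}

# What the recipe was sending: a JSON document.
json_body = json.dumps(fields)

# What the browser sends: key=value pairs joined by '&', with special
# characters percent-encoded ('@' becomes '%40').
form_body = urlencode(fields)

if __name__ == '__main__':
    print(json_body)
    print(form_body)
    # The second line printed is:
    # username=ABCDEF%40gmail.com&password=123456&deviceid=53c835787f4d2406131985633c1942d0&persistent=on
```

The form-encoded string is what matches a Content-Type of application/x-www-form-urlencoded.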
Old 12-10-2016, 02:56 PM   #14
leo738
Found an example of a similar login (available on github repo):

calibre/recipes/hbr.recipe

Code:
        # Needs: from urllib import urlencode (Python 2)
        rq = Request(rurl, headers={
            'Accept': '*/*',
            'Accept-Language': 'en-US,en;q=0.5',
            'Content-Type': 'application/x-www-form-urlencoded; charset=UTF-8',
            'Referer': url,
            'X-Requested-With': 'XMLHttpRequest',
        }, data=urlencode({
            'username': self.username,
            'password': self.password,
            'deviceid': deviceid,
            'persistent': 'on',
        }))
Looks like it's working now, will get it tidied up & then submit it.

Regards,

Leo
Old 12-11-2016, 03:43 PM   #15
leo738
I've put together an improved recipe but I'm still having issues. It successfully handles the sign-in; however, when it starts downloading the articles (via RSS) it returns:

Code:
header: X-Pw-Access: anonymous,subscribers.p_1_2901997.news.1..aac.1.1.5
No idea how to proceed from here!

I attach the recipe for anyone interested.

Leo
Attached Files
File Type: zip The Irish Times.recipe.zip (1.7 KB, 148 views)