Hey Kovid, always a pleasure to see you personally helping out, it's very impressive. Thanks!
Your input helped, and I could move two steps forward. I am now stuck at the login page. I had to put back the 'open the login page first' step before moving forward with the download code itself.
I'm now stuck on the login itself; I get the following error message:
Code:
mechanize._mechanize.FormNotFoundError: no form matching name 'login'
Referring to the page's source code, I've tried to use the following tags:
Code:
<div class="entry-content">
<div class="login-form-container">
<form name="loginform" id="loginform" action="https://skepticalinquirer.org/wp-login.php" method="post">
<p class="login-username">
<label for="user_login">Email or Username</label>
<input type="text" name="log" id="user_login" class="input" value="" size="20">
</p>
<p class="login-password">
<label for="user_pass">Password</label>
<input type="password" name="pwd" id="user_pass" class="input" value="" size="20">
</p>
While writing this, I changed the code to use the name of the form itself:
loginform
And now I have the following error message:
Code:
mechanize._form_controls.ControlNotFoundError: no control matching name 'EMAILADDRESS'
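Looking at the form source above again, the `name` attributes on the two inputs appear to be `log` and `pwd`, so those are presumably the control names mechanize is looking for. Here is a minimal sketch that lists the named inputs in that snippet using only the standard library (not mechanize itself). `FORM_HTML`, `InputNameCollector` and `input_names` are illustrative names of mine, and I added a closing `</form>` tag so the snippet stands on its own:

```python
from html.parser import HTMLParser

# The login form quoted above, with a closing </form> added for completeness.
FORM_HTML = """
<form name="loginform" id="loginform" action="https://skepticalinquirer.org/wp-login.php" method="post">
<p class="login-username">
<label for="user_login">Email or Username</label>
<input type="text" name="log" id="user_login" class="input" value="" size="20">
</p>
<p class="login-password">
<label for="user_pass">Password</label>
<input type="password" name="pwd" id="user_pass" class="input" value="" size="20">
</p>
</form>
"""


class InputNameCollector(HTMLParser):
    """Collect the name attribute of every <input> tag (illustrative helper)."""

    def __init__(self):
        super().__init__()
        self.names = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (attribute, value) pairs for the tag.
        if tag == "input":
            name = dict(attrs).get("name")
            if name:
                self.names.append(name)


def input_names(html):
    """Return the names of all <input> controls found in the given HTML."""
    collector = InputNameCollector()
    collector.feed(html)
    return collector.names


print(input_names(FORM_HTML))  # prints ['log', 'pwd']
```

If that reading is right, the recipe would presumably set `br['log']` and `br['pwd']` after `br.select_form(name='loginform')`, though I have not confirmed that against the live page.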
Here's the updated code as it stands now
Code:
import re, zipfile, os
from calibre.ptempfile import PersistentTemporaryDirectory
from calibre.ptempfile import PersistentTemporaryFile
from urllib.parse import urlparse, urlsplit


class TheSkepticalInquirer(BasicNewsRecipe):
    title = u'The Skeptical Inquirer'
    description = 'Investigation of fringe science and paranormal claims.'
    language = 'en'
    needs_subscription = True

    def build_index(self):
        br = self.get_browser()
        if self.username is not None and self.password is not None:
            br.open('https://skepticalinquirer.org/member-login/')
            br.select_form(name='loginform')
            br['HIDDEN'] = self.username
            br['HIDDEN'] = self.password
            br.submit()
        return br
        pdflink = br.find_link(text_regex=re.compile('https://skepticalinquirer.org/archive/'))
        # Cheat calibre's recipe method, as in post from Starsom17
        self.report_progress(0,_('downloading PDF'))
        response = br.follow_link(pdflink)
        dir = PersistentTemporaryDirectory()
        pdf_file = PersistentTemporaryFile(suffix='.pdf',dir=dir)
        pdf_file.write(response.read())
        pdf_file.close()
        # Get all formats into Calibre's database as one single book entry
        self.report_progress(0.6,_('Adding files to Calibre db'))
        cmd = "calibredb add -1 " + dir
        os.system(cmd)
        return index
Thanks again for your assistance here!