Recipe request for Calibre

indianinva · 01-21-2013, 03:01 PM

First of all, folks at Calibre, thank you so much for this awesome software. Second, I would like to request a recipe for ProQuest. ProQuest hosts a large number of online sources, some of which are otherwise very expensive, e.g. the Wall Street Journal.

Is it possible to obtain the WSJ from ProQuest using Calibre? I have written some code that does this (sort of), but it uses Selenium, partly because I do not understand the authentication mechanism all that well.

Here is my code:

import time
from selenium import webdriver
from selenium.webdriver.common.keys import Keys

def getnext():
    # Go to the next results page and tick "select all" on it.
    driver.find_element_by_link_text("Next page").click()
    driver.find_element_by_id("mlcbAll").click()

url = 'http://search.proquest.com.proxy.xx.edu/publication/10482'
driver = webdriver.Firefox()
driver.get(url)

# Log in through the proxy: user ID first, then password.
driver.find_element_by_id("UserIDinput").clear()
driver.find_element_by_id("UserIDinput").send_keys("xxx")
driver.find_element_by_css_selector("input[type=\"submit\"]").click()
driver.find_element_by_id("passwordInput").clear()
driver.find_element_by_id("passwordInput").send_keys('xxxxxx')
driver.find_element_by_css_selector("input[type=\"submit\"]").click()

# Open the latest issue and select every article on the first page.
driver.find_element_by_link_text("View most recent issue").click()
driver.find_element_by_id("mlcbAll").click()

# Keep paging until getnext() fails, i.e. there is no further page.
errorcode = 0
while errorcode == 0:
    try:
        getnext()
    except Exception:
        errorcode = 1

driver.find_element_by_id("saveExportLink").click()
## Introduce wait here
time.sleep(100)

# Choose HTML as the export format and submit.
el = driver.find_element_by_name("exportMode")
for option in el.find_elements_by_tag_name('option'):
    if option.text == 'HTML':
        option.click()
driver.find_element_by_id("submitButton").click()
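
One possible way to avoid the fixed time.sleep(100) above is to poll until the export form actually shows up. The sketch below is plain Python and not part of the original script: wait_until is a hypothetical helper name, and the 100-second timeout in the usage comment simply mirrors the original sleep. Selenium also ships its own WebDriverWait class for this purpose, which may well be the better choice in practice.

```python
import time


def wait_until(condition, timeout=30.0, poll=0.5):
    """Call condition() repeatedly until it returns a truthy value.

    Returns that value, or raises TimeoutError once `timeout` seconds
    have passed. Exceptions raised by condition() are treated as
    "not ready yet" and retried.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            result = condition()
            if result:
                return result
        except Exception:
            pass  # e.g. the element is not in the page yet
        if time.monotonic() >= deadline:
            raise TimeoutError("condition not met within %s seconds" % timeout)
        time.sleep(poll)


# Hypothetical usage in the script above, instead of time.sleep(100):
# el = wait_until(lambda: driver.find_element_by_name("exportMode"), timeout=100)
```

The script then continues as soon as the element is found, and fails with a clear error if it never appears.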

qikstart · 01-14-2019, 07:06 AM

Thanks for sharing your script. I updated it somewhat to work with the current version of the site. I have not looked into a Calibre extension, but it's a good idea if it is possible.

Code:

#!/usr/bin/python

import time
from selenium import webdriver
from selenium.webdriver.common.keys import Keys

def getnext():
	# Go to the next results page and tick "select all" on it.
	driver.find_element_by_link_text("Next page").click()
	driver.find_element_by_id("mlcbAll").click()
	return

url='https://search.proquest.com/publication.pubfull:searchmostrecentissue?t:ac=publications_10482'

# https://search.proquest.com/publication/54814/
driver = webdriver.Firefox()
# The URL is loaded twice with a short pause in between.
driver.get(url)
time.sleep(3)
driver.get(url)
##Introduce wait here
time.sleep(5)
print("hello")

# Show 100 results per page.
el = driver.find_element_by_name("itemsPerPage")
for option in el.find_elements_by_tag_name('option'):
	if option.text == '100':
		option.click()

# Select every article on the first page, then on each following page.
driver.find_element_by_id("mlcbAll").click()
errorcode=0

# Keep paging until getnext() fails, i.e. there is no further page.
while errorcode ==0:
	try:
		getnext()
	except Exception:
		errorcode=1

# Open the export dialog and submit it.
driver.find_element_by_id("tsMore").click()
driver.find_element_by_id("saveExportLink_1").click()
driver.find_elements_by_xpath("//*[contains(@id, 'submitButton')]")[0].click()
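
Both scripts page through the results by clicking "Next page" until something raises, which also hides unrelated errors. A minimal sketch of the same loop with the stop condition made explicit is shown below. It is plain Python and only an illustration: select_all_pages, select_all, go_next, stop_on and max_pages are hypothetical names, not anything from Selenium or ProQuest. With Selenium, stop_on would most likely be NoSuchElementException from selenium.common.exceptions.

```python
def select_all_pages(select_all, go_next, stop_on=(Exception,), max_pages=1000):
    """Tick "select all" on every results page.

    select_all: callable that selects all items on the current page.
    go_next:    callable that moves to the next page and raises one of
                the exceptions in stop_on when there is no next page.
    Returns the number of pages visited.
    """
    pages = 0
    while pages < max_pages:
        select_all()
        pages += 1
        try:
            go_next()
        except stop_on:
            break  # no further page: stop paging
    return pages


# Hypothetical usage with the driver from the script above:
# from selenium.common.exceptions import NoSuchElementException
# select_all_pages(
#     lambda: driver.find_element_by_id("mlcbAll").click(),
#     lambda: driver.find_element_by_link_text("Next page").click(),
#     stop_on=(NoSuchElementException,),
# )
```

An error while selecting items now surfaces instead of silently ending the loop, and max_pages guards against paging forever.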

