Old 01-21-2013, 03:01 PM   #1
indianinva
Recipe request for Calibre

First of all, thank you to the folks at Calibre for this awesome software. Second, I want to request a recipe for ProQuest. ProQuest hosts tons of online sources, some of which are very expensive otherwise, e.g. the Wall Street Journal.

Is it possible to obtain the WSJ from ProQuest using Calibre? I have written some code which does that (sort of), but it uses Selenium, partly because I do not understand the authentication mechanism all that well.

Here is my code:

import time
from selenium import webdriver
from selenium.common.exceptions import WebDriverException
from selenium.webdriver.common.by import By

# Uses the Selenium 4 locator API: find_element(By.<strategy>, value).

def getnext():
    # Go to the next results page and tick "select all" there.
    driver.find_element(By.LINK_TEXT, "Next page").click()
    driver.find_element(By.ID, "mlcbAll").click()


url = 'http://search.proquest.com.proxy.xx.edu/publication/10482'
driver = webdriver.Firefox()
driver.get(url)
# Log in through the library proxy: user ID first, then password.
driver.find_element(By.ID, "UserIDinput").clear()
driver.find_element(By.ID, "UserIDinput").send_keys("xxx")
driver.find_element(By.CSS_SELECTOR, 'input[type="submit"]').click()
driver.find_element(By.ID, "passwordInput").clear()
driver.find_element(By.ID, "passwordInput").send_keys('xxxxxx')
driver.find_element(By.CSS_SELECTOR, 'input[type="submit"]').click()
driver.find_element(By.LINK_TEXT, "View most recent issue").click()
driver.find_element(By.ID, "mlcbAll").click()
# Select every page of results; stop when getnext() fails (no "Next page" link).
while True:
    try:
        getnext()
    except WebDriverException:
        break
driver.find_element(By.ID, "saveExportLink").click()
# Introduce a proper wait here; for now a fixed sleep while the export dialog loads.
time.sleep(100)
# Choose HTML as the export format and submit.
el = driver.find_element(By.NAME, "exportMode")
for option in el.find_elements(By.TAG_NAME, 'option'):
    if option.text == 'HTML':
        option.click()
driver.find_element(By.ID, "submitButton").click()
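
The fixed sleep before the export step either wastes time or turns out too short on a slow connection. A minimal sketch of the alternative, polling until a condition holds or a timeout expires. Note that `wait_until` is a hypothetical helper written for illustration and is not part of Selenium; Selenium itself provides `WebDriverWait` in `selenium.webdriver.support.ui` for the same purpose.

```python
import time


def wait_until(condition, timeout=100.0, poll=0.5):
    """Call `condition` repeatedly until it returns a truthy value.

    Returns that value, or raises TimeoutError once `timeout` seconds
    have passed. Exceptions raised by `condition` are treated as
    "not ready yet", e.g. an element that is not on the page so far.
    (Hypothetical helper, not a Selenium API.)
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            result = condition()
            if result:
                return result
        except Exception:
            pass  # not ready yet; try again after the poll interval
        if time.monotonic() >= deadline:
            raise TimeoutError("condition not met within %s seconds" % timeout)
        time.sleep(poll)
```

In the script above, the lookup of the `exportMode` element could be wrapped in a lambda and passed to `wait_until` in place of the sleep, assuming that element only appears once the export dialog has loaded.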
Old 01-14-2019, 07:06 AM   #2
qikstart
Updated Script

Thanks for sharing your script. I updated it somewhat to work with the current version of the site. I have not looked into a Calibre extension, but it's a good idea if it is possible.

Code:
#!/usr/bin/python

import time
from selenium import webdriver
from selenium.common.exceptions import WebDriverException
from selenium.webdriver.common.by import By

# Uses the Selenium 4 locator API: find_element(By.<strategy>, value).

def getnext():
    # Go to the next results page and tick "select all" there.
    driver.find_element(By.LINK_TEXT, "Next page").click()
    driver.find_element(By.ID, "mlcbAll").click()


url = 'https://search.proquest.com/publication.pubfull:searchmostrecentissue?t:ac=publications_10482'

# https://search.proquest.com/publication/54814/
driver = webdriver.Firefox()
# Load the page twice with a short pause in between.
driver.get(url)
time.sleep(3)
driver.get(url)
# Introduce a proper wait here; for now a fixed sleep while the results load.
time.sleep(5)
# Show 100 items per page.
el = driver.find_element(By.NAME, "itemsPerPage")

for option in el.find_elements(By.TAG_NAME, 'option'):
    if option.text == '100':
        option.click()

driver.find_element(By.ID, "mlcbAll").click()

# Select every page of results; stop when getnext() fails (no "Next page" link).
while True:
    try:
        getnext()
    except WebDriverException:
        break

# Open the "more" menu, choose save/export, then submit the export form.
driver.find_element(By.ID, "tsMore").click()
driver.find_element(By.ID, "saveExportLink_1").click()

driver.find_element(By.XPATH, "//*[contains(@id, 'submitButton')]").click()
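
The next-page loop above ends as soon as `getnext()` raises, so a genuine failure can look the same as reaching the last page. A rough sketch of a loop that treats only one specific error as the end of the results and also caps the page count. `NoNextPage`, `select_all_pages` and `max_pages` are hypothetical names used for illustration; in the Selenium script the end-of-results signal would presumably be `NoSuchElementException`.

```python
class NoNextPage(Exception):
    """Hypothetical stand-in for the error raised when no "Next page" link exists."""


def select_all_pages(go_next, max_pages=200):
    """Call `go_next` until it raises NoNextPage; return how many pages were advanced.

    Any other exception propagates, so real failures are not mistaken
    for the end of the results. `max_pages` guards against an endless loop.
    """
    advanced = 0
    while advanced < max_pages:
        try:
            go_next()
        except NoNextPage:
            break
        advanced += 1
    return advanced
```

Here `go_next` plays the role of `getnext()`; the returned count gives a quick sanity check on how many result pages were selected before exporting.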