#1
Junior Member
Posts: 7
Karma: 10
Join Date: May 2009
Device: ipod touch
Downloading previous issues of Newsweek
I have been terribly busy the last few weeks and haven't had a chance to even crack open a Newsweek in the last month. How hard is it to modify the Newsweek download script to download the last 4 issues?
I downloaded this week's issue and it looks great. Thanks! Noah
#2
creator of calibre
Posts: 45,251
Karma: 27110894
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
Shouldn't be too bad if the Newsweek website has links to back issues.
#3
Junior Member
Posts: 7
Karma: 10
Join Date: May 2009
Device: ipod touch
It looks like all I need to do is modify this code:
def get_current_issue(self):
    #from urllib2 import urlopen # For some reason mechanize fails
    #home = urlopen('http://www.newsweek.com').read()
    soup = self.index_to_soup('http://www.newsweek.com')#BeautifulSoup(home)
    img = soup.find('img', alt='Current Magazine')
    if img and img.parent.has_key('href'):
        return self.index_to_soup(img.parent['href'])

Can I change "return self.index_to_soup(img.parent['href'])" to be the URL of a previous issue and then re-run the script?

Thanks! Noah
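In other words, something like this? (Just a sketch on my part; the id in the URL is a placeholder for whichever back issue I want.)

def get_current_issue(self):
    # Skip the homepage scrape entirely and load one back issue directly.
    # Placeholder URL: substitute the address of a real previous issue.
    return self.index_to_soup('http://www.newsweek.com/id/XXXXXX')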
#4
Junior Member
Posts: 7
Karma: 10
Join Date: May 2009
Device: ipod touch
Actually, can't I just comment out everything and just have a return statement? I don't know the comment character.
#5
creator of calibre
Posts: 45,251
Karma: 27110894
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
Yes you can, and the comment character is #.
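For example, in plain Python:

# A leading '#' makes the interpreter ignore the rest of the line
#print('this line never runs')
print('this line still runs')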
#6
Junior Member
Posts: 7
Karma: 10
Join Date: May 2009
Device: ipod touch
I tried to change it to:
def get_current_issue(self):
    #from urllib2 import urlopen # For some reason mechanize fails
    #home = urlopen('http://www.newsweek.com').read()
    #soup = self.index_to_soup('http://www.newsweek.com/id/195141')#BeautifulSoup(home)
    #img = soup.find('img', alt='Current Magazine')
    #if img and img.parent.has_key('href'):
    return 'http://www.newsweek.com/id/195141'

But it gives me:

Job: Fetch news from Newsweek20090504
('TypeError', u'find() takes no keyword arguments')

Traceback (most recent call last):
  File "parallel.py", line 958, in worker
  File "parallel.py", line 916, in work
  File "C:\Program Files\calibre\library.zip\calibre\ebooks\epub\from_feeds.py", line 66, in main
  File "C:\Program Files\calibre\library.zip\calibre\ebooks\epub\from_feeds.py", line 37, in convert
  File "calibre\web\feeds\main.pyo", line 152, in run_recipe
  File "calibre\web\feeds\news.pyo", line 567, in download
  File "calibre\web\feeds\news.pyo", line 691, in build_index
  File "c:\docume~1\admini~1\locals~1\temp\calibre_0.5.10_l6nhsr_recipes\recipe0.py", line 78, in parse_index
TypeError: find() takes no keyword arguments

Any ideas?
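The cause, in hindsight: get_current_issue now returns a bare URL string, and parse_index calls .find() with keyword arguments on whatever it returns, so the call hits the string method find(), which only takes positional arguments. Assuming that is what recipe0.py line 78 is doing, a minimal reproduction outside calibre:

url = 'http://www.newsweek.com/id/195141'
try:
    # This resolves to str.find(), not BeautifulSoup's find()
    url.find('img', alt='Current Magazine')
except TypeError as e:
    print(e)  # find() takes no keyword arguments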
#7
Junior Member
Posts: 7
Karma: 10
Join Date: May 2009
Device: ipod touch
Never mind, it should have been:
return self.index_to_soup('http://www.newsweek.com/id/195141')

Seems to work, except it still gets the current cover, but that is easily fixable. Thanks for a great program!

-Noah
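For anyone landing on this thread later, the whole modified method then reduces to something like this (a sketch assembled from the posts above, not an official recipe):

def get_current_issue(self):
    # Load one specific back issue instead of scraping the
    # 'Current Magazine' link off the homepage. index_to_soup()
    # downloads and parses the page, returning the soup object
    # that the rest of the recipe expects (not a bare URL string).
    return self.index_to_soup('http://www.newsweek.com/id/195141')

Swap in a different issue URL and re-run the download once per back issue you want.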