05-07-2009, 03:55 PM   #1
kbfprivate
Junior Member
Posts: 7
Karma: 10
Join Date: May 2009
Device: iPod touch
Downloading previous issues of Newsweek

I have been terribly busy the last few weeks and haven't had a chance to even crack open a Newsweek in the last month. How hard is it to modify the Newsweek download script to download the last 4 issues?

I downloaded this week's issue and it looks great.

Thanks!
Noah
05-07-2009, 04:59 PM   #2
kovidgoyal
creator of calibre
Posts: 45,251
Karma: 27110894
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
Shouldn't be too bad if the Newsweek website has links to back issues.
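
(For anyone wanting to automate the "find the back issues" step, here is a rough sketch of what that lookup could look like inside the recipe. The starting page and the '/id/<number>' link pattern are assumptions based only on the back-issue URL that comes up later in this thread, not on Newsweek's actual markup, so treat it as a starting point rather than working code.)

import re

def get_back_issue_urls(self, count=4):
    # Hypothetical helper, not part of the stock Newsweek recipe: collect links
    # that look like issue pages and keep the first `count` of them.
    soup = self.index_to_soup('http://www.newsweek.com')  # assumed starting page
    urls = []
    for a in soup.findAll('a', href=re.compile(r'/id/\d+')):  # assumed URL pattern
        href = a['href']
        if href.startswith('/'):
            href = 'http://www.newsweek.com' + href
        if href not in urls:
            urls.append(href)
        if len(urls) >= count:
            break
    return urls

Each URL collected this way could then be fed to self.index_to_soup() in place of the current-issue lookup, one recipe run per issue.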
05-07-2009, 05:20 PM   #3
kbfprivate
Junior Member
Posts: 7
Karma: 10
Join Date: May 2009
Device: iPod touch
It looks like all I need to do is modify this code:

def get_current_issue(self):
    #from urllib2 import urlopen # For some reason mechanize fails
    #home = urlopen('http://www.newsweek.com').read()
    soup = self.index_to_soup('http://www.newsweek.com')#BeautifulSoup(home)
    img = soup.find('img', alt='Current Magazine')
    if img and img.parent.has_key('href'):
        return self.index_to_soup(img.parent['href'])

Can I change "return self.index_to_soup(img.parent['href'])" to be the URL of a previous issue and then re-run the script?

Thanks!
Noah
05-07-2009, 05:21 PM   #4
kbfprivate
Junior Member
Posts: 7
Karma: 10
Join Date: May 2009
Device: iPod touch
Actually, can't I just comment out everything and just have a return statement? I don't know the comment character.
05-07-2009, 05:23 PM   #5
kovidgoyal
creator of calibre
Posts: 45,251
Karma: 27110894
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
yes you can and the comment character is #
05-07-2009, 11:53 PM   #6
kbfprivate
Junior Member
Posts: 7
Karma: 10
Join Date: May 2009
Device: iPod touch
Quote:
Originally Posted by kovidgoyal View Post
yes you can and the comment character is #
I tried to change it to:

def get_current_issue(self):
    #from urllib2 import urlopen # For some reason mechanize fails
    #home = urlopen('http://www.newsweek.com').read()
    #soup = self.index_to_soup('http://www.newsweek.com/id/195141')#BeautifulSoup(home)
    #img = soup.find('img', alt='Current Magazine')
    #if img and img.parent.has_key('href'):
    return 'http://www.newsweek.com/id/195141'

But it gives me:

Job: **Fetch news from Newsweek20090504**
**tuple**: ('TypeError', u'find() takes no keyword arguments')
**Traceback**:
Traceback (most recent call last):
File "parallel.py", line 958, in worker
File "parallel.py", line 916, in work
File "C:\Program Files\calibre\library.zip\calibre\ebooks\epub\from_feeds.py", line 66, in main
File "C:\Program Files\calibre\library.zip\calibre\ebooks\epub\from_feeds.py", line 37, in convert
File "calibre\web\feeds\main.pyo", line 152, in run_recipe
File "calibre\web\feeds\news.pyo", line 567, in download
File "calibre\web\feeds\news.pyo", line 691, in build_index
File "c:\docume~1\admini~1\locals~1\temp\calibre_0.5.10_l6nhsr_recipes\recipe0.py", line 78, in parse_index
TypeError: find() takes no keyword arguments


Any ideas?
05-07-2009, 11:58 PM   #7
kbfprivate
Junior Member
Posts: 7
Karma: 10
Join Date: May 2009
Device: iPod touch
Never mind, it should have been:

return self.index_to_soup('http://www.newsweek.com/id/195141')

Seems to work, except it gets the current cover, but that is easily fixable. Thanks for a great program!

-Noah
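
(Tying the thread together: a minimal sketch of the edited method. The key point is that get_current_issue() has to return the parsed page, not a bare URL string; returning a string is most likely what triggered the earlier "TypeError: find() takes no keyword arguments", since parse_index() then ended up calling the plain string's find(), which takes no keyword arguments, instead of BeautifulSoup's find(). The issue id below is the one used in this thread; to pull the last four issues you would swap in each back issue's id and re-run the recipe once per issue.)

def get_current_issue(self):
    # Return the parsed back-issue page rather than its URL, so that
    # parse_index() gets a soup object whose find() accepts keyword arguments.
    return self.index_to_soup('http://www.newsweek.com/id/195141')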