Thread: web2lrf
01-09-2008, 10:32 PM   #124
secretsubscribe
Enthusiast
Posts: 26
Karma: 11777
Join Date: Jun 2007
Location: Brooklyn
Device: PRS-500,Treo 750, Archos 605 Wifi
Profile for TheNation.com

Hello
I'm developing a profile that logs in to thenation.com and downloads articles.
The Nation doesn't have an RSS feed for its monthly articles. It has feeds for Most Emailed, Top Stories, etc., but I want to download the current month's "Magazine."
What's helpful is that the month's articles (those included in print AND web-only articles) are listed at http://www.thenation.com/issue/YYYYMMDD
The individual articles are located at http://www.thenation.com/doc/YYYYMMDD/author_name.
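
To illustrate the scraping step, here is a minimal sketch of pulling the article links out of an issue page using the two URL patterns above. The function names, the regular expression, and the assumption that the links appear as plain href="..." attributes are mine for illustration; the real page markup may differ.

```python
import re

SITE = "http://www.thenation.com"

# Assumed link shape: href="/doc/YYYYMMDD/author_name" (relative or absolute).
ARTICLE_RE = re.compile(
    r'href="((?:http://www\.thenation\.com)?/doc/(\d{8})/[^"/?#]+)"'
)


def issue_url(date_str):
    """Build the issue page URL for a YYYYMMDD date string."""
    return "%s/issue/%s" % (SITE, date_str)


def extract_article_urls(html, date_str):
    """Return absolute article URLs for the given issue, in page order."""
    urls = []
    for match in ARTICLE_RE.finditer(html):
        href, date = match.groups()
        if date != date_str:
            continue  # skip links that point into other issues
        url = href if href.startswith("http") else SITE + href
        if url not in urls:  # the same article may be linked more than once
            urls.append(url)
    return urls
```

The HTML passed to extract_article_urls would be whatever the profile fetched from issue_url(...) after logging in.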

So I was able to scrape out all the URLs for the articles.
Then, in trying to figure out what to do next, I decided to take those URLs and create an RSS XML file on my local drive (c:\program files\libprs500\nation.xml),
which I then returned at the end of the profile:
return [('feed1','file:///c:/program%20files/libprs500/nation.xml')]
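
For anyone following along, the file-generation step could look roughly like the sketch below. It only uses the Python standard library; the write_feed name, the channel text, and the placeholder item titles (taken from the last URL segment) are assumptions for illustration, not part of the actual profile.

```python
import xml.etree.ElementTree as ET
from pathlib import Path


def write_feed(article_urls, feed_path):
    """Write a bare-bones RSS 2.0 file and return the feed list for the profile."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = "The Nation"
    ET.SubElement(channel, "link").text = "http://www.thenation.com/"
    ET.SubElement(channel, "description").text = "Articles scraped from the issue page"

    for url in article_urls:
        item = ET.SubElement(channel, "item")
        # Placeholder title: the last path segment (author_name) until real
        # titles and descriptions are extracted.
        ET.SubElement(item, "title").text = url.rsplit("/", 1)[-1]
        ET.SubElement(item, "link").text = url

    path = Path(feed_path).resolve()
    ET.ElementTree(rss).write(str(path), encoding="utf-8", xml_declaration=True)

    # as_uri() percent-encodes spaces, e.g. "program files" -> "program%20files".
    return [("feed1", path.as_uri())]
```

Letting as_uri() build the file:/// URL avoids hand-escaping the space in "program files".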

It worked!
Now I need to figure out how to extract the article titles and descriptions and make the proper replacements to get the print versions of the articles instead.

But the main reason I'm posting is to ask whether creating and accessing a local RSS file is the way to go. It would be a lot more convenient for anyone interested if the profile script didn't have to worry about generating files and directory structures.
I just started looking at this a few days ago, and it's the first time I've tried my hand at Python, so thanks in advance for any help.
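
To make the question concrete: one way to keep the local-file approach without depending on a fixed directory might be to write the feed to a temporary file, as in this rough sketch. The feed_to_temp_file name and the file prefix are made up for illustration, and this still creates a file; it just lets the operating system pick the location.

```python
import os
import tempfile
from pathlib import Path


def feed_to_temp_file(feed_xml):
    """Write RSS text to an OS-chosen temp file and return the feed list."""
    # mkstemp creates the file in the system temp directory, so the profile
    # does not need to know about c:\program files\libprs500 or create folders.
    fd, name = tempfile.mkstemp(prefix="nation_", suffix=".xml")
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        handle.write(feed_xml)
    return [("feed1", Path(name).resolve().as_uri())]
```

The temp file is not deleted automatically, so the caller would need to remove it once the conversion is finished.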