Thread: web2lrf
Old 02-21-2008, 06:09 PM   #177
ddavtian
Addict
Posts: 274
Karma: 332
Join Date: Nov 2003
Location: San Francisco, USA
Device: Sage, Elipsa, Oasis, Galaxy Tab 8U, S22U
Get Full WSJ?

Hi guys.

I'm using the WSJ profile and it works very well (thanks to JTravers for the profile).

I have a quick question: is it possible to get all the articles from a page rather than from a feed? The RSS feed for "Today's Newspaper" has only 5 articles from the front page plus a few more from other sections. I'd like to get as many articles from the printed edition ("http://online.wsj.com/page/2_0133.html") as possible.

I replaced an existing link with this one, but got a blank page:
    def get_feeds(self):
        return [
            (' Today\'s Newspaper - All', 'http://online.wsj.com/page/2_0133.html'),
            ## (' Today\'s Newspaper - Page One', 'http://online.wsj.com/xml/rss/3_7205.xml'),
        ]

Any advice? I want all the links from the "http://online.wsj.com/page/2_0133.html" page that have "article" in their address. I don't think I need to change the clean-up part; the current profile does all the work there.
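
To make the idea concrete, here is a rough sketch of the link filtering I have in mind, written with only the Python standard library. It is not web2lrf profile code: the names ArticleLinkParser and extract_article_links are made up for illustration, the sample HTML is invented, and how (or whether) such a list of URLs can be handed to a profile is exactly the part I don't know.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class ArticleLinkParser(HTMLParser):
    """Hypothetical helper: collects <a href> URLs that contain a keyword."""

    def __init__(self, base_url, keyword="article"):
        super().__init__()
        self.base_url = base_url
        self.keyword = keyword
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if not href:
            return
        # Resolve relative links against the index page URL.
        url = urljoin(self.base_url, href)
        # Keep each matching URL once, in the order it appears on the page.
        if self.keyword in url and url not in self.links:
            self.links.append(url)


def extract_article_links(html, base_url, keyword="article"):
    """Return the unique links in `html` whose absolute URL contains `keyword`."""
    parser = ArticleLinkParser(base_url, keyword)
    parser.feed(html)
    return parser.links


# Invented sample markup standing in for the real index page.
sample = """
<a href="/article/SB1.html">One</a>
<a href="/page/2_0133.html">Index</a>
<a href="http://online.wsj.com/article/SB2.html">Two</a>
<a href="/article/SB1.html">One again</a>
"""
links = extract_article_links(sample, "http://online.wsj.com/page/2_0133.html")
```

With the real page, the HTML would have to be downloaded first (with the WSJ login the profile already handles), and the resulting URLs would then need to go through the same clean-up as the feed articles.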

This must be a simple question for Kovid, JTravers and others who have created their profiles.

Thanks in advance,
David