08-22-2011, 11:11 AM   #6
Starson17
Wizard
Quote:
Originally Posted by luis.nando
I guess I have to change something on the soup.findAll() function.
Yes, you have to make soup.findAll() find the links to your articles. The code you posted is looking for tags that have a class attribute of 'section-headline', 'story' or 'story headline'. If your pages don't mark their article links with those classes, it won't find anything.

1) Find the links on your pages.
2) Figure out how to identify them with BeautifulSoup.
3) Use Python string handling to build the article links.

If you have problems, ask questions here, but start by finding each of the links to articles on your page that you want parse_index to identify, and figure out how to locate them all by tag name, class attribute, etc. If you explain in words how the links can be recognized, we can help you write the code.
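
To make the three steps concrete, here is a rough sketch that runs outside calibre. Everything about the page is made up: the sample HTML, the 'headline' class, the h2 tag and the example.com base URL are assumptions for illustration, not your site. Python's standard html.parser stands in for BeautifulSoup so the snippet has no dependencies; in a real recipe the search would be something along the lines of soup.findAll('h2', attrs={'class': 'headline'}) on the soup returned by self.index_to_soup(). The result has the shape parse_index should return: a list of (section title, list of article dicts) tuples.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

# Hypothetical front page. The tag names and class values are invented;
# replace them with whatever actually marks the article links on your site.
SAMPLE_HTML = """
<div class="menu"><a href="/about">About</a></div>
<h2 class="headline"><a href="/news/first-story.html">First story</a></h2>
<h2 class="headline"><a href="/news/second-story.html">Second story</a></h2>
"""

BASE_URL = "http://www.example.com/"  # assumed site root


class HeadlineLinkFinder(HTMLParser):
    """Steps 1 and 2: collect (href, text) for <a> tags inside <h2 class="headline">."""

    def __init__(self):
        super().__init__()
        self.in_headline = False
        self.current_href = None
        self.text_parts = []
        self.links = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "h2" and "headline" in (attrs.get("class") or "").split():
            self.in_headline = True
        elif tag == "a" and self.in_headline and attrs.get("href"):
            # Start collecting the link text for this article link.
            self.current_href = attrs["href"]
            self.text_parts = []

    def handle_data(self, data):
        if self.current_href is not None:
            self.text_parts.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self.current_href is not None:
            title = "".join(self.text_parts).strip()
            self.links.append((self.current_href, title))
            self.current_href = None
        elif tag == "h2":
            self.in_headline = False


def build_index(html, base_url, section_title="News"):
    """Step 3: turn the relative hrefs into full URLs and build the index."""
    finder = HeadlineLinkFinder()
    finder.feed(html)
    articles = [
        {"title": title, "url": urljoin(base_url, href), "description": "", "date": ""}
        for href, title in finder.links
    ]
    return [(section_title, articles)]


index = build_index(SAMPLE_HTML, BASE_URL)
for section, articles in index:
    for article in articles:
        print(section, article["title"], article["url"])
```

Note that the menu link is skipped because it isn't inside a tag with the 'headline' class. That is the whole job of step 2: pick a tag name and class that match your article links and nothing else.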