08-10-2011, 04:22 PM   #18
Starson17
Wizard
Quote:
Originally Posted by yoss15
I would really appreciate it. For starters what does the for section in soup.findAll line do?
The job of parse_index is to look at a page and find the links on that page that point to articles. The "for section in soup.findAll" line is the beginning of that process: it "finds all" the tags that are supposed to contain a link to an article. Do you know what a <div> tag is? As written, that line finds every part of the page that is tagged <div class="content">

I'll be nice and look at your page - hold on ....

There aren't any div tags like that on your page, so that loop has nothing to work on.

You should probably be doing something like this:
Code:
for section in soup.findAll('li'):
Then, inside that loop, something like:
Code:
for post in section.findAll('a', href=True):
The first loop finds every <li> tag on the page, and the second finds every <a> tag inside each one that has an href.
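If you want to see what those two loops pick up without running a whole recipe, here is a rough sketch that does the same kind of lookup using only Python's standard html.parser module. To be clear, this is not the BeautifulSoup API the recipe uses: the ListLinkCollector class and the SAMPLE_HTML page are made up for illustration, and it assumes every <li> on the page has a closing tag.

```python
from html.parser import HTMLParser


class ListLinkCollector(HTMLParser):
    """Collect (text, href) for every <a> with an href found inside an <li>.

    Hypothetical helper for illustration only; it is not part of calibre
    or BeautifulSoup.
    """

    def __init__(self):
        super().__init__()
        self.li_depth = 0          # > 0 while we are inside at least one <li>
        self.current_href = None   # href of the <a> currently being read
        self.current_text = []     # text pieces seen inside that <a>
        self.links = []            # collected (text, href) pairs

    def handle_starttag(self, tag, attrs):
        if tag == 'li':
            self.li_depth += 1
        elif tag == 'a' and self.li_depth > 0:
            href = dict(attrs).get('href')
            if href is not None:   # roughly what href=True asks for
                self.current_href = href
                self.current_text = []

    def handle_data(self, data):
        if self.current_href is not None:
            self.current_text.append(data)

    def handle_endtag(self, tag):
        if tag == 'a' and self.current_href is not None:
            title = ''.join(self.current_text).strip()
            self.links.append((title, self.current_href))
            self.current_href = None
        elif tag == 'li' and self.li_depth > 0:
            self.li_depth -= 1


# Made-up page: one link outside any <li>, one <a> with no href.
SAMPLE_HTML = """
<div id="nav"><a href="/about">About</a></div>
<ul>
  <li><a href="/posts/1">First post</a></li>
  <li><a name="anchor">No link here</a></li>
  <li><a href="/posts/2">Second post</a></li>
</ul>
"""

collector = ListLinkCollector()
collector.feed(SAMPLE_HTML)
for title, href in collector.links:
    print(title, href)
```

The "About" link is skipped because it is not inside an <li>, and the <a name="anchor"> tag is skipped because it has no href. In the recipe itself, each title and href found this way is what you would turn into an article entry.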