09-01-2010, 09:17 AM   #2587
Starson17
Wizard
Quote:
Originally Posted by TonytheBookworm
The reason I had done the extra for loop was to snag the title ....
When I get a chance, I'll try to look this over, but as I'm sure you are aware, there's no substitute for a careful look at the structure of the page you are scraping. If an extra for loop works for you, that's fine.
Quote:
Also, I'm noticing that there are <span> tags inside the <p> tags, so when I do a search for the <a> inside the <p> I get the dang links for the ads instead of the last <a> tag... this one, I tell you, is really working the brain.
Again, there may be a better way to locate your <a> tag by carefully studying the source page structure, but if you don't see one, you can simply test each <a> tag you find. Beautiful Soup gives every tag a "parent" attribute, so you can check whether the <a> tag is embedded in a <span> tag. If the parent of the <a> tag is a <span> tag, skip it and move on to the next <a> tag, and so on.
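Here's a rough sketch of that parent test, in case it helps. I haven't seen your page, so the HTML below is made up (the URLs and the links_outside_spans helper are hypothetical), and it assumes the current bs4 package; older Beautiful Soup 3 code would spell find_all as findAll.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for the real page: the ad links sit
# inside <span> tags, and the link we want is a direct child of the <p>.
SAMPLE_HTML = """
<p>
  <span><a href="http://ads.example.com/1">Ad one</a></span>
  <span><a href="http://ads.example.com/2">Ad two</a></span>
  <a href="http://example.com/article">Real article</a>
</p>
"""


def links_outside_spans(paragraph):
    """Return the <a> tags in `paragraph` whose immediate parent is not a <span>."""
    return [a for a in paragraph.find_all("a") if a.parent.name != "span"]


soup = BeautifulSoup(SAMPLE_HTML, "html.parser")
wanted = links_outside_spans(soup.find("p"))
for a in wanted:
    print(a["href"], a.get_text())
```

This only looks at the immediate parent. If the ad links on your page are nested more deeply inside the <span>, something like a.find_parent("span") is None would probably be the test to use instead.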

I'll leave you to play with that. I'm sure a closer look at your code and the page you're scraping would let me make better comments, but I'm short on time today. Good Luck!