08-29-2010, 04:21 PM   #2556
TonytheBookworm
Quote:
Originally Posted by Starson17
Let's start at:



'title' : article title,
'url' : URL of article,
'date' : The publication date of the article as a string,
'description' : A summary of the article

I suggest you search the recipes for "parse_index." There are dozens of examples of how this is done.
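For reference, here is a rough sketch of the structure those keys end up in. As far as I understand it, parse_index hands back a list of (section title, list of article dictionaries) pairs. The host name and the empty date/description values below are placeholders I made up for illustration; only the four keys come from the quote above.

```python
def build_index():
    # One article dictionary with the four keys listed in the quote.
    # The host (www.example.com) is a placeholder, not the real site.
    article = {
        'title': 'Bestul: Transition Time',
        'url': 'http://www.example.com/blogs/hunting/2010/08/bestul-transition-time',
        'date': '',         # publication date as a string, left empty here
        'description': '',  # summary of the article, left empty here
    }
    # A list of (section title, [article dicts]) tuples, one tuple per section.
    return [('Whitetail365', [article])]


index = build_index()
for section_title, articles in index:
    print(section_title, len(articles))
```

In a real recipe the same list would be built up inside parse_index from whatever the page's soup contains, rather than hard-coded like this.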
I'm looking at The Atlantic and I have a general idea of how this works, but I'm kind of stuck. I'm trying to get the section title from this markup:
Spoiler:
Code:
<div class="right"><h2><a href="/blogs/whitetail-365">Whitetail365</a><a href="http://feeds.feedburner.com/Whitetail365" class="rss"><img src="/misc/feed.png"></a></h2><p class="description">All Deer, All the Time from Whitetails columnist Scott Bestul</p><h3>Recent Posts</h3><div class="item-list"><ul><li class="first even"><a href="/blogs/hunting/2010/08/bestul-transition-time">Bestul: Transition Time</a></li>
<li class=" odd"><a href="/blogs/hunting/2010/08/hurteau-write-caption-win-deer-call">Hurteau: Write a Caption, Win a Deer Call </a></li>
<li class="last even"><a href="/blogs/hunting/2010/08/guest-blog-5-reasons-plant-food-plots-now">Guest Blog: 5 Reasons To Plant Food Plots Now </a></li>
</ul></div><a href="/blogs/whitetail-365">Read All Posts</a></div>

In the example they used
Code:
sectit = soup.find('h1', attrs={'class':'sectionTitle'})
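So I understand that this looks for the first h1 tag with class="sectionTitle" (find returns only the first match, not all of them).
But in my case I only have a link (an a tag with an href) inside the h2 tags, with no class to match on. Sorry for all the questions; I'm just trying to learn.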
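One way to think about it: the section title is just the text of the first link inside the h2. Here is a minimal sketch of that idea using only Python's standard html.parser, so it runs outside calibre. The class and function names (SectionTitleParser, section_title) are my own, not anything from calibre. Inside a recipe the equivalent would presumably be soup.find('h2'), then .find('a') on the result, with self.tag_to_string() to get the text.

```python
from html.parser import HTMLParser


class SectionTitleParser(HTMLParser):
    """Collects the text of the first <a> found inside an <h2>."""

    def __init__(self):
        super().__init__()
        self.in_h2 = False
        self.in_link = False
        self.done = False
        self.title = ''

    def handle_starttag(self, tag, attrs):
        if self.done:
            return  # already have the title; ignore the rss link, img, etc.
        if tag == 'h2':
            self.in_h2 = True
        elif tag == 'a' and self.in_h2:
            self.in_link = True

    def handle_endtag(self, tag):
        if tag == 'a' and self.in_link:
            # First link inside the h2 has closed, so the title is complete.
            self.in_link = False
            self.done = True
        elif tag == 'h2':
            self.in_h2 = False

    def handle_data(self, data):
        if self.in_link and not self.done:
            self.title += data


def section_title(html):
    parser = SectionTitleParser()
    parser.feed(html)
    return parser.title.strip()


# Trimmed copy of the markup from the post above.
sample = ('<div class="right"><h2><a href="/blogs/whitetail-365">Whitetail365</a>'
          '<a href="http://feeds.feedburner.com/Whitetail365" class="rss">'
          '<img src="/misc/feed.png"></a></h2></div>')
print(section_title(sample))
```

The point is that no class attribute is needed: matching on the tag nesting (h2, then its first a) is enough to skip the second rss link.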

Last edited by TonytheBookworm; 08-29-2010 at 04:37 PM.