01-21-2011, 05:34 PM   #1
spedinfargo
Groupie
spedinfargo is the king of the Divan.
 
Posts: 155
Karma: 106422
Join Date: Nov 2010
Device: none
For Testing: Roger Ebert (movie reviews) Recipe

Felt like a good afternoon to learn Python, so I threw together a Roger Ebert recipe. Feel free to pull it down and give me some feedback...

A few notes:
1) There is an RSS feed, but it's terrible, so I had to go the parse_index route.

2) The HTML is kind of a mess, so I couldn't figure out a good way to use BeautifulSoup and fell back on regexes, which are kind of messy too. Hopefully they hold up.

3) I'm getting some strange characters in some of the articles. I don't know yet whether it's an encoding problem or something else.

4) Roger spends a ton of time on his blog lately. I want to pull that in eventually, but there isn't a printer-friendly version of any of his posts. Some of his web site is pretty much abandoned (esp. Movie Answer Man), and sometimes the main site links to his blog posts. I tried to filter those out, but once in a while you'll see a title of "Ebert Journal Post" with only an intro paragraph. Once I incorporate his blog posts into the recipe, this will hopefully go away...
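
For anyone curious what the parse_index route from note 1 boils down to, here's a rough standalone sketch (not the actual recipe code). It pulls links out of an index page with a regex, skips the ones that point at blog posts, and returns the list of (section title, list of article dicts) that calibre expects back from parse_index. The sample HTML, both regex patterns, the example.com URLs and the build_index name are all made up for illustration. In a real recipe this logic would live inside the parse_index method of a BasicNewsRecipe subclass.

```python
import re

# Made-up sample markup; the real index page's HTML is not shown in this post.
SAMPLE_HTML = '''
<td><a href="/reviews/example-movie-one">Example Movie One</a></td>
<td><a href="/reviews/example-movie-two">Example   Movie
Two</a></td>
<td><a href="http://blogs.example.com/journal/some-post">Ebert Journal Post</a></td>
'''

# Hypothetical patterns, not the ones used in the actual recipe.
LINK_RE = re.compile(r'<a\s+href="([^"]+)"[^>]*>(.*?)</a>', re.IGNORECASE | re.DOTALL)
BLOG_RE = re.compile(r'^https?://blogs\.', re.IGNORECASE)


def build_index(html, base_url, section='Reviews'):
    """Return data shaped like the result of a calibre parse_index():
    a list of (section title, list of article dicts)."""
    articles = []
    for href, raw_title in LINK_RE.findall(html):
        if BLOG_RE.match(href):
            continue  # skip links that point at blog posts
        title = ' '.join(raw_title.split())  # collapse stray whitespace and newlines
        # Turn site-relative links into absolute URLs.
        url = href if href.startswith('http') else base_url.rstrip('/') + href
        articles.append({'title': title, 'url': url, 'description': '', 'date': ''})
    return [(section, articles)]


index = build_index(SAMPLE_HTML, 'http://www.example.com')
for section, articles in index:
    for article in articles:
        print(section, '|', article['title'], '|', article['url'])
```

The blog filter here only looks at the link URL, so a blog post that gets linked through a main-site URL would still slip through, which is the same kind of leak described in note 4.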

The download is in the next message in this thread...