View Single Post
Old 10-08-2010, 03:00 PM   #15
Starson17
Wizard
Starson17 can program the VCR without an owner's manual.Starson17 can program the VCR without an owner's manual.Starson17 can program the VCR without an owner's manual.Starson17 can program the VCR without an owner's manual.Starson17 can program the VCR without an owner's manual.Starson17 can program the VCR without an owner's manual.Starson17 can program the VCR without an owner's manual.Starson17 can program the VCR without an owner's manual.Starson17 can program the VCR without an owner's manual.Starson17 can program the VCR without an owner's manual.Starson17 can program the VCR without an owner's manual.
 
Posts: 4,004
Karma: 177841
Join Date: Dec 2009
Device: WinMo: IPAQ; Android: HTC HD2, Archos 7o; Java:Gravity T
Quote:
Originally Posted by kidblue View Post
Is there an obvious example of the recipe you're describing - One where the page is extracted as opposed to using the browser?
Not that I know of, but it's easy to build:
1) let's grab the article url with print_version:
Code:
    def print_version(self, url):
        print 'print_v url is: ', url
        return url
By itself that does nothing, other than print the url of the article.

2) Now, let's turn it into a soup and print that:

Code:
    def print_version(self, url):
        print 'print_v url is: ', url
        soup = self.index_to_soup(url)
        print 'The soup is: ', soup
        return url
Again, this code does nothing other than print the article page source and the url of that page. You'd need to find what you want in the soup, and return that.

Edit: To be clear, I'm not saying this code will work in your case. If the article links are obfuscated, the browser method is needed.

Last edited by Starson17; 10-08-2010 at 03:05 PM.
Starson17 is offline   Reply With Quote