Old 11-08-2010, 03:56 PM   #4
marbs
this was actually a problem for me, but it could be a solution for you.

Code:
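    def preprocess_html(self, soup):
        # look for text nodes that read exactly 'Subscribers'
        self.log('\t checking for subscriber only content')
        denied = soup.findAll(True, text='Subscribers')
        # report the matches through the recipe log instead of a bare print
        self.log('\t subscriber markers found: %r' % denied)
        return soup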
i think that might do the trick.

edit:
it will not do the trick. that find only matches the articles that need a subscription to read. you need to invert the find: in other words, look for a constant attribute that shows up in the other (free) articles and keep those.
write back if you didn't get it (it's late here and i am not thinking straight).
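
a rough sketch of what i mean by inverting the find, in case it helps. this is plain python 3 using only the standard library, not calibre recipe code, so treat it as an illustration of the logic only. the names `AttributeFinder`, `is_free_article` and `keep_free_articles` are made up for the example, and `class="full-article"` is a hypothetical marker: you would have to look at the real pages to see which attribute the free articles actually share.

```python
from html.parser import HTMLParser


class AttributeFinder(HTMLParser):
    """Records whether any tag carries a given attribute name/value pair."""

    def __init__(self, attr_name, attr_value):
        super().__init__()
        self.attr_name = attr_name
        self.attr_value = attr_value
        self.found = False

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the current tag.
        # This is an exact match on the whole attribute value.
        if (self.attr_name, self.attr_value) in attrs:
            self.found = True


def is_free_article(html, attr_name='class', attr_value='full-article'):
    # 'class' / 'full-article' are hypothetical placeholders for whatever
    # constant attribute the free articles share on the real site.
    finder = AttributeFinder(attr_name, attr_value)
    finder.feed(html)
    return finder.found


def keep_free_articles(pages):
    # Keep a page only when the "free article" marker is present,
    # rather than hunting for the 'Subscribers' text.
    return [page for page in pages if is_free_article(page)]
```

the point is that the check is positive: a page is kept only if the marker you expect in a full article is there, so anything that is missing it (including subscriber-only teasers) drops out.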

Last edited by marbs; 11-08-2010 at 04:06 PM.