This was actually a problem for me too, and it may be a solution for you.
Code:
def preprocess_html(self, soup):
    self.log('\t checking for subscriber only content')
    # Look for nodes whose text is exactly 'Subscribers'
    denied = soup.findAll(True, text='Subscribers')
    # Debug output: show what was matched
    print(denied)
    return soup
I think that might do the trick.
Edit:
It will not do the trick. It will keep only the articles that need to be read as a subscriber. You need to invert the find: in other words, find an attribute that is constant across the other (non-subscriber) articles and match on that instead.
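To make the "invert the find" idea concrete, here is a rough sketch using only the standard library. It assumes the free articles all carry some constant marker; the class name 'full-text' and the helper names below are made up for illustration, so swap in whatever attribute you actually see in the non-subscriber pages. The same check could probably be done on the soup inside preprocess_html, but I have not tested that.

```python
from html.parser import HTMLParser

# Hypothetical marker: assume every free article has a tag with this class.
FREE_MARKER_CLASS = 'full-text'


class MarkerFinder(HTMLParser):
    """Records whether any start tag carries the given class."""

    def __init__(self, marker_class):
        HTMLParser.__init__(self)
        self.marker_class = marker_class
        self.found = False

    def handle_starttag(self, tag, attrs):
        # The class attribute may be missing or have no value, hence the 'or'.
        classes = (dict(attrs).get('class') or '').split()
        if self.marker_class in classes:
            self.found = True


def is_free_article(html, marker_class=FREE_MARKER_CLASS):
    """Return True if the page contains the marker found in free articles."""
    finder = MarkerFinder(marker_class)
    finder.feed(html)
    finder.close()
    return finder.found


def keep_free_articles(pages):
    """pages is a list of (url, html) pairs; return the URLs worth keeping."""
    return [url for url, html in pages if is_free_article(html)]
```

So instead of searching for the 'Subscribers' text and getting the locked articles back, you search for the thing the readable articles have in common and keep only those.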
Write back if you didn't get it (it's late here and I am not thinking straight).