View Single Post
Old 12-14-2011, 12:50 PM   #2
Barty
doofus
Barty ought to be getting tired of karma fortunes by now.Barty ought to be getting tired of karma fortunes by now.Barty ought to be getting tired of karma fortunes by now.Barty ought to be getting tired of karma fortunes by now.Barty ought to be getting tired of karma fortunes by now.Barty ought to be getting tired of karma fortunes by now.Barty ought to be getting tired of karma fortunes by now.Barty ought to be getting tired of karma fortunes by now.Barty ought to be getting tired of karma fortunes by now.Barty ought to be getting tired of karma fortunes by now.Barty ought to be getting tired of karma fortunes by now.
 
Posts: 1,921
Karma: 6924109
Join Date: Sep 2010
Device: Kindle 3, PW2, iPad 3, Samsung Glaxy Tab S
try using a preprocess regex

Code:
    preprocess_regexps = [
        (re.compile(r'<p><strong>MORE:</strong>.+?</p>', re.I|re.DOTALL), lambda x:''),
        ]
You can use just re.DOTALL instead of re.I|re.DOTALL if you know the case will always be exactly like that (re.I means ignore case).
Barty is offline   Reply With Quote