12-14-2011, 12:50 PM   #2
Barty
Try using a preprocess regex:

Code:
    preprocess_regexps = [
        (re.compile(r'<p><strong>MORE:</strong>.+?</p>', re.I|re.DOTALL), lambda x:''),
        ]
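If the markup always uses exactly that capitalization, you can use just re.DOTALL instead of re.I|re.DOTALL. re.I makes the match ignore case, and re.DOTALL lets . match across line breaks, so the pattern still works when the paragraph spans several lines.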
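If you want to check the pattern outside calibre first, something like this rough sketch should do. The sample HTML and the apply_preprocess helper are made up for illustration; the helper only assumes that each (pattern, replacement) pair is applied to the page source in order, which may not be exactly how calibre does it internally.

```python
import re

# Same pattern and replacement as in the recipe snippet.
preprocess_regexps = [
    (re.compile(r'<p><strong>MORE:</strong>.+?</p>', re.I | re.DOTALL), lambda match: ''),
]


def apply_preprocess(html, rules):
    """Hypothetical helper: apply each (pattern, replacement) pair in order."""
    for pattern, replacement in rules:
        html = pattern.sub(replacement, html)
    return html


# Invented sample: mixed-case "More:" and a line break inside the paragraph.
sample = (
    '<p>Story text.</p>\n'
    '<p><strong>More:</strong> <a href="#">Related\nlink</a></p>\n'
    '<p>Closing text.</p>'
)

cleaned = apply_preprocess(sample, preprocess_regexps)
print(cleaned)

# Without re.I the mixed-case "More:" paragraph is left in place.
strict = re.compile(r'<p><strong>MORE:</strong>.+?</p>', re.DOTALL)
print(strict.sub('', sample) == sample)
```

The non-greedy .+? matters here: it stops at the first closing </p> after the MORE: label, so the following paragraphs are left alone.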