MobileRead Forums - "preprocess_regexps = [(re.compile..." bugged?

Starson17 · 11-01-2011, 02:46 PM

Quote:

Originally Posted by scissors

Hi Chaps.

Can someone confirm that

preprocess_regexps = [
(re.compile(r'<head>.*</head>', re.IGNORECASE | re.DOTALL), lambda match: '<head></head>')]

OR

preprocess_regexps = [
(re.compile(r'<head>.*?</head>', re.IGNORECASE | re.DOTALL), lambda match: '<head></head>')]

should totally remove a downloaded page's <head> section?

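Not necessarily. The head tag might have attributes, and the literal <head> in your patterns won't match an opening tag that carries any. I'd have to check the page source to be sure, but this should work: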

Code:

preprocess_regexps = [(re.compile(r'<head.*</head>', re.IGNORECASE | re.DOTALL), lambda match: '<head></head>')]
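To see the difference outside a recipe, here is a rough standalone sketch. The sample HTML, the helper strip_head, and the stricter third pattern are my own assumptions for illustration, not something from the original question or from calibre itself. The third pattern is only there because a bare <head prefix would also match the start of a <header> tag on a page that has no real <head>.

```python
import re

FLAGS = re.IGNORECASE | re.DOTALL

# Patterns taken from the thread.
literal_head = re.compile(r'<head>.*?</head>', FLAGS)  # needs a bare <head>
loose_head = re.compile(r'<head.*</head>', FLAGS)      # tolerates attributes

# Hypothetical stricter variant (not from the thread): \b stops <header> from
# matching, [^>]* skips any attributes, .*? stops at the first </head>.
strict_head = re.compile(r'<head\b[^>]*>.*?</head>', FLAGS)

# Made-up sample page whose head tag carries an attribute.
sample = (
    '<html><head profile="http://example.com/p">'
    '<title>t</title></head>'
    '<body><header>Banner</header><p>Text</p></body></html>'
)


def strip_head(pattern, html):
    """Replace whatever the pattern matches with an empty head element."""
    return pattern.sub(lambda match: '<head></head>', html)


literal_result = strip_head(literal_head, sample)  # no match, page unchanged
loose_result = strip_head(loose_head, sample)      # head contents removed
strict_result = strip_head(strict_head, sample)    # head contents removed
```

If the stricter pattern suits the site better, it should drop into preprocess_regexps in place of the one above, with the same lambda.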