The raw HTML contains a lot of invalid comments (IE conditional comments), for example:
Code:
<!--[if lte IE 8]>
<div data-module-id="6" data-module-name="article.app/lib/module/ieWarning" data-module-zone="ie_warning" class="zonedModule">
<div class="ie-flag">
<button class="ie-button"></button>
<div class="ie-warning-wrapper">
<p><span id="warning-label">BROWSER UPDATE</span> To gain access to the full experience, please upgrade your browser: </p>
<ul>
<li><a href="https://www.google.com/intl/en_us/chrome/browser/">Chrome</a> | </li>
<li><a href="http://support.apple.com/downloads/#safari">Safari</a> | </li>
<li><a href="https://www.mozilla.org/en-US/firefox/new/">Firefox</a> | </li>
<li><a href="http://windows.microsoft.com/en-us/internet-explorer/download-ie">Internet Explorer</a></li>
</ul><br>
<p><span id="warning-note">Note: If you are running Internet Explorer 9 and above, make sure it is not in compatibility mode</span></p>
</div>
</div>
</div> <!-- data-module-name="article.app/lib/module/ieWarning" -->
<![endif]-->
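Note the improper nesting of comments: HTML comments cannot nest, so the inner <!-- data-module-name=... --> comment ends the outer conditional comment early and leaves a stray <![endif]--> behind. And then this: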
Code:
<![if ! lte IE 8]>
<span class="image-enlarge">
ENLARGE
</span>
<![endif]>
The following should take care of it:
Code:
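# Needs "import re" at the top of the recipe.
preprocess_regexps = [
    # Drop the IE-only block inside <!--[if lte IE 8]> ... <![endif]-->
    (re.compile(r'<!--\[if lte IE 8\]>.+?<!\[endif\]-->', re.DOTALL), lambda m: ''),
    # Drop the <![if ! lte IE 8]> ... <![endif]> block
    (re.compile(r'<!\[if ! lte IE 8\]>.+?<!\[endif\]>', re.DOTALL), lambda m: ''),
]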
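If you want to check the two patterns outside calibre first, here is a minimal standalone sketch. The helper name strip_ie_conditionals and the sample string are made up for illustration; the loop only approximates how calibre applies each (pattern, function) pair to the downloaded HTML.

```python
import re

# Same two rules as in the recipe snippet above.
preprocess_regexps = [
    (re.compile(r'<!--\[if lte IE 8\]>.+?<!\[endif\]-->', re.DOTALL), lambda m: ''),
    (re.compile(r'<!\[if ! lte IE 8\]>.+?<!\[endif\]>', re.DOTALL), lambda m: ''),
]


def strip_ie_conditionals(html):
    # Hypothetical helper: apply each (pattern, function) rule in order,
    # substituting every match with the function's return value.
    for pattern, func in preprocess_regexps:
        html = pattern.sub(func, html)
    return html


# Made-up sample that mimics the two problem blocks from the page.
sample = (
    '<p>before</p>\n'
    '<!--[if lte IE 8]>\n'
    '<div class="ie-flag">upgrade</div> <!-- inner comment -->\n'
    '<![endif]-->\n'
    '<![if ! lte IE 8]>\n'
    '<span class="image-enlarge">ENLARGE</span>\n'
    '<![endif]>\n'
    '<p>after</p>\n'
)

print(strip_ie_conditionals(sample))
```

Both blocks are removed whole, including the nested inner comment and the ENLARGE span; only the surrounding paragraphs and the blank lines between them remain.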