MobileRead Forums - View Single Post - Bad DOCTYPE declaration causes BS to crash

kovidgoyal · 09-04-2011, 04:01 PM

Sorry, if you mean the index page, as in the page used in parse_index, then no, it doesn't apply. In that case you have to strip the DOCTYPE manually before parsing.

Code:

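import re  # goes at the top of the recipe file

# Inside parse_index: fetch the raw page, strip the DOCTYPE, then parse it
raw = self.index_to_soup(index_url, raw=True)
raw = re.sub(r'(?i)<!DOCTYPE[^>]+>', '', raw)
soup = self.index_to_soup(raw)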
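
If you want to check what that regex does without running a recipe, here is a minimal standalone sketch. The helper name strip_doctype and the sample markup are made up for illustration and are not part of calibre; only the pattern is taken from the snippet above.

```python
import re

# Same pattern as in the post: case-insensitive, matches from "<!DOCTYPE"
# up to and including the first ">" that follows it.
DOCTYPE_RE = re.compile(r'(?i)<!DOCTYPE[^>]+>')


def strip_doctype(raw):
    """Return the markup with any DOCTYPE declaration removed.

    Hypothetical helper for illustration; expects a text string.
    """
    return DOCTYPE_RE.sub('', raw)


# Made-up index page with a bogus, lower-case DOCTYPE declaration.
sample = (
    '<!doctype html PUBLIC "-//W3C//DTD XHTML 1.0 Bad//EN">\n'
    '<html><body><p>Index</p></body></html>'
)

cleaned = strip_doctype(sample)
print(cleaned)
```

One thing to keep in mind: because the pattern stops at the first ">", a DOCTYPE that carries an internal subset (something like `<!DOCTYPE x [<!ENTITY a "b">]>`) would only be partly removed. For the usual one-line declarations it should be fine.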
