09-03-2011, 07:44 PM   #2
kovidgoyal
creator of calibre
Posts: 45,414
Karma: 27757236
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
Try this:

preprocess_regexps = [(re.compile(r'<!DOCTYPE[^>]+>', re.I), '')]

Note that you can also define preprocess_raw_html() in your recipe to remove the doctype programmatically if you have trouble with regexps.
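A rough sketch of the preprocess_raw_html() route, in case it helps. The class name ExampleRecipe is made up and is a plain class here so the snippet runs on its own; in an actual recipe the method would sit on your BasicNewsRecipe subclass. It assumes the method receives the page source as a string plus the page URL and returns the cleaned source, so check the recipe API documentation for your calibre version.

```python
import re

# Same pattern as the regexp rule above: the doctype declaration up to the first '>'.
DOCTYPE_RE = re.compile(r'<!DOCTYPE[^>]+>', re.I)


class ExampleRecipe:
    # Hypothetical stand-in for a recipe class. In a real recipe this would be
    # class ExampleRecipe(BasicNewsRecipe) with title, feeds, etc. defined.

    def preprocess_raw_html(self, raw_html, url):
        # Assumed signature: raw_html is the downloaded page source as a string,
        # url is the address it came from. The returned string is what gets parsed.
        return DOCTYPE_RE.sub('', raw_html)
```

Since this is ordinary Python, you can also do things a single regexp cannot, such as only stripping the doctype for certain URLs.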