Thread: HTML5 parsing
Old 08-08-2012, 11:03 AM   #3
nickredding
onlinenewsreader.net
Posts: 328
Karma: 10143
Join Date: Dec 2009
Location: Phoenix, AZ & Victoria, BC
Device: Kindle 3, Kindle Fire, IPad3, iPhone4, Playbook, HTC Inspire
Thanks. There is another solution that doesn't involve running calibre from source (which, as far as I can see, is the only way to extend BeautifulSoup): change the HTML5 tags to DIV with preprocess_regexps. _postprocess_html does this conversion anyway, but at that point it's too late, since BeautifulSoup has already "fixed" the HTML.
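
A rough sketch of what I mean, in case it helps. In a recipe, preprocess_regexps is a list of (compiled pattern, replacement function) pairs that are applied to the raw HTML before it is parsed. The tag list and the apply_preprocess_regexps helper below are my own assumptions for illustration (the helper only mimics the substitution step so the snippet runs without calibre); in a real recipe you would just set the preprocess_regexps attribute on your recipe class.

```python
import re

# Assumed set of HTML5 tags to rewrite; extend it to match your feed.
HTML5_TAGS = ('article', 'aside', 'footer', 'header', 'nav', 'section')

# Matches the start of an opening or closing tag, e.g. "<article" or "</nav".
# The \b keeps "<header" from also matching a longer name such as "<headers".
_TAG_RE = re.compile(r'<(/?)(?:%s)\b' % '|'.join(HTML5_TAGS), re.IGNORECASE)

# Same shape as a recipe's preprocess_regexps attribute:
# a list of (compiled pattern, function taking the match object) pairs.
preprocess_regexps = [
    (_TAG_RE, lambda m: '<' + m.group(1) + 'div'),
]


def apply_preprocess_regexps(html, rules=preprocess_regexps):
    """Hypothetical stand-in for the substitution pass run on the raw HTML."""
    for pattern, func in rules:
        html = pattern.sub(func, html)
    return html


if __name__ == '__main__':
    sample = '<article class="x"><header>Hi</header></article>'
    print(apply_preprocess_regexps(sample))
```

Attributes are left alone, so class and id on the original tags carry over to the DIVs and any keep_only_tags / remove_tags rules can still match on them.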