12-17-2016, 08:29 AM   #6
DiapDealer
Quote:
Originally Posted by slowsmile
Using BeautifulSoup, here's a quick way to remove all garbage proprietary data from an HTML file.
Nice example of deleting attributes from tags with bs4, but why would "id" or "lang" attributes be considered garbage (or proprietary)? Removing "id", for instance, could break a whole bunch of links in files (HTML TOC and NCX included), since those links point at "id" values as fragment targets. It seems a very odd attribute to want to nuke. "name" should probably be converted to "id" as well, to prevent any possible link breakage.
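To illustrate what I mean, here's a rough sketch (not slowsmile's original code) of stripping attributes with bs4 while leaving "id" and "lang" alone and promoting anchor "name" attributes to "id". The UNWANTED_ATTRS list and the clean_html function name are my own placeholders; adjust the list to whatever you actually consider garbage.

```python
from bs4 import BeautifulSoup

# Hypothetical list of attributes to strip; "id" and "lang" are deliberately not in it.
UNWANTED_ATTRS = {"class", "style", "align"}


def clean_html(html):
    """Strip unwanted attributes but keep anything that links may depend on."""
    soup = BeautifulSoup(html, "html.parser")
    for tag in soup.find_all(True):
        # Promote legacy <a name="..."> anchors to id so fragment links
        # (href="#...") keep resolving. Only done when no id exists yet.
        if tag.name == "a" and tag.has_attr("name") and not tag.has_attr("id"):
            tag["id"] = tag["name"]
            del tag["name"]
        # Copy the keys first: we delete from tag.attrs while looping.
        for attr in list(tag.attrs):
            if attr in UNWANTED_ATTRS:
                del tag[attr]
    return str(soup)


sample = (
    '<p class="MsoNormal" lang="en" style="margin:0">'
    '<a name="ch1"></a>Text <a href="#ch1">link</a></p>'
)
cleaned = clean_html(sample)
print(cleaned)
```

The "name" conversion is limited to anchor tags on purpose, since "name" on elements like meta or input means something else entirely.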

Last edited by DiapDealer; 12-17-2016 at 08:34 AM.