07-19-2014, 07:35 AM   #725
JSWolf
Resident Curmudgeon
Quote:
Originally Posted by Rev. Bob
Now that I've had a chance to pick up davidfor's mods and address the bugs, here's a new version. This should properly handle Unicode and nested empty elements, as well as fixing the bugs. (Most pernicious bug: <div>blah<div class="x"/></div> would become <div>blah<div class="x"//> instead of properly remaining untouched!)

Stripspans also has a new tweak: it de-indents the <head> element and all the usual elements inside it, as well as the <body> element (but not its contents). That's purely a stylistic preference on my part, and I should note that it does not de-indent the contents of <style> elements - just the <style> and </style> lines themselves. Stripspans also keeps going over each document until it comes back unchanged, so there can be any number of contentless elements nested inside each other, and stripspans will catch them all. I tested this with a big string of nested bold and italic elements, at least a dozen levels deep, and stripspans removed 'em all.

I've said it before, but hopefully this will be the last time: I've attached the latest version, I think I've got the bugs worked out, and if nobody else can find one - let's say, by July 31, just before I have surgery - then I'll hand it off to kiwidude on August 1. That's about two weeks for testing; feedback and beta reports are greatly appreciated.

And yes, although the filename says "for qt5", it's also good for earlier versions.
I think I've found a minor bug. The plugin's log reports "Stripped spans in:" for every file, even when a file has no spans to strip. It should only log a file when something was actually removed from it.

Code:
Modify ePubs
Logfile for book ID 931 (The Hundred Thousand Kingdoms / N. K. Jemisin)
931
  Modifying:  C:\Users\Jon\AppData\Local\Temp\calibre_nivdok\itcjvh_modify_epub\931.epub
Parsing xml file: OEBPS/HundredThousan_opf.opf
Parsing xml file: OEBPS/HundredThousan.ncx
	Looking for files to remove: [u'iTunesMetadata.plist', u'iTunesArtwork']
	Looking for files to remove: [u'META-INF/calibre_bookmarks.txt']
	Looking for files to remove: [u'.DS_Store', u'thumbs.db']
	Looking for unused images
	Looking for broken links in the NCX
	Looking for Adobe xpgt files and links to remove
	Looking for Adobe DRM meta tags to remove
	  Removed meta tag from: OEBPS/HundredThousan_chap-24.html
	  Removed meta tag from: OEBPS/HundredThousan_teas-1.html
	  Removed meta tag from: OEBPS/HundredThousan_chap-3.html
	  Removed meta tag from: OEBPS/HundredThousan_chap-20.html
	  Removed meta tag from: OEBPS/HundredThousan_chap-13.html
	  Removed meta tag from: OEBPS/HundredThousan_chap-11.html
	  Removed meta tag from: OEBPS/HundredThousan_chap-19.html
	  Removed meta tag from: OEBPS/HundredThousan_chap-15.html
	  Removed meta tag from: OEBPS/HundredThousan_chap-22.html
	  Removed meta tag from: OEBPS/HundredThousan_chap-18.html
	  Removed meta tag from: OEBPS/HundredThousan_auth-1.html
	  Removed meta tag from: OEBPS/HundredThousan_chap-12.html
	  Removed meta tag from: OEBPS/HundredThousan_chap-10.html
	  Removed meta tag from: OEBPS/HundredThousan_chap-4.html
	  Removed meta tag from: OEBPS/HundredThousan_adca-1.html
	  Removed meta tag from: OEBPS/HundredThousan_appe-2.html
	  Removed meta tag from: OEBPS/HundredThousan_chap-5.html
	  Removed meta tag from: OEBPS/HundredThousan_chap-7.html
	  Removed meta tag from: OEBPS/HundredThousan_blur-1.html
	  Removed meta tag from: OEBPS/HundredThousan_chap-2.html
	  Removed meta tag from: OEBPS/HundredThousan_copy.html
	  Removed meta tag from: OEBPS/HundredThousan_chap-23.html
	  Removed meta tag from: OEBPS/HundredThousan_chap-21.html
	  Removed meta tag from: OEBPS/HundredThousan_chap-16.html
	  Removed meta tag from: OEBPS/HundredThousan_chap-1.html
	  Removed meta tag from: OEBPS/HundredThousan_ackn-1.html
	  Removed meta tag from: OEBPS/HundredThousan_chap-29.html
	  Removed meta tag from: OEBPS/HundredThousan_chap-9.html
	  Removed meta tag from: OEBPS/HundredThousan_toc.html
	  Removed meta tag from: OEBPS/HundredThousan_appe-4.html
	  Removed meta tag from: OEBPS/HundredThousan_chap-28.html
	  Removed meta tag from: OEBPS/HundredThousan_chap-27.html
	  Removed meta tag from: OEBPS/HundredThousan_chap-25.html
	  Removed meta tag from: OEBPS/HundredThousan_chap-8.html
	  Removed meta tag from: OEBPS/HundredThousan_foot-1.html
	  Removed meta tag from: OEBPS/HundredThousan_chap-26.html
	  Removed meta tag from: OEBPS/HundredThousan_chap-14.html
	  Removed meta tag from: OEBPS/HundredThousan_appe-1.html
	  Removed meta tag from: OEBPS/HundredThousan_appe-3.html
	  Removed meta tag from: OEBPS/HundredThousan_chap-6.html
	  Removed meta tag from: OEBPS/HundredThousan_afte-1.html
	  Removed meta tag from: OEBPS/HundredThousan_chap-17.html
	  Removed meta tag from: OEBPS/cover.xml
	Looking for all jackets
	Looking for inline javascript blocks to remove
	Looking for .js files to remove
	Stripping spans
	  Stripped spans in: OEBPS/HundredThousan_chap-24.html
	  Stripped spans in: OEBPS/HundredThousan_teas-1.html
	  Stripped spans in: OEBPS/HundredThousan_chap-3.html
	  Stripped spans in: OEBPS/HundredThousan_chap-20.html
	  Stripped spans in: OEBPS/HundredThousan_chap-13.html
	  Stripped spans in: OEBPS/HundredThousan_chap-11.html
	  Stripped spans in: OEBPS/HundredThousan_chap-19.html
	  Stripped spans in: OEBPS/HundredThousan_chap-15.html
	  Stripped spans in: OEBPS/HundredThousan_chap-22.html
	  Stripped spans in: OEBPS/HundredThousan_chap-18.html
	  Stripped spans in: OEBPS/HundredThousan_auth-1.html
	  Stripped spans in: OEBPS/HundredThousan_chap-12.html
	  Stripped spans in: OEBPS/HundredThousan_chap-10.html
	  Stripped spans in: OEBPS/HundredThousan_chap-4.html
	  Stripped spans in: OEBPS/HundredThousan_adca-1.html
	  Stripped spans in: OEBPS/HundredThousan_appe-2.html
	  Stripped spans in: OEBPS/HundredThousan_chap-5.html
	  Stripped spans in: OEBPS/HundredThousan_chap-7.html
	  Stripped spans in: OEBPS/HundredThousan_blur-1.html
	  Stripped spans in: OEBPS/HundredThousan_chap-2.html
	  Stripped spans in: OEBPS/HundredThousan_copy.html
	  Stripped spans in: OEBPS/HundredThousan_chap-23.html
	  Stripped spans in: OEBPS/HundredThousan_chap-21.html
	  Stripped spans in: OEBPS/HundredThousan_chap-16.html
	  Stripped spans in: OEBPS/HundredThousan_chap-1.html
	  Stripped spans in: OEBPS/HundredThousan_ackn-1.html
	  Stripped spans in: OEBPS/HundredThousan_chap-29.html
	  Stripped spans in: OEBPS/HundredThousan_chap-9.html
	  Stripped spans in: OEBPS/HundredThousan_toc.html
	  Stripped spans in: OEBPS/HundredThousan_appe-4.html
	  Stripped spans in: OEBPS/HundredThousan_chap-28.html
	  Stripped spans in: OEBPS/HundredThousan_chap-27.html
	  Stripped spans in: OEBPS/HundredThousan_chap-25.html
	  Stripped spans in: OEBPS/HundredThousan_chap-8.html
	  Stripped spans in: OEBPS/HundredThousan_foot-1.html
	  Stripped spans in: OEBPS/HundredThousan_chap-26.html
	  Stripped spans in: OEBPS/HundredThousan_chap-14.html
	  Stripped spans in: OEBPS/HundredThousan_appe-1.html
	  Stripped spans in: OEBPS/HundredThousan_appe-3.html
	  Stripped spans in: OEBPS/HundredThousan_chap-6.html
	  Stripped spans in: OEBPS/HundredThousan_afte-1.html
	  Stripped spans in: OEBPS/HundredThousan_chap-17.html
	  Stripped spans in: OEBPS/cover.xml
ePub updated in 2.03 seconds
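
For what it's worth, here is a rough sketch of the behaviour I'd expect: keep stripping until the markup comes back unchanged, and only log a file if its contents actually differ afterwards. This is not the plugin's real code. The names EMPTY_INLINE, strip_empty_elements and process_files are made up for illustration, and the regex only covers a handful of inline tags.

```python
import re

# Hypothetical pattern (not taken from the plugin): an inline element with
# nothing between its tags, e.g. <span></span> or <b class="x"></b>.
# The (?<!/) lookbehind leaves self-closing tags like <span/> alone.
EMPTY_INLINE = re.compile(r'<(span|b|i|em|strong)\b[^<>]*?(?<!/)></\1>')


def strip_empty_elements(markup):
    """Remove contentless inline elements, repeating until nothing changes."""
    while True:
        stripped = EMPTY_INLINE.sub('', markup)
        if stripped == markup:
            return stripped
        markup = stripped


def process_files(files, log=print):
    """files maps a file name to its markup.

    Returns only the entries that changed, and logs only those files.
    """
    changed = {}
    for name, markup in files.items():
        result = strip_empty_elements(markup)
        if result != markup:
            changed[name] = result
            log('\t  Stripped spans in: ' + name)
    return changed
```

With something along these lines, a file with no empty elements never shows up under "Stripping spans", so the log only lists files that were really modified.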