Old 10-22-2011, 10:27 AM   #1
calibre/Sigil Developer
[GUI Plugin] Modify ePub

This plugin offers a way to perform certain modifications to your selected ePub files without performing a calibre conversion. This plugin was created a number of months ago and has a history documented in the Development forum on this thread.

Performing an ePub->ePub conversion will enforce a number of changes to your ePub, some of which can be undesirable for some users. Examples are the rewriting of CSS, margin modifications, file splitting in undesired places, changes to directory structure etc.

Instead, this plugin allows a user-specified subset of changes to be performed in isolation without otherwise touching the original ePub's file structure, CSS files etc. Frequently these changes have been performed manually by users, either by using the Tweak ePub feature (time consuming), by editing in Sigil (which introduces changes/side effects of its own), by doing ePub->ePub conversions, or by saving to disk and reimporting into calibre.

Users may also find it useful to install the Quality Check plugin, which offers the ability to identify ePubs in your library which qualify for many of the modifications this plugin can make.

Refer to the Help file accessed from the plugin dialog for full details on each of the modification options and when you might use them.

Main Features:
  • Remove iTunes artifact files
  • Remove Calibre bookmark files
  • Remove OS artifact files such as Thumbs.db
  • Remove unused image files
  • Remove missing file entries from the .opf manifest
  • Add unmanifested files to manifest
  • Remove unmanifested files from ePub
  • Remove non dc: metadata from manifest
  • Flatten TOC hierarchy in NCX file
  • Remove broken link TOC entries in NCX file
  • Remove margins from Adobe .xpgt files
  • Remove Adobe .xpgt files and links
  • Remove Adobe resource DRM meta tags
  • Remove all metadata jackets
  • Remove legacy metadata jackets
  • Add/replace metadata jacket
  • Encode HTML in UTF-8 to fix invalid HTML encodings
  • Remove embedded fonts
  • Modify @page and body margin styles
  • Append extra CSS to each .css file
  • Smarten punctuation
  • Remove inline javascript and .js files
  • Remove html pages containing nothing but broken image links
  • Completely remove an existing cover
  • Insert a new cover or replace an existing one using your desired proportions/svg choice from Preferences->Output Options->EPUB
  • Update metadata
  • Save and restore your preferred settings with a single click
  • Optional script to run from command line
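To give a feel for what the artifact removal options involve, here is a minimal Python sketch that copies an ePub (a zip archive) while leaving out OS and iTunes artifact files. It is not the plugin's code: the function names and the list of artifact names are assumptions made for illustration, and unlike the plugin it does not touch the .opf manifest.

```python
import io
import zipfile

# Assumed artifact names, for illustration only; the plugin's real list may differ.
ARTIFACT_NAMES = {"thumbs.db", ".ds_store", "itunesmetadata.plist", "itunesartwork"}


def is_artifact(name):
    """Return True if the zip member's base name looks like a known artifact."""
    return name.rsplit("/", 1)[-1].lower() in ARTIFACT_NAMES


def strip_artifacts(epub_bytes):
    """Return (new_epub_bytes, removed_names) with artifact members left out."""
    removed = []
    out = io.BytesIO()
    with zipfile.ZipFile(io.BytesIO(epub_bytes)) as src, \
            zipfile.ZipFile(out, "w", zipfile.ZIP_DEFLATED) as dst:
        for info in src.infolist():
            if is_artifact(info.filename):
                removed.append(info.filename)
                continue
            # The ePub 'mimetype' member must stay uncompressed.
            if info.filename == "mimetype":
                method = zipfile.ZIP_STORED
            else:
                method = zipfile.ZIP_DEFLATED
            dst.writestr(info.filename, src.read(info.filename), compress_type=method)
    return out.getvalue(), removed
```

Members are copied in their original order, so a mimetype entry that was first in the archive stays first.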

Special Notes:
  • Requires Calibre 0.8.53 or later

Installation Notes:
  1. Download the attached zip file and install the plugin/restart Calibre/add to context menu as described in the Introduction to plugins thread.

Running from command line:
  • It is possible to run most of the functions of this plugin from the command line using a python script that is bundled inside the zip file. This allows users to use a feature such as smarten punctuation against an ePub without having to add it to the calibre GUI. Note that the command line script still requires that calibre and the Modify ePub plugin are installed as per usual - it just avoids requiring the books to be added to calibre and interactive gui clicks.
  • To make use of the command line script, open the zip file and extract the file along with the readme.txt and follow the instructions within.

Paypal Donations:
  • If you find this or any of my other plugins useful please feel free to show your appreciation. I have spent many hundreds of unpaid hours in their development and support so any encouragement for me to continue is appreciated!

Version History:
Version 1.3.13 - 05 Jul 2015
Added option to disable the confirmation prompt shown each time before updating the epub. Use at your own risk - if you make other changes to the book record at the same time they may get lost.
Fix for Cancel on the progress dialog (submitted by Raúl)

Version 1.3.12 - 02 Oct 2014
Fixed minor bug in "stripkobo" option that missed some Kobo artifacts inside the HEAD element.
Fixed minor spacing bugs in "unpretty" option.
Enhancement to "stripkobo", "stripspans", and "unpretty" options: All three now remove </br> and </hr> tags and always make BR and HR self-closing elements. (This fixes invalid <br> and <hr> markup, if such is present.)
Moved "stripkobo", "stripspans", and "unpretty" into the "Known artifacts" category to balance the dialog box better.
Added some code to make the dialog box scrollable on smaller screens.
Help file: Filled in how one can detect the need to smarten punctuation. (Was previously blank.)
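The BR/HR cleanup described above can be approximated with two regular expressions. This is only a rough sketch of the idea, not the plugin's implementation; the function name and patterns are made up for illustration, and a regex approach like this would also rewrite tags inside comments or CDATA sections.

```python
import re

# Closing tags such as </br> or </HR>, which are invalid and simply dropped.
_CLOSERS = re.compile(r"</\s*(?:br|hr)\s*>", re.IGNORECASE)
# Opening tags such as <br>, <br /> or <hr class="x">, with optional attributes.
_OPENERS = re.compile(r"<\s*(br|hr)\b([^<>]*?)\s*/?\s*>", re.IGNORECASE)


def normalize_void_tags(markup):
    """Remove </br> and </hr>, then make every BR and HR self-closing."""
    markup = _CLOSERS.sub("", markup)
    return _OPENERS.sub(r"<\1\2/>", markup)
```

For example, normalize_void_tags("a<br>b</br>") returns "a<br/>b".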

Version 1.3.11 - 13 Aug 2014
Add a "stripspans" option to allow removal of attributeless <span> elements from markup, as well as normalizing empty <x></x> elements to the <x/> form.
Add a "stripkobo" option to allow removal of the Kobo-specific code from kepub books, transforming them into standard EPUB books. This does NOT remove Kobo's DRM.
Note: Both of the above will also completely remove A, B, I, U, BIG, SMALL, EM, SPAN, and STRONG elements from the markup when those elements have neither attributes nor content.
Add an "unpretty" option to de-indent and otherwise reformat HTML elements in markup. This should have no effect on the rendered content; it only cleans the source code up a bit.
Fix for "Remove Adobe resource DRM meta tags" option to remove leading spaces and/or newlines, so these meta tags are completely removed instead of leaving blank lines.

Version 1.3.10 - 28 Jul 2014
Support for upcoming calibre 2.0

Version 1.3.9 - 01 Sep 2013
Fix for users who do not have any Extra CSS in their defaults trying to use the Append Extra CSS option.

Version 1.3.8 - 30 Aug 2013
Add an "Append extra CSS" option to allow appending any css style information from Preferences->Common Options->Look & Feel->Extra CSS to each .css file in the ePub.
Respect the tweak "save_original_format_when_polishing" if set to make a .ORIGINAL_EPUB copy of the book before making modifications if no such copy exists.
After running Modify ePub ensure the book details panel is updated in case an ORIGINAL_EPUB was added
Fix for encrypted font ePubs being treated as DRM protected preventing Font removal

Version 1.3.7 - 15 Feb 2013
Fix for dependency on calibre code removed in 0.9.19

Version 1.3.6 - 09 Dec 2012
Fix for "Rewrite CSS margins" to ensure it only processes manifest xhtml files when replacing inline styles.

Version 1.3.5 - 22 Nov 2012
Add a separate script to allow Modify ePub to be run from the command line. Unzip it and refer to the readme.txt/script for help on how to use it.
Change to ensure that, when running via the command line, the plugin can still run if there is no opf file.

Version 1.3.4 - 16 Nov 2012
Workaround for calibre "bug" to ensure that if user has both remove javascript and smarten punctuation checked, that remove javascript runs first which ensures smarten punctuation will actually work correctly for quotes.

Version 1.3.3 - 08 Nov 2012
Fix the fix (for when Update metadata is "not" selected... sigh...)

Version 1.3.2 - 08 Nov 2012
Fix regression from last release where only selecting the "Update metadata" option would not apply changes.

Version 1.3.1 - 06 Nov 2012
Ensure that the "Remove non dc: metadata" option will always run after "Update metadata" if both are selected.
Reorganise some of the layout and groups.

Version 1.3.0 - 04 Nov 2012
Add an "Encode HTML in UTF-8" option to strip charset meta tags and re-encode in UTF-8 for books that do not display correctly in the calibre viewer.
Change the UI appearance to look more balanced.

Version 1.2.10 - 31 Aug 2012
Rewrite the playOrder to make sure it is an incremental sequence after actions that delete from the TOC.
Prevent indenting from mucking up self-closing tags in NCX.
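As an illustration of the playOrder rewrite, the following sketch renumbers the navPoints of a parsed NCX in document order. It uses the standard library's ElementTree rather than whatever the plugin uses internally, and the function name is invented for this example.

```python
import xml.etree.ElementTree as ET

NCX_NS = "http://www.daisy.org/z3986/2005/ncx/"


def renumber_play_order(ncx_root):
    """Assign playOrder 1, 2, 3... to navPoints in document order; return the count."""
    count = 0
    # iter() walks the tree depth-first, i.e. in the order entries appear in the file.
    for nav_point in ncx_root.iter("{%s}navPoint" % NCX_NS):
        count += 1
        nav_point.set("playOrder", str(count))
    return count
```

Running this after deleting TOC entries closes any gaps left in the sequence.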

Version 1.2.9 - 04 Jul 2012
Alter the "Proceed" message text to hopefully make it clearer to new users.
Fix "Rewrite CSS margins" bug where if default margins are set to zero and an epub has margins specified it would error
Fix "Rewrite CSS margins" bug where if default margins are set to zero it should not add an @page directive
Change "Rewrite CSS margins" so that if default margins are zero it writes out margin attributes with a value of zero, rather than removing them
Change "Rewrite CSS margins" so that if default margins are negative then it omits the margin attribute from the style
Enhance "Rewrite CSS margins" so that if CSS file has no content it is deleted from the epub
Rename "Rewrite CSS margins" to "Modify @page and body style margins"
Bug fix for "Remove unused images" not detecting svg images in an svg section containing sibling tags
Fix for "Remove Adobe xpgt links" so that it includes removal of links using the @import format.
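The zero and negative margin rules from this release can be summarised in a few lines of Python. This is a sketch of the rule as described in the notes above, not the plugin's code; the function name, the dictionary input and the use of pt as the unit are assumptions.

```python
SIDES = ("top", "right", "bottom", "left")


def body_margin_declarations(margins):
    """Build CSS margin declarations from default margins (assumed to be in pt).

    A value of zero is written out explicitly; a negative value means
    "leave this margin out of the style altogether".
    """
    declarations = []
    for side in SIDES:
        value = margins.get(side)
        if value is None or value < 0:
            continue
        # CSS property names use '-' as the separator, e.g. margin-top.
        declarations.append("margin-%s: %gpt" % (side, value))
    return declarations
```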

Version 1.2.7 - 29 Jun 2012
When inserting covers, if guide points to a non-existent cover href, make sure the log does not error.
When updating CSS margins, if adding an @page declaration, add it to the start rather than the end of the CSS file to work around a Sigil bug

Version 1.2.6 - 24 Jun 2012
Add buttons to save and restore the current settings, to allow setting up your own defaults that are easy to switch to

Version 1.2.5 - 15 Jun 2012
Bug fix for when using the Add/replace jacket and Insert/replace cover options together if book has no jacket currently

Version 1.2.4 - 05 Jun 2012
Add some non-standard guide types of "coverimagestandard" and "thumbimagestandard" to increase cover replacement coverage
If the guide has incorrect casing of an image href, auto-correct it

Version 1.2.3 - 05 Jun 2012
Further optimise the CSS margins feature to minimise which files get changed

Version 1.2.2 - 05 Jun 2012
Add a "Remove inline javascript and files" option to remove any javascript leftover from html conversions
Fix for CSS margins feature which was not always updating the css file in the epub after resetting margins

Version 1.2.1 - 01 Jun 2012
Fix for remove Adobe xpgt links so it no longer is dependent on link attribute ordering to find them

Version 1.2.0 - 01 Jun 2012
Change to require minimum calibre version 0.8.53 in order to utilise some calibre bug fixes/changes
Change to calibre API for deprecated dialog in 0.8.49 which caused issues that intermittently crashed calibre on Mac OS
Add an "Insert or replace cover" option to attempt to insert or replace a cover without doing a conversion
Add a "Remove cover" option to attempt to completely remove an identified cover from the ePub.
Rewrite "Remove unused image files" and "Remove broken cover images" features to use lxml rather than regex for better accuracy
Add protection for numerous options against trying to apply them to a DRM encrypted book
Better handle ebooks where the ncx file is not in same directory as opf manifest
If user chooses redundant options (e.g. "Remove all jackets" makes "Remove legacy jackets" redundant) do not run the redundant option

Version 1.1.7 - 17 May 2012
Re-release of 1.1.6 to cater for missing file

Version 1.1.6 - 17 May 2012
Bug fix for the last_modified column not being updated if multiple books modified
Add a "Remove broken cover images" option to remove html pages which contain only an image tag to a broken image.
Add a "Remove broken TOC entries in NCX" option to remove ncx entries that point to non-existent html pages
Fix for remove unused images to include svg and bmp files as possible image extensions

Version 1.1.5 - 09 May 2012
Fix for Remove xpgt files and links to remove the xpgt file from the manifest
When performing any Modify action, update the last_modified column in calibre for the book.

Version 1.1.4 - 07 May 2012
Fix for remove unused images to check encrypted and unencrypted names, skip DRM ebooks
When using the Remove xpgt files and links option, remove trailing whitespace after the removed <link>
When no epubs are modified, ensure the log detail is available to review

Version 1.1.3 - 07 May 2012
Fix for remove unused images to better handle image paths with other characters like commas

Version 1.1.2 - 07 May 2012
Fix for remove unused images to better handle image paths with spaces

Version 1.1.1 - 05 May 2012
Fix for remove unused images to url encode image paths with spaces in them, and handle namespaced images
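The path handling problem behind these fixes is that an image stored in the zip as "images/my pic.jpg" is usually referenced from the html as "images/my%20pic.jpg". A minimal sketch of comparing both forms is shown below; the helper names are hypothetical and this is not the plugin's actual matching logic.

```python
from urllib.parse import quote, unquote


def href_variants(zip_path):
    """Return the forms under which a file in the zip might be referenced."""
    return {zip_path, quote(zip_path), unquote(zip_path)}


def is_referenced(zip_path, hrefs):
    """True if any variant of zip_path appears in the collected hrefs."""
    return not href_variants(zip_path).isdisjoint(hrefs)
```

An image is only a candidate for removal when is_referenced() is False for every html and css file checked.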

Version 1.1.0 - 05 May 2012
Move the "Remove margins from Adobe .xpgt files" into a new Adobe section on the UI
Add a "Remove Adobe .xpgt files and links" option for complete clean xpgt file removal
Add a "Remove Adobe resource DRM meta tags" option for stripping DRM <meta> resource identifiers from xhtml content.
Extend "Remove embedded fonts" to also remove @font-face declarations from the CSS and html files
Add a "Remove unused image files" option to remove orphaned images not referenced from the html content to save space.
Add a "Flatten TOC hierarchy in NCX file" option to move all the navPoints to a single level if they are nested.
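To illustrate what flattening the TOC means, this sketch moves every nested navPoint up to be a direct child of navMap while keeping the reading order. It uses the standard library's ElementTree and an invented function name; the plugin's own implementation may differ, and whitespace/indentation of the output is ignored here.

```python
import xml.etree.ElementTree as ET

NCX_NS = "http://www.daisy.org/z3986/2005/ncx/"


def flatten_nav_map(nav_map):
    """Make all navPoints direct children of navMap, in document order."""
    tag = "{%s}navPoint" % NCX_NS
    ordered = list(nav_map.iter(tag))
    # ElementTree has no parent pointers, so build a child -> parent lookup first.
    parents = {child: parent for parent in nav_map.iter() for child in parent}
    for nav_point in ordered:
        parents[nav_point].remove(nav_point)
    # Re-attach everything at the top level; non-navPoint children stay in front.
    nav_map.extend(ordered)
```

Because document order is preserved, existing playOrder values remain in ascending sequence after flattening.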

Version 1.0.2 - 12 Feb 2012
Add ability to smarten punctuation of HTML files

Version 1.0.1 - 23 Nov 2011
When updating metadata, ensure that if calibre has no tags any dc:subject elements are removed
Improve the logging output when removing non dc: metadata elements

Version 1.0.0 - 22 Oct 2011
Preparation for deprecation of the db.format_abspath() function in a future Calibre for network backends
Merge in remaining CSS/margin changes from ldolse for initial release
Support keyboard shortcut for opening dialog

Version 0.3.5 - 26 Jun 2011
Fix an issue with css margin rewriting that used property names using '_' instead of '-'

Version 0.3.4 - 21 Jun 2011
Fix issue with some NCX files not parsing correctly causing error with OS artifact removal
Remove dependency on the Calibre epub-fix Container class to allow plugin to develop independently
Incorporate ldolse's rewrite CSS margin code to reset page/body margins

Version 0.3.1 - 12 Jun 2011
No longer look in manifest for NCX file, look for physical file instead to get around media-type variant issues
If the user cancels updating the ePubs, remove the temp directory
Additional mime type for xpgt files as supplied by ldolse

Version 0.3 - 06 Jun 2011
Add ability to remove embedded fonts
Add ability to update the metadata (including cover)
Add an error dialog if the user clicks ok with no options selected
Ensure rebuilding the ePub uses the Calibre zip code as per change to Tweak ePub

Version 0.2.2 - 03 Jun 2011
Treat iTunesArtwork the same as iTunes plist files
Add an option to remove OS artifacts of .DS_Store and thumbs.db files
Ensure that any xml elements inserted in the manifest are "tailed" correctly for indenting
When adding items to manifest, if a .htm* file check for xmlns indicating mimetype of xhtml+xml
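The .htm* check mentioned in the last item can be pictured as follows: look near the top of the file for the XHTML namespace and pick the media-type accordingly. The function name and the 4096 byte cut-off are assumptions for this sketch, not details taken from the plugin.

```python
XHTML_NS = "http://www.w3.org/1999/xhtml"


def guess_html_media_type(data):
    """Pick a manifest media-type for a .htm/.html file from its raw bytes."""
    # Only the start of the file is inspected; the xmlns normally sits on <html>.
    head = data[:4096].decode("utf-8", errors="replace")
    if XHTML_NS in head:
        return "application/xhtml+xml"
    return "text/html"
```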

Version 0.2.1 - 30 May 2011
Ensure Calibre bookmarks and iTunes files are removed from the manifest if present there

Version 0.2 - 30 May 2011
Add option to remove iTunesArtwork files
Add option to remove non dc: metadata elements
Add option to add/update calibre jackets
Rename Select none to Clear all on dialog

Version 0.1 - 26 May 2011
Initial release of Modify ePub plugin

Attached Files
File Type: zip Modify (266.8 KB, 62266 views)

Last edited by kiwidude; 07-05-2015 at 02:50 PM. Reason: v1.3.13 Released