Join Date: Oct 2010
Location: London, UK
Device: Kindle Paperwhite 3G, iPad 3, iPad Air
[GUI Plugin] Quality Check
This plugin is intended to help you identify covers or book metadata that may need editing in your library that the normal calibre search criteria cannot identify, or is more conveniently accessible. It also offers a battery of ePub quality checks to help you identify ePubs that may warrant further attention to remove cruft files, reconvert etc. You can also search across your ePubs for ad hoc criteria to find text or specifically named items using regular expressions.
Amazon Fire users will find the Mobi checks useful to identify books which will not show in the correct folder due to internal metadata issues with the ASIN or cdetype.
For the OCD types, this plugin can be used to find books that need better quality covers based on physical or file size to be sourced to replace those in your library currently.
You may have noticed when editing individual books in the Edit Metadata screen that your book has a red colored background field indicating a potential problem. This plugin allows you to bulk find all such books in your library so that you can resolve any issues such as by using the bulk metadata edit dialog. A separate "Check missing" menu offers a quick way to serch for common missing data without having to remember calibre's search syntax.
Main Features of v1.9.11:
- Find ePub formats that have one, multiple, legacy or no jackets. Legacy jackets are from calibre conversions prior to 0.6.50 which will result in duplicated jackets if you convert again.
- Find ePub formats that are invalid due to a missing container.xml file
- Find ePub formats with invalid namespaces specified in the container or opf manifest
- Find ePub formats that have non dc: namespace elements in the manifest
- Find ePub formats that have files listed in the .opf manifest that are not in the ePub file
- Find ePub formats that have files in the ePub that are not included in the .opf manifest
- Find ePub formats that have CSS files in the ePub manifest that are not used from the html content
- Find ePub formats that have unused image files that can be safely deleted
- Find ePub formats that have iTunes plist or artwork files from viewing the ePub in iTunes
- Find ePub formats that have calibre bookmarks files from the calibre ebook viewer
- Find ePub formats that have OS artifacts like .DS_Store or Thumbs.db
- Find ePub formats that have an NCX TOC that is hierarchical
- Find ePub formats that have an NCX TOC with < 3 entries
- Find ePub formats that have an NCX TOC with broken link entries
- Find ePub formats that have broken <guide> references in the manifest
- Find ePub formats that have html files larger than 260KB requiring splitting
- Find ePub formats that have DRM
- Find ePub formats that have Adobe DRM <meta> tags
- Find ePub formats that have or do not have a cover that is replaceable when exporting from calibre
- Find ePub formats that have or do not have SVG covers created by calibre
- Find ePub formats that have or have not been converted using calibre
- Find ePub formats that have <address> smart tags which corrupt readability
- Find ePub formats that have embedded fonts
- Find ePub formats that have @font-face declarations in CSS or html files
- Find ePub formats that have Adobe .xpgt files with margins specified
- Find ePub formats that have Adobe inline .xpgt links in the html files
- Find ePub formats that have are left or unjustified i.e. no CSS file containing text-align:justify
- Find ePub formats that have book margins different to your calibre defaults
- Find ePub formats that have no book margins defined
- Find ePub formats that have inline margins defined on the content files
- Find ePub formats that have unsmartened punctuation
- Search the contents of ePub formats for ad hoc regular expressions of your own
- Find Mobi/AZW/AZW3 formats that will not show in the documents folder on an Amazon Fire due to cdetype or ASIN missing/incorrect (with a Fix option below).
- Find Mobi/AZW/AZW3 formats that have their Facebook/Twitter sharing disable on an Amazon Fire (with a Fix option below)
- Find Mobi/AZW/AZW3 formats that have publisher imposed limits on the amount of text that can be placed in clipping notes when read on a Kindle.
- Find books that have covers based on a choice of criteria such as file size or dimensions
- Find books with an invalid title sort
- Find books with an invalid author sort
- Find books with an invalid ISBN
- Find books with an invalid pubdate (equal to date timestamp)
- Find books with a duplicate ISBN
- Find books with a duplicate series
- Find books with a series gap
- Find books with a series numbering that does not match the published date
- Find books with more than a specified number of tags
- Find books with comments that have HTML style tags embedded like bold or italic
- Find books with comments that have no HTML tags at all
- Find books with commas in the author (useful if your author is FN LN)
- Find books with no commas in the author (useful if your author is LN, FN)
- Find books with authors that are all upper or all lower case
- Find books with authors that contain non-alphabetic characters like incorrect separators or cruft
- Find books with authors that contain non-ascii characters of accents/diatrics
- Find books with authors that have initials in a different punctuation to your preference
- Find books with titles that have possible series info like hyphens/numerics
- Find books with titles that do not appear to be a valid title case
- Search for books missing metadata, such as titles, authors, isbn, pubdate, publisher, tags, rating, comments, languages, cover or formats.
- Change the search scope to either search your entire library (respecting search restrictions) or only your selected book(s).
- Swap author names for the selected book(s) between FN LN and LN, FN order or vice versa
- Fix author initials for selected books to your preferred naming format
- Rename author names for selected books to replace non-ascii characters
- Find and fix the file sizes stored for a book format (useful if you edit books outside of calibre)
- Find and fix folder/file paths that are missing commas for inconsistencies within a library for books added after calibre 0.8.35
- Cleanup orphaned opf/jpg files/folders resulting from using Save to Disk followed by Remove from device. Do NOT ever run this against a calibre library folder, it is intended for external save to disk/device folders only.
- Fix Mobi/AZW/AZW3 files for a Kindle Fire to insert an ASIN and set the correct cdetype in the internal data to enable sharing and appear in the correct folder
- Ability to add books to an exclusion list for a particular quality check, if you intentionally want to exempt them. You can view and edit your exclusion lists.
- Customise the menus to hide items not of interest
- Requires calibre v0.9.29 or later.
- The default scope of all checks is to your entire library if you have no search restrictions in place, or just to the restricted subset of books if you have applied a restriction. You might find applying a restriction useful if you have a very large library to do cover checks and some of the ePub checks which can be time consuming. Or you can use the "Search scope" submenu to switch to searching selected books only.
- Download the attached zip file and install the plugin/add to context menu or toolbar/restart calibre as described in the Introduction to plugins thread.
- If you find this or any of my other plugins useful please feel free to show your appreciation. I have spent many hundreds of unpaid hours in their development and support so any encouragement for me to continue is appreciated!
Version 1.9.10/11 - 28 Jul 2014
Support for upcoming calibre 2.0
Version 1.9.9 - 05 Jan 2014
Make Series Gaps report case insensitive
Do not include Series with no gaps in the report
When changing configuration preferences, preserve the search scope
Version 1.9.8 - 28 Sep 2013
Fix for the "Check and rename book paths" function with changes to calibre in 1.x.
Version 1.9.7 - 24 Sep 2013
Fix for the Check and repair book sizes function with changes to calibre in 1.x.
Version 1.9.6 - 09 May 2013
Change for correct support of calibre 0.9.29 virtual libraries feature
Version 1.9.5 - 04 Mar 2013
When using the "Fix ASIN for Kindle Fire" feature, check for mobi-asin as a possible identifier prefix
Version 1.9.4 - 15 Feb 2013
Fix for dependency on calibre code removed in 0.9.19
Display how many matches while running, and when cancelling display the matches found at that point rather than aborting completely.
Version 1.9.3 - 17 Jul 2012
Tidy up the help file for some redundant/missing/renamed menu items
For "Check author initials" support more permutations of special cases to include trailing period - i.e. Jr. as well as Jr
In the config screen put a separate option groupbox in for the author initials setting
Add a "Reformat author initials" option to the "Fix" menu to reformat initials to your preference for authors on the selected books
Add a "Rename author to ascii" option to the "Fix" menu to rename author names to remove diatrics and accents
Enhance the "Repeat last action" tooltip to display what that last action was (in status bar)
Version 1.9.2 - 06 Jul 2012
Now requires calibre 0.8.59
Add a "Check authors non alphabetic" option to Check ePub Metadata menu to find authors with invalid separators or other cruft
Add a "Check authors non ascii" option to Check ePub Metadata menu to find authors with accents/diatrics in their names
Add a "Check authors initials" option to Check ePub Metadata menu to find authors with initials that do not match your preferred style.
Add to plugin configuration screen an author initials dropdown for configuring preferred author initials format
Add a "Check smarten punctuation" option to Check ePub Style menu to search for ePubs with " or ' in body
Add a "Show all occurrences" option to "Search ePub" as a slower running alternative to just displaying the first match
Add a "Plain text content" option to "Search ePub" to search html body content for sentences without any tags
In the log for "Search ePub" display the preceding and following 25 characters around the match in the log
Fix for "Search ePub" to not allow user to search if they have no scope checkboxes set
Change the log viewer to a custom implementation that preserves formatting rather than calibre's <pre> approach.
Change to use a new calibre API function for "Check & fix file size" rather than direct database manipulation
Fixes for "Check epub css margins" feature to fix various bugs and updates to match the latest Modify ePub v1.2.8 release
Fix for "Check Adobe inline .xpgt links" to also look for @import style statements
Fix for "Check manifest files missing" to correctly handle nested relative paths
Version 1.9.1 - 23 Jun 2012
Bug fix for check tags count.
Version 1.9.0 - 22 Jun 2012
Now requires calibre 0.8.57
Store configuration in the calibre database rather than a json file, to allow reuse from different computers (not simultaneously!)
Add a support option to the configuration dialog allowing viewing the plugin data stored in the database
Fix bug which intimated it was possible to exclude books from "Check missing" type checks
Rename "Fix MOBI ASIN for Kindle Fire" to "Fix ASIN for Kindle Fire"
Enhance "Fix MOBI ASIN for Kindle Fire" to now handles MOBI, AZW and AZW3 formats, iterating all if multiple for a book
Fix "Fix ASIN for Kindle Fire" so in scenario of not having an ASIN uses calibre's uuid for the book rather than generating a random one
Change all the "Check Mobi" checks to now work with MOBI, AZW or AZW3 formats
Version 1.8.5 - 09 Jun 2012
Fix bug in "Check Adobe inline .xpgt links" due to ordering of attributes dependency
Add a "Check calibre SVG cover" and "Check no calibre SVG cover" options to Check ePub Structure menu
Add a "Zip filenames" scope to the "Search ePub" dialog to allow searching for files of a specific name
Add a "Check Twitter/Facebook disabled" option to Check Mobi menu to look for an equal ASIN in EXTH 113/504
Add a "Fix MOBI ASIN for Kindle Fire" option to the "Fix" menu to set the EXTH 113/504 fields to your asin/amazon identifier, or a random identifier if not present, plus set cdetype to EBOK
Rename the various epub "TOC" checks to include the word "NCX" to reflect that TOC they refer to
Ensure the various NCX TOC checks do not attempt to run on DRM encrypted books
Do not show error message if unable to parse metadata/ncx files due to incorrect encoding
Version 1.8.4 - 20 May 2012
Subtle enhancements to increase performance of a few checks to not scan irrelevant files.
Put an icon next to the Search scope submenu to show what scope is currently selected.
Version 1.8.3 - 17 May 2012
Fix for numerous checks to ensure uppercase file extensions in epub resources are handled correctly
Add a "Check unused CSS files" option to find .css or .xpgt files that are in the manifest but not referenced from html pages
Add a "Check <guide> broken links" option to find entries in the guide section of the manifest file which do not exist
Version 1.8.2 - 16 May 2012
Fix some casing of various menu items for consistency
Fix for "Check unused image files" to include svg and bmp files as a possible image type
Change the handling of exclusions to ensure that ordering of the books is preserved
Add a "Check clipping limit" option to the Check Mobi menu for files that are limited in the % of book that can be clipped on a Kindle
Version 1.8.1 - 14 May 2012
Add "Check replaceable cover" and "Check non-replaceable cover" checks to indicate whether a cover can be replaced when exported/using Modify ePub to update metadata
Remove the "Check inline Calibre cover" and "Check not inline Calibre cover" checks since replaceable cover checks are more useful
Improve the exception messages for invalid epubs to include the reason
Version 1.8.0 - 12 May 2012
Add a "Search ePub" dialog allowing the user to perform a search for any content matching a regular expression
Fix cosmetic issue of marked text incorrect for search results of "Check inline xpgt" links
Add a "Check Mobi" submenu with "Check missing EBOK cdetype" and "Check missing ASIN" options
Add a "Check broken image links" option to the Check ePub Structure menu, to find epubs with html pages that have broken image links
Add a "Check TOC with broken links" option to the Check ePub Structure menu, to find epubs with NCX links to pages that do not exist
Version 1.7.8 - 07 May 2012
Add a "Search scope" submenu to allow searching selected books rather than all books.
Change Check unused image files to look for both encoded and non encoded versions of the image names and ignore DRM ebooks
Version 1.7.7 - 07 May 2012
Yet another tweak to Check unused image files to cope with characters like commas
Version 1.7.6 - 07 May 2012
Properly fix the spaces encoding when using Check unused image files
Version 1.7.5 - 05 May 2012
Ensure a better error is displayed for books where the EPUB format has been deleted outside of calibre
Fix for Check unused image files to look for spaces in the file name, and handle namespaced images
Version 1.7.4 - 04 May 2012
Protect against a blank author field caused by a bug in the Manage Authors dialog in calibre
Split the Check ePub submenu into two submenus for ease of usage and future expansion
Add a "Check unused image files" option to the Check ePub Structure menu, to find books with jpeg/png image files that are not referenced and can be removed to save space
Add a "Check TOC hierarchical" option to the Check ePub Structure menu, to find books with hierarchical TOC that need flattening for certain devices
Add a "Check Adobe DRM meta tag" option to the Check ePub Structure menu, to find books with html files that contain Adobe <meta /> DRM identifiers
Add a "Check Adobe inline .xpgt links" option to the Check ePub Style menu, to find books with html files that <link /> to a .xpgt file
Add a "Check @font-face" option to the Check ePub Style menu, to find books with CSS or html files that contain @font-face declarations
Add more information into the Help file for newcomers explaining some of the options.
Version 1.7.3 - 09 Apr 2012
Add a "Check series pubdate order" menu option, which will report series where the published date is not in order with the series.
Rename "Check titlecase" to "Check title for titlecase"
Add a "Check authors for case" option to look for author names that are all uppercase or lowercase.
Add a "Check missing" submenu, with options to perform searches for books missing title, authors, isbn, pubdate, publisher, tags, rating, comments, languages, cover, formats
Version 1.7.2 - 13 Feb 2012
Add a "Check and rename book paths" menu option to the Fix submenu, for consolidating author paths after commas change made in calibre 0.8.35.
Fix icons broken from 1.7.1 change on new Fix menu
Version 1.7.1 - 04 Feb 2012
Add a "Fix..." menu moving the Check & fix file sizes and Cleanup OPF folders options into.
Add a "Swap author FN LN <-> LN,FN" menu item for working with selected book(s)
Change the "Check authors with commas" check so requires only one of a multi-author book to have commas
Version 1.7.0 - 03 Dec 2011
Move all the metadata based checks under a submenu, and reorder menus
Add a "Repeat last check" menu item to allow simply running the last check again
Add "Exclude from check..." menu item, to allow excluding books from repeatedly showing you want exempt
Add "View exclusions..." menu item, to allow viewing all excluded books per check and deleting if desired.
Fix missing images from menu for CSS based items
Remove configuration of the Author commas/no commas and titles with series menu items
Version 1.6.4 - 25 Nov 2011
When performing a title sort check, take the first ebook language into account if any
When checking for TOC < 3 items, display the title and authors rather than the path
Version 1.6.3 - 22 Oct 2011
Add ePub checks for various types of body/@page margins (Idolse)
Add ePub check for having <address> smart-tags within their content
For the Series gap check, fix the handling of duplicates
Version 1.6.2 - 19 Sep 2011
Tune the check for series with gaps to exclude series with indexes >= 1000 and handle duplicates
Version 1.6.1 - 17 Sep 2011
Add check for series with gaps (checks missing integer values between 1 and max in series)
Version 1.6.0 - 11 Sep 2011
Upgrade to support the centralised keyboard shortcut management in Calibre
Version 1.5.8 - 13 Jun 2011
Ensure search restrictions are respected after change in 1.5.7
Version 1.5.7 - 12 Jun 2011
Change the ePub zero margin check to only look at manifested APNX files
Fix the way books are searched for so if highlighting is turned on correct subset is searched
Add ePub check for css missing "text-align: justify"
Version 1.5.6 - 08 Jun 2011
No longer look in manifest for NCX file, look for physical file instead to get around media-type variant issues
Version 1.5.5 - 08 Jun 2011
Improve the logging for the ePub TOC check to display when no NCX found
Be more flexible when identifying NCX file to allow for incorrect media type
Version 1.5.4 - 05 Jun 2011
Fix the check for missing manifest files to fix for href names containing # in filenames
Add ePub check for iTunesArtwork to the existing iTunes check
Add ePub check for OS artifacts of .DS_Store and Thumbs.db
Add ePub check for non dc elements in opf manifest (from editing in Sigil or Calibre)
Add ePub check for html files larger than 260KB which may not work on some devices
Increase the amount and formatting quality of the logging for more ePub checks
Version 1.5.3 - 25 May 2011
Add ePub check for invalid namespaces
Add a viewable log for the checks to allow user to see details of errors, missing files etc
Add a Help file describing all of the checks and how to solve them
Version 1.5.2 - 17 May 2011
Prevent ePubs with encrypted fonts from showing as being books with DRM
Rename Abort to Cancel on the progress dialog
Version 1.5.1 - 16 May 2011
Quality checks are run with a progress dialog that can be aborted
Change the ePub check for unmanifested files to exclude iTunes plist and Calibre bookmark files
Add ePub check for embedded fonts in an ePub
Add check for pubdate, testing for equality with date timestamp
Add ePub check for DRM
Version 1.5 - 05 May 2011
Move all the ePub checks onto a Check ePub submenu
Add ePub check for legacy jackets only
Add ePub check for missing container.xml files
Add ePub check for files listed in manifest missing from epub
Add ePub check for unmanifested files
Add ePub check for iTunes plist files
Add ePub check for Calibre bookmarks files
Add ePub check for .xpgt margins
Add ePub check for TOC with < 3 entries
Add ePub check for inline Calibre cover
Add ePub check for no inline Calibre cover
Add ePub check for Calibre conversion
Add ePub check for not Calibre converted
Add check for duplicate series index
Improve having jacket and multiple jacket checks to include legacy jackets
Restructure code internally for ease of future expansion
Version 1.4.1 - 13 Apr 2011
Fix check for missing jacket which would incorrectly handle books with two jackets
Add check for multiple jackets
Version 1.4 - 13 Apr 2011
For very small libraries ensure not divide by zero error from displaying status
If an error retrieving the format path for a book, dump to debug output
For the EPUB jackets check, include numbered jacket files from legacy Calibre conversions
Fix bug in check no html comments which had incorrect logic after 1.3 rewrite
Add ability to customise the menu to allowing hiding menu items not of interest
Version 1.3.5 - 11 Apr 2011
Add check duplicate ISBN option
Version 1.3.4 - 10 Apr 2011
Fix error dialog bug when try to cleanup opf files in own Calibre directory.
Version 1.3.3 - 09 Apr 2011
Support skinning of icons by putting them in a plugin name subfolder of local resources/images
Version 1.3.2 - 06 Apr 2011
Fix int is not iterable error when no matches found in check and update file sizes.
Version 1.3.1 - 05 Apr 2011
Fix bug in Check & fix file sizes from rewrite
Add check title case option
Version 1.3 - 03 Apr 2011
Rewritten for new plugin infrastructure in Calibre 0.7.53
Add check for EPUB with/without jacket
Add cleanup opf folders option to cleanup after save to disk/remove actions leaving orphaned files
Version 1.2 - 28 Mar 2011
Add check & fix file format size
Version 1.1 - 20 Mar 2011
Add check for no html in comments
Add check for authors with/without commas
Add check for titles with series
Version 1.0 - 13 Mar 2011
Initial release of Quality Check plugin
Last edited by kovidgoyal; 08-02-2014 at 11:24 AM.
Reason: v1.9.11 Released