10-18-2017, 07:41 PM | #1 |
Witchman
Posts: 628
Karma: 788808
Join Date: May 2013
Location: Philippines
Device: Android S5
|
[Plugin] IDErrorCheck
Checks, repairs and reports all id errors in the epub.

Requirements
Plugin Type: Edit
MIT Licence (OSI)
Minimum Sigil requirement: v0.9.3 or higher
Python Requirements: Python 3.4+ (Bundled or External)
OS Requirements: Windows, Linux or OSX
*** Tested on Windows 7, 8 & 10 only ***
Current Version: "0.2.2"

Installation
* Select Manage Plugins from the Plugins menu. In the dialog box, select either the Bundled Python or the External Python (Python 3.4+ must be installed on your computer to run this plugin externally).
* Click Add Plugin and select IDErrorCheck_vXXX.zip. This will load and install the plugin into Sigil, which you can then run by selecting Plugins > Edit > IDErrorCheck.

Description
This plugin was originally written with the sole intention of properly reporting and, if possible, fixing Epubcheck's infamous "colon" id error problems. It now also does the following:
* Converts all "name" attributes to "id" attributes in the html files.
* Checks and repairs all invalid id attribute values in the epub's html files: illegal spaces, ids that start with a digit, and other illegal non-alphanumeric characters that commonly occur within id attribute values. (v0.1.5)
* Checks and repairs all internal links that contain bad bookmarks associated with the above html id problems. (v0.1.5)
* Checks and repairs all book uuid values in the toc.ncx and content.opf. If an illegal book uuid value is found, another unique uuid is automatically generated to replace it. (v0.1.5)
* Checks and repairs all navPoint id values in the toc.ncx. (v0.1.5)
* Checks and logs all id errors occurring in the content.opf manifest or spine without fixing them.
* Properly checks, flags and identifies Epubcheck's "colon" id errors and fixes them.
* At the end of the plugin run, an error dialog displays a simple error list showing all relevant information about each id error, including the associated file, line number, reason and bad id.

Caveat
Don't use the "Mend and prettify..." Sigil feature directly after using this plugin. Doing so changes and increases the number of lines in the html files, so any error line numbers reported by the plugin become inaccurate and void.

Plugin Run
First load your epub into Sigil and then just run the plugin. If you only want to know which errors have not been fixed, run the plugin twice. On the first run the display log shows errors that have been fixed as well as those that have not. The second run shows only what has not been fixed.

Update: This plugin can now process epubs that contain svg images without giving svg errors in Epubcheck.
Last edited by slowsmile; 03-19-2022 at 02:41 AM. |
10-19-2017, 04:54 AM | #2 |
Guru
Posts: 668
Karma: 929286
Join Date: Apr 2014
Device: PW-3, iPad, Android phone
|
Is this plugin's functionality now all included in your CustomCleanerPlus plugin?
A note: you seem to change IDs beginning with a digit by replacing that digit with an x. That will probably be fine, but it could create duplicate IDs, e.g. id="1" and id="2" both become id="x".
I manually corrected IDs by prepending X. There must be a limit to the length of an ID string, so if you were being really careful you would check whether adding a character pushes it over that limit. Or just forget the original IDs and regenerate them all. |
10-19-2017, 07:26 AM | #3 | |
Witchman
Posts: 628
Karma: 788808
Join Date: May 2013
Location: Philippines
Device: Android S5
|
@AlanHK...
Quote:
The just-released IDErrorCheck swaps in an 'x' char only for ids whose first char is a digit. It also substitutes an underscore for illegal spaces in id values, and it regenerates both book ids in the toc.ncx and content.opf files if they are bad. That's all it fixes. All other illegal id values, such as those containing illegal non-alphanumeric chars, are just reported. ID attribute errors in the content.opf are also not fixed, just reported, because of the complex rules and myriad dependencies between ids and hrefs within the content.opf and toc.ncx.
Last edited by slowsmile; 10-19-2017 at 08:03 AM. |
|
10-19-2017, 08:19 AM | #4 |
Grand Sorcerer
Posts: 27,628
Karma: 194727102
Join Date: Jan 2010
Device: Nexus 7, Kindle Fire HD
|
I think what he's saying is that replacing any first-digits in an id with an 'x' could possibly result in identical ids in the same html file. Prepending the 'x' (instead of swapping) would at least guarantee that already unique ids would stay that way.
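To make the difference concrete, here is a minimal Python sketch of the two strategies. The function names `swap_first_digit` and `prepend_x` are made up for illustration and are not taken from the plugin's source.

```python
DIGITS = "0123456789"


def swap_first_digit(id_value):
    """Replace a leading digit with 'x' (the swapping behaviour)."""
    if id_value and id_value[0] in DIGITS:
        return "x" + id_value[1:]
    return id_value


def prepend_x(id_value):
    """Put an 'x' in front of an id that starts with a digit (the prepending behaviour)."""
    if id_value and id_value[0] in DIGITS:
        return "x" + id_value
    return id_value


ids = ["1", "2", "1a", "2a"]
swapped = [swap_first_digit(i) for i in ids]
prepended = [prepend_x(i) for i in ids]

print(swapped)    # ['x', 'x', 'xa', 'xa']  -> duplicates
print(prepended)  # ['x1', 'x2', 'x1a', 'x2a']  -> still unique
```

Prepending can still clash with an id that is already in the file (for example `1` next to an existing `x1`), so a really careful fix would also check the new value against the ids already present.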
|
10-19-2017, 07:07 PM | #5 |
Witchman
Posts: 628
Karma: 788808
Join Date: May 2013
Location: Philippines
Device: Android S5
|
@DiapDealer...I'll try and put in the suggested change. This change will only apply to fixing the first char digit errors in the epub.
|
10-19-2017, 08:36 PM | #6 |
Witchman
Posts: 628
Karma: 788808
Join Date: May 2013
Location: Philippines
Device: Android S5
|
Plugin Update: The plugin has been updated (v0.1.2):
* Changed handling of illegal first char digit id errors. These errors are now fixed by prepending (not substituting) an 'x' char to the id value string. Thanks to AlanHK & DiapDealer.
Last edited by slowsmile; 10-19-2017 at 08:51 PM. |
10-25-2017, 07:07 PM | #7 |
Witchman
Posts: 628
Karma: 788808
Join Date: May 2013
Location: Philippines
Device: Android S5
|
Could someone please add this new plugin to the Sigil Plugin Index? Thanks in advance.
Last edited by slowsmile; 10-25-2017 at 07:13 PM. |
10-27-2017, 01:18 PM | #8 |
Sigil Developer
Posts: 7,764
Karma: 5446592
Join Date: Nov 2009
Device: many
|
Just added it.
|
03-13-2018, 09:26 AM | #9 |
Guru
Posts: 704
Karma: 2180740
Join Date: Jan 2017
Location: Poland
Device: Misc
|
For illegal first-digit-start errors, the plugin replaces the id after the hash in links, but the incorrect IDs themselves are not fixed.
Sample illegal ID:
Code:
<h1 id="123abc">Chapter 1</h1>
Code:
<a href="../Text/start.xhtml#123abc">Chapter 1</a>
The second is corrected to:
Code:
<a href="../Text/start.xhtml#x123abc">Chapter 1</a>
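For comparison, a fix that keeps the id and the link in step might look roughly like the Python sketch below. It is only an illustration using regular expressions on the two sample lines; the helper names are hypothetical, it is not the plugin's code, and a real fix would have to confirm that each link target was actually renamed.

```python
import re

# id="..." values that start with a digit (ignores data-id, xml:id and similar)
ID_ATTR = re.compile(r'(?<![\w:-])id="(\d[^"]*)"')
# href="...#fragment" where the fragment starts with a digit
HREF_FRAG = re.compile(r'(href="[^"#]*#)(\d[^"]*)"')


def fix_digit_ids(html):
    """Prepend 'x' to every id value that starts with a digit."""
    return ID_ATTR.sub(r'id="x\1"', html)


def fix_digit_fragments(html):
    """Prepend 'x' to every link fragment that starts with a digit."""
    return HREF_FRAG.sub(r'\1x\2"', html)


start = '<h1 id="123abc">Chapter 1</h1>'
toc = '<a href="../Text/start.xhtml#123abc">Chapter 1</a>'

print(fix_digit_ids(start))      # <h1 id="x123abc">Chapter 1</h1>
print(fix_digit_fragments(toc))  # <a href="../Text/start.xhtml#x123abc">Chapter 1</a>
```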
|
03-14-2018, 06:10 AM | #10 | |
Witchman
Posts: 628
Karma: 788808
Join Date: May 2013
Location: Philippines
Device: Android S5
|
@Becky...It's certainly true what you say. But here's what it says in the release notes:
Quote:
If you want to see the problem that Epubcheck has with describing bad ids, try running your test epub (with bad ids) through Epubcheck. You will then see Epubcheck's strange error messaging, which always seems to involve phantom colons that aren't there.
Last edited by slowsmile; 03-14-2018 at 06:47 AM. |
|
03-14-2018, 06:55 AM | #11 |
Guru
Posts: 704
Karma: 2180740
Join Date: Jan 2017
Location: Poland
Device: Misc
|
Thanks for the clarification.
I also understand the "phantom colon", because in most cases it is an id that starts with a number. However...
Where do I get the "proper reasons for any id failure"? The IDErrorCheck log only contains records of the changes made (in the example epub file, that is the toc.xhtml file). Why does the log have no records about the start.xhtml file and its incorrect IDs? Information about the changes made is valuable, but the file is still left with incorrect identifiers.
EpubCheck gives even more results, because it reports not only:
Code:
Error while parsing file 'value of attribute "id" is invalid; must be an XML name without colons'.
but also:
Code:
Fragment identifier is not defined. |
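The "XML name without colons" wording in the Epubcheck message refers to the XML NCName rule: the value must start with a letter or underscore and must not contain a colon, so an id that starts with a digit fails the same rule even though it has no colon in it. A rough Python sketch of a check that names the actual reason is shown here. It is an ASCII-only approximation (the real XML rule also allows many non-ASCII characters), and `id_problem` is a hypothetical helper, not something from the plugin or from Epubcheck.

```python
import re

# ASCII-only approximation of an XML NCName: letter or underscore first,
# then letters, digits, '.', '-' or '_'.
NCNAME_ASCII = re.compile(r"[A-Za-z_][A-Za-z0-9._-]*")


def id_problem(id_value):
    """Return a short reason for an invalid id, or None if it looks valid."""
    if NCNAME_ASCII.fullmatch(id_value):
        return None
    if id_value == "":
        return "empty id"
    if id_value[0] in "0123456789":
        return "starts with a digit"
    if not re.match(r"[A-Za-z_]", id_value):
        return "illegal first character"
    if " " in id_value:
        return "contains a space"
    if ":" in id_value:
        return "contains a colon"
    return "contains an illegal character"


for value in ["chapter1", "123abc", "chapter 1", "ns:chapter", "note#1"]:
    print(repr(value), "->", id_problem(value))
```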
03-14-2018, 07:29 AM | #12 |
Grand Sorcerer
Posts: 5,612
Karma: 23187563
Join Date: Dec 2010
Device: Kindle PW2
|
@BeckyEbook: You can avoid this whole issue, if you create epub3 books, because the HTML5 standard allows ids that don't start with a letter.
If that is not an option for you, you can easily identify broken links using the built-in Sigil reports tool (Tools > Reports > Links). Last edited by Doitsu; 03-14-2018 at 07:55 AM. |
03-14-2018, 07:59 AM | #13 |
Guru
Posts: 704
Karma: 2180740
Join Date: Jan 2017
Location: Poland
Device: Misc
|
@Doitsu: This is good information about epub3, but most of the files that go through my hands are still epub2.
The report is not perfect in this situation, because I see the same after validation in epubcheck. It's just a simple replacement, which I can add to Saved Searches:
Code:
id="(\d)
Code:
id="x\1 |
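Outside Sigil, the same find/replace pair can be tried with Python's `re` module. This is just a small sketch; the sample string and the function name `prefix_digit_ids` are invented for the example.

```python
import re


def prefix_digit_ids(text):
    """Apply the find/replace pair above: put an 'x' in front of a leading digit."""
    return re.sub(r'id="(\d)', r'id="x\1', text)


sample = '<h1 id="123abc">Chapter 1</h1><p id="intro">Text</p>'
print(prefix_digit_ids(sample))
# <h1 id="x123abc">Chapter 1</h1><p id="intro">Text</p>
```

Note that the pattern also matches any other attribute whose name ends in id (for example `data-id="1`), and that it does not touch the links pointing at the renamed ids.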
03-14-2018, 08:07 AM | #14 |
Witchman
Posts: 628
Karma: 788808
Join Date: May 2013
Location: Philippines
Device: Android S5
|
I'm not quite sure what you mean by "start.xhtml". Can you clarify what that file is - i.e. is it the cover file, toc file or a text file?
At the end of its run, the IDErrorCheck plugin should display all the results from the id error check in a final dialog. You also have the option of saving these results to a file if you want. Are you getting this dialog at the end of the plugin run? (see thumbnail below) |
03-14-2018, 08:26 AM | #15 |
Guru
Posts: 704
Karma: 2180740
Join Date: Jan 2017
Location: Poland
Device: Misc
|
|
|