Old 10-18-2017, 07:41 PM   #1
slowsmile
Witchman
Posts: 628
Karma: 788808
Join Date: May 2013
Location: Philippines
Device: Android S5
[Plugin] IDErrorCheck

Checks, repairs and reports all id errors in the epub

Requirements
Plugin Type: Edit
MIT Licence(OSI)
Minimum Sigil requirement: v0.9.3 or higher
Python Requirements: Python 3.4+ (Bundled or External)
OS Requirements: Windows, Linux or OSX
*** Tested on Windows 7, 8 & 10 only ***
Current Version: "0.2.2"

Installation
* Select Manage Plugins from the Plugins menu. In the dialog box, select either the Bundled Python or the External Python (Python 3.4+ must be installed on your computer to run this plugin externally).

* Click Add Plugin and select IDErrorCheck_vXXX.zip. This will load and install the plugin into Sigil, which you can then run by selecting Plugins > Edit > IDErrorCheck

Description
This plugin was originally written with the sole intention of properly reporting and, if possible, fixing Epubcheck's infamous "colon" id error problems. This plugin now also does the following:

* Converts all "name" attributes to "id" attributes in the html files.

* Checks and repairs all invalid id attribute values in the epub's html files: illegal spaces, illegal leading digits, and other illegal non-alphanumeric characters that commonly occur within id attribute values. (v0.1.5)

* Also checks and repairs all internal links that contain bad bookmarks associated with the above html id problems. (v0.1.5)

* Checks and repairs all book uuid values in the toc.ncx and content.opf. If an illegal book uuid value is found, a new unique uuid is generated automatically to replace it. (v0.1.5)

* Checks and repairs all navPoint id values in the toc.ncx. (v0.1.5)

* Checks and logs all id errors occurring in the content.opf manifest or spine without fixing them.

* Will properly check, flag and identify Epubcheck's "colon" id errors and fix these errors.

* At the end of the plugin run, an error dialog will display a simple error list showing all relevant information about each id error including associated file, line number, reason and bad id.
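
The id rules in the list above can be sketched in a few lines of Python. This is only an illustration, not the plugin's actual code: check_id and repair_id are hypothetical names, the pattern covers only the ASCII subset of XML names, and replacing other illegal characters with an underscore is an assumption (later posts in this thread only say that spaces become underscores and that a leading digit gets an 'x' prepended).

```python
import re

# ASCII-only approximation of an XML name without colons: a letter or
# underscore first, then letters, digits, '.', '-' or '_'.
VALID_ID = re.compile(r'[A-Za-z_][A-Za-z0-9._-]*\Z')


def check_id(value):
    """Return the reasons an id value is invalid (empty list if it is valid)."""
    if VALID_ID.match(value):
        return []
    reasons = []
    if not value:
        reasons.append('empty id')
    elif value[0] in '0123456789':
        reasons.append('starts with a digit')
    elif not re.match(r'[A-Za-z_]', value):
        reasons.append('illegal first character')
    if re.search(r'\s', value):
        reasons.append('contains whitespace')
    if re.search(r'[^A-Za-z0-9._\s-]', value):
        reasons.append('contains illegal characters')
    return reasons


def repair_id(value):
    """Return a repaired id value (hypothetical repair policy, see lead-in)."""
    # Whitespace runs become a single underscore.
    fixed = re.sub(r'\s+', '_', value.strip())
    # Assumption: any other illegal character is also replaced by '_'.
    fixed = re.sub(r'[^A-Za-z0-9._-]', '_', fixed)
    # Prepend (not substitute) an 'x' when the first character is illegal.
    if not re.match(r'[A-Za-z_]', fixed):
        fixed = 'x' + fixed
    return fixed
```

Any internal link pointing at a repaired id needs the same repair applied to its fragment, otherwise the link breaks.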

Caveat
Don't use the "Mend and prettify..." Sigil feature directly after using this plugin. Doing so will change and increase the number of lines in the html files so that any reported error line numbers generated by the plugin automatically become inaccurate and void.

Plugin Run
First load your epub into Sigil and then just run the plugin. If you only want to know which errors have not been fixed, run the plugin twice. The first run's display log shows all errors, both fixed and not fixed. The second run shows only what has not been fixed.

Update: This plugin can now process epubs that contain svg images without giving svg errors in Epubcheck.

Change Log:

v0.2.2-- Fixed a problem with the NCX ID check.
v0.2.1-- Fixed a bug where the toc.ncx file name was hard coded in the plugin causing problems. Thanks to Thasaidon.
v0.2.0-- Fixed a bug and removed some unnecessary code in the checkOPFID() function. Thanks to Lucsart.
v0.1.9-- Fixed a bug where html "name" attributes were not initially being converted to "id" attributes before the id error checks. Thanks to Thasaidon.
v0.1.8-- Fixed a problem causing svg formatting errors in Epubcheck. SVG images can now be used in epubs without problems when using this plugin.
v0.1.7-- Now removes the 'name="calibre:cover"' line in the cover file meta tags which was causing Epubcheck problems. Thanks to Becky.
-- Plugin now does not check or change any "name" attributes or their values in the meta tags of all xhtml files. Thanks to DiapDealer.
v0.1.6-- The plugin now prepends an 'x' for all illegal numeric first char problems in ids (i.e. as it was before the last change). Thanks to Becky.
v0.1.5-- Plugin now repairs all illegal non-alphanum characters within ids and href ids in the xhtml files and toc.ncx only.
v0.1.4-- Plugin now checks ids in all tags in the xhtml files.
-- Plugin now removes problematic and superfluous ids from navPoint hrefs.
-- Plugin now removes problematic and superfluous ids from guide hrefs.
-- Thanks to Becky for identifying these problems.
v0.1.3-- Changed epub error message from "Invalid Epub" to "Epub contains no data". Thanks to Doitsu.
v0.1.2-- Changed handling of illegal first char digit id errors. These errors are now fixed by prepending (not substituting) an 'x' char to the id value string. Thanks to AlanHK & DiapDealer.
v0.1.1-- Fixed plugin exit problem. Thanks to AlanHK.
-- Tentative fix for Linux OS identification problem (untested). Thanks to Doitsu.
v0.1.0-- Initial release
Attached Thumbnails: IDErrorCheck_Log.JPG (35.4 KB)
Attached Files: IDErrorCheck_v022.zip (59.1 KB)

Last edited by slowsmile; 03-19-2022 at 02:41 AM.
Old 10-19-2017, 04:54 AM   #2
AlanHK
Guru
Posts: 668
Karma: 929286
Join Date: Apr 2014
Device: PW-3, iPad, Android phone
Is this plugin's functionality now all included in your CustomCleanerPlus plugin?

A note: you seem to change IDs beginning with a digit by replacing that digit with an x.
Which will probably be fine, but could create duplicate IDs, e.g.:

id="1" id="2"
both become id="x"

I manually corrected IDs by prepending X. There must be a limit to the length of an ID string, so I guess you should check if adding a character would push it over that if you were really being careful.
Or just forget the original ID and regen them all.
Old 10-19-2017, 07:26 AM   #3
slowsmile
@AlanHK...

Quote:
Is this plugin's functionality now all included in your CustomCleanerPlus plugin?
No, this code hasn't been added to the CustomCleanerPlus plugin (CCP), because CCP is a cleaner for html files and epubs, which has nothing really to do with checking or fixing ids.

The just-released IDErrorCheck does swap in an 'x' char for first char digit errors only. It also substitutes an underscore for illegal spaces in id values, and it regenerates both book ids in the toc.ncx and content.opf files if they are bad. That's all it fixes. All other illegal id values, such as those containing illegal non-alphanumeric chars, are just reported. ID attribute errors in the content.opf are also only reported, not fixed, because of the complex rules and myriad dependencies between ids and hrefs within the content.opf and toc.ncx.
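
For the book id part, a check-and-regenerate step could look roughly like the sketch below. What the plugin actually treats as a "bad" book id is not stated in this thread, so the test used here (a canonical hyphenated UUID, optionally prefixed with urn:uuid:) is an assumption, and both function names are made up for the example.

```python
import uuid

PREFIX = 'urn:uuid:'


def is_valid_book_uuid(value):
    """True if value is a canonical UUID, with or without a urn:uuid: prefix."""
    text = value.strip()
    if text.lower().startswith(PREFIX):
        text = text[len(PREFIX):]
    try:
        # uuid.UUID() is lenient about hyphens and braces, so compare against
        # the canonical string form to insist on the hyphenated layout.
        return str(uuid.UUID(text)) == text.lower()
    except ValueError:
        return False


def ensure_book_uuid(value):
    """Keep a good book id unchanged, otherwise generate a fresh random one."""
    if is_valid_book_uuid(value):
        return value
    return PREFIX + str(uuid.uuid4())
```

The regenerated value would have to be written to both files, since the dtb:uid in the toc.ncx is expected to match the unique identifier in the content.opf.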

Last edited by slowsmile; 10-19-2017 at 08:03 AM.
Old 10-19-2017, 08:19 AM   #4
DiapDealer
Grand Sorcerer
Posts: 27,546
Karma: 193191846
Join Date: Jan 2010
Device: Nexus 7, Kindle Fire HD
I think what he's saying is that replacing any first-digits in an id with an 'x' could possibly result in identical ids in the same html file. Prepending the 'x' (instead of swapping) would at least guarantee that already unique ids would stay that way.
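
To make the difference concrete, here is a small sketch comparing the two repairs. Both function names are invented for the illustration and neither is taken from the plugin.

```python
def swap_first_digit(value):
    """Replace a leading digit with 'x' (the behaviour being questioned)."""
    return 'x' + value[1:] if value[:1].isdigit() else value


def prepend_x(value):
    """Put an 'x' in front of a leading digit, keeping the digit."""
    return 'x' + value if value[:1].isdigit() else value


ids = ['1', '2', '1a', '2a']
swapped = [swap_first_digit(i) for i in ids]
prepended = [prepend_x(i) for i in ids]

# Swapping collapses distinct ids into duplicates; prepending does not.
print(len(set(swapped)), 'unique ids after swapping')
print(len(set(prepended)), 'unique ids after prepending')
```

Prepending keeps the repaired ids distinct from one another. A clash is still possible if the file already contains an id such as "x1" next to "1", so a careful implementation would also check the result against the existing ids.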
Old 10-19-2017, 07:07 PM   #5
slowsmile
@DiapDealer...I'll try and put in the suggested change. This change will only apply to fixing the first char digit errors in the epub.
Old 10-19-2017, 08:36 PM   #6
slowsmile
Plugin Update: The plugin has been updated (v0.1.2):

* Changed handling of illegal first char digit id errors. These errors are now fixed by prepending (not substituting) an 'x' char to the id value string. Thanks to AlanHK & DiapDealer.

Last edited by slowsmile; 10-19-2017 at 08:51 PM.
Old 10-25-2017, 07:07 PM   #7
slowsmile
Could someone please add this new plugin to the Sigil Plugin Index? Thanks in advance.

Last edited by slowsmile; 10-25-2017 at 07:13 PM.
Old 10-27-2017, 01:18 PM   #8
KevinH
Sigil Developer
Posts: 7,633
Karma: 5433388
Join Date: Nov 2009
Device: many
Just added it.
Old 03-13-2018, 09:26 AM   #9
BeckyEbook
Guru
Posts: 690
Karma: 2180740
Join Date: Jan 2017
Location: Poland
Device: Misc
The plugin replaces the id after the hash for illegal first-digit-start errors, but the incorrect IDs themselves are not fixed.

Sample illegal ID:
Code:
<h1 id="123abc">Chapter 1</h1>
Sample link to illegal ID:
Code:
<a href="../Text/start.xhtml#123abc">Chapter 1</a>
The first sample is not corrected.

The second is corrected to:
Code:
<a href="../Text/start.xhtml#x123abc">Chapter 1</a>
Attached Files: test-id.epub (12.5 KB)
Old 03-14-2018, 06:10 AM   #10
slowsmile
@Becky...It's certainly true what you say. But here's what it says in the release notes:

Quote:
* Checks and, if possible, repairs all invalid id attribute values in the epub's html files.

* Also checks and, if possible, repairs internal links that contain bad bookmarks associated with the above html id problems.

* Checks and, if possible, repairs all navPoint id values in the toc.ncx.
The above means that it will not fix every single id problem. I saw no point in fixing all id problems because giving you the line number and the reason for the id fail should really be enough for you to fix the id problem. And the main reason that I wrote this app was because Epubcheck did not describe id problems very well. This plugin was really just an attempt to give proper reasons for any id failure as well as point the user accurately to the problem line in the epub.

If you want to see the problem that Epubcheck has with describing bad ids then you could try running your test epub(with bad ids) through Epubcheck. Then you will see the problem with Epubcheck's strange error messaging, which always seems to involve phantom colons that aren't there.

Last edited by slowsmile; 03-14-2018 at 06:47 AM.
Old 03-14-2018, 06:55 AM   #11
BeckyEbook
Thanks for the clarification.
I also understand the "phantom colon", because in most cases it is an id that starts with a number.

However ... where do I get the "proper reasons for any id failure"?
The IDErrorCheck log contains only records of the changes made (in the example epub file, those are in the toc.xhtml file).

Why does the log have no records about the start.xhtml file and its incorrect IDs?

Information about the changes made is valuable, but the file is still left with incorrect identifiers.

EpubCheck gives even more results, because it not only reports:
Code:
Error while parsing file 'value of attribute "id" is invalid; must be an XML name without colons'.
but also flags the mismatch between references to the id with "x" and the original id (without "x"):
Code:
Fragment identifier is not defined.
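
The second message comes from a link whose fragment no longer matches any id in the target file. A rough way to list such links is sketched below. It is a regex-based illustration rather than a real XHTML parser, undefined_fragments is a hypothetical helper, and the Text/toc.xhtml and Text/start.xhtml paths in the usage lines are assumed from the sample link above.

```python
import posixpath
import re

# Plain id="..." attributes only (not xml:id or data-id), double-quoted.
ID_RE = re.compile(r'(?<![\w:-])id="([^"]*)"')
# href="target#fragment"; the target part may be empty for same-file links.
HREF_RE = re.compile(r'\bhref="([^"#]*)#([^"]+)"')


def undefined_fragments(files):
    """files maps a book path to its markup; returns (source, target, fragment)."""
    ids = {path: set(ID_RE.findall(text)) for path, text in files.items()}
    missing = []
    for path, text in files.items():
        for target, fragment in HREF_RE.findall(text):
            if '://' in target:
                continue  # external link, nothing to check inside the book
            if target:
                # Resolve the link relative to the file that contains it.
                base = posixpath.dirname(path)
                resolved = posixpath.normpath(posixpath.join(base, target))
            else:
                resolved = path
            if fragment not in ids.get(resolved, set()):
                missing.append((path, resolved, fragment))
    return missing


book = {
    'Text/toc.xhtml': '<a href="../Text/start.xhtml#x123abc">Chapter 1</a>',
    'Text/start.xhtml': '<h1 id="123abc">Chapter 1</h1>',
}
for source, target, fragment in undefined_fragments(book):
    print(source, '->', target + '#' + fragment, 'is not defined')
```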
Old 03-14-2018, 07:29 AM   #12
Doitsu
Grand Sorcerer
Posts: 5,583
Karma: 22735033
Join Date: Dec 2010
Device: Kindle PW2
@BeckyEbook: You can avoid this whole issue, if you create epub3 books, because the HTML5 standard allows ids that don't start with a letter.

If that is not an option for you, you can easily identify broken links using the built-in Sigil reports tool (Tools > Reports > Links).

Last edited by Doitsu; 03-14-2018 at 07:55 AM.
Old 03-14-2018, 07:59 AM   #13
BeckyEbook
@Doitsu: This is good information about epub3, but most of the files that go through my hands are still epub2.

The report does not help much in this situation, because it shows the same thing I see after validating with EpubCheck.

The fix is just a simple replacement, which I can add to Saved Searches:

Code:
id="(\d)
to:
Code:
id="x\1
Old 03-14-2018, 08:07 AM   #14
slowsmile
I'm not quite sure what you mean by "start.xhtml". Can you clarify what that file is - i.e. is it the cover file, toc file or a text file?

At the end of its run, the IDErrorCheck plugin should display all the results from the id error check in a final dialog. You also have the option of saving these results to a file if you want. Are you getting this dialog at the end of the plugin run? (See the thumbnail below.)
Attached Thumbnails: IDErrorCheck_Results_Dialog.JPG (29.5 KB)
Old 03-14-2018, 08:26 AM   #15
BeckyEbook
Quote:
Originally Posted by slowsmile View Post
I'm not quite sure what you mean by "start.xhtml". Can you clarify what that file is - i.e. is it the cover file, toc file or a text file?
Start.xhtml is a text file from the sample epub file attached to my first post.

The log contains only the replacements in the toc.xhtml file (after the hashes).