#1 | |
Resident Curmudgeon
Posts: 79,005
Karma: 144284074
Join Date: Nov 2006
Location: Roslindale, Massachusetts
Device: Kobo Libra 2, Kobo Aura H2O, PRS-650, PRS-T1, nook STR, PW3
|
Not Well formed 1.4.3
My system is Windows 10 Home 64-bit. I have the 64-bit version of Sigil 1.4.3 installed along with the epubcheck plugin.
I was loading an ePub 3 eBook and Sigil pops up the message... Quote:
Is this a bug in Sigil? If so, can it be fixed? If it's not a bug, what in the eBook code is incorrect?
Here is a scrambled copy of the eBook. The only changes made are that the embedded fonts and all CSS code referencing them were removed. Otherwise, it's the unchanged scrambled code. I did check it with epubcheck and it passed.
#2 |
Sigil Developer
Posts: 8,443
Karma: 5703082
Join Date: Nov 2009
Device: many
|
No. If you read the error message, it tells you that the file is missing its DOCTYPE (assuming it has html, head and body tags), which is required by the epub spec. It is an open issue on epubcheck to test for and report this. Calibre does not follow this aspect of the spec. Sigil does, and has for years. Until a couple of releases ago, the auto-mending that ran when moving things to Sigil's standard layout always fixed it. Now that we no longer move things to standard locations, that automatic fix is no longer applied.
#3 |
Bibliophagist
Posts: 44,616
Karma: 168431739
Join Date: Jul 2010
Location: Vancouver
Device: Kobo Sage, Libra Colour, Lenovo M8 FHD, Paperwhite 4, Tolino epos
|
Mend and prettify or just Mend will add the missing doctypes. The CSSUndefinedClasses plugin is not happy with running against an epub with those errors so I've been using Mend to fix the issue.
Looking at your scrambled epub, the first block is before Mend and Prettify, the second is after.

Code:

    <?xml version='1.0' encoding='utf-8'?>
    <html xmlns="http://www.w3.org/1999/xhtml" xmlns:epub="http://www.idpf.org/2007/ops" lang="en-us" xml:lang="en-us">
    <head>

Code:

    <?xml version="1.0" encoding="utf-8"?>
    <!DOCTYPE html>
    <html xmlns="http://www.w3.org/1999/xhtml" xmlns:epub="http://www.idpf.org/2007/ops" lang="en-us" xml:lang="en-us">
    <head>

Last edited by DNSB; 01-07-2021 at 10:53 PM.
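For anyone who wants to check files outside Sigil, the before/after difference shown in this post can be approximated with a few lines of Python. This is only a rough textual sketch for EPUB 3 content documents, not how Mend works internally; the function names `has_doctype` and `add_html5_doctype` are made up for illustration.

```python
import re

# Matches an XML declaration at the very start of the file.
XML_DECL = re.compile(r"^\s*<\?xml[^>]*\?>", re.IGNORECASE)
DOCTYPE = re.compile(r"<!DOCTYPE\s", re.IGNORECASE)


def has_doctype(xhtml: str) -> bool:
    """Return True if a DOCTYPE appears before the root <html> element."""
    prolog = xhtml.split("<html", 1)[0]
    return DOCTYPE.search(prolog) is not None


def add_html5_doctype(xhtml: str) -> str:
    """Insert <!DOCTYPE html> after the XML declaration if no DOCTYPE exists.

    Assumption: the file is an EPUB 3 XHTML document, so the short HTML5
    DOCTYPE is the right one. EPUB 2 files need the XHTML 1.1 DOCTYPE instead.
    """
    if has_doctype(xhtml):
        return xhtml
    match = XML_DECL.match(xhtml)
    if match:
        end = match.end()
        return xhtml[:end] + "\n<!DOCTYPE html>" + xhtml[end:]
    return "<!DOCTYPE html>\n" + xhtml
```

Running `add_html5_doctype` on text that already has a DOCTYPE returns it unchanged, so it is safe to apply to every XHTML file in a book.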
#4 |
Fanatic
Posts: 500
Karma: 3498633
Join Date: May 2011
Location: Surrey, UK
Device: Kobo Aura One, Sony PRS 600/650
|
I get that warning message on pretty much every new book I open in Sigil.
I am puzzled as to why Sigil says the book is not well formed, because I can open a book (that has not been opened in Sigil) in Freda, ADE, on my Kobo, on my iPad and (shudder) Calibre, and none of them complain that the book is malformed. In my experience, only Sigil brings up the warning about the missing DOCTYPE tag. Also, if it is so important, why is it that just about every book I come across is missing it? Even new releases appear not to have it, so I can only assume it's not that critical.
#5 |
Grand Sorcerer
Posts: 28,341
Karma: 203719646
Join Date: Jan 2010
Device: Nexus 7, Kindle Fire HD
|
Funny. I get the warning on pretty much none of the epubs I open.
Being able to open a book in an ereading program with no warning has never been an indicator of whether the epub in question was spec-compliant or not. Either start hitting Yes to the warning (and then saving the epub) or get used to seeing the warning. Those are your options.
Last edited by DiapDealer; 01-08-2021 at 06:24 AM.
#6 | |
Fanatic
Posts: 500
Karma: 3498633
Join Date: May 2011
Location: Surrey, UK
Device: Kobo Aura One, Sony PRS 600/650
|
Quote:
#7 |
Sigil Developer
Posts: 8,443
Karma: 5703082
Join Date: Nov 2009
Device: many
|
Sigil is making the most spec-compliant and consistent epub it can. According to the spec, the DOCTYPE is required, and older versions of Sigil quietly fixed this on load because they had to move things to fit Sigil's standard form. Newer versions of Sigil no longer auto-fix the missing DOCTYPE on load (but Mend will properly fix it), so Sigil warns the user and offers to fix it for them.
Those same e-readers will work just fine with the DOCTYPE present. As I said, epubcheck has an open issue to fix this.
BTW, any epub2 that uses named entities (such as nbsp) but is missing the DOCTYPE is technically broken and will not work on most e-readers, because epub2's version of the DOCTYPE is where the named entities are defined. That is why this is important to fix. Calibre is not spec compliant on this issue, but it does replace all named entities with their numeric or character equivalents, which makes omitting the DOCTYPE workable even on epub2, though technically against the rules.
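To illustrate the entity replacement described here, the following is a minimal Python sketch that swaps named entities for numeric character references. It is not Calibre's actual code; the function name `numeric_entities` is hypothetical, and it uses the HTML 4 entity table from the standard library as a stand-in for the set the XHTML 1.1 DTD defines.

```python
import re
from html.entities import name2codepoint

# These five are predefined by XML itself and are valid without any DOCTYPE.
XML_PREDEFINED = {"amp", "lt", "gt", "quot", "apos"}
ENTITY = re.compile(r"&([A-Za-z][A-Za-z0-9]*);")


def numeric_entities(xhtml: str) -> str:
    """Replace named entities such as &nbsp; with numeric references (&#160;).

    Unknown names and the XML-predefined entities are left untouched.
    Caveat: this is a plain text substitution, so it would also rewrite
    matches inside comments or CDATA sections.
    """
    def replace(match: re.Match) -> str:
        name = match.group(1)
        if name in XML_PREDEFINED or name not in name2codepoint:
            return match.group(0)
        return f"&#{name2codepoint[name]};"

    return ENTITY.sub(replace, xhtml)
```

Once every named entity is numeric, the text no longer depends on the DOCTYPE to resolve them, which is why such files still render even though the missing DOCTYPE remains against the spec.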
#8 |
Fanatic
Posts: 500
Karma: 3498633
Join Date: May 2011
Location: Surrey, UK
Device: Kobo Aura One, Sony PRS 600/650
|
Thanks for the explanation, Kevin.
As I said, I was just curious why it seemed that it was only Sigil that picked up on the missing DOCTYPE.
#9 |
just an egg
Posts: 1,791
Karma: 6758980
Join Date: Mar 2015
Device: Kindle, iOS
|
Is there a way to tell what changes will be made if you agree to the changes Mend wants to make? i.e., I often find there are Sigil features I'm not aware of and learn about reading these forums, so just checking if I'm missing something that's already there.
I will also admit that the part of the warning that says "Sigil can automatically fix these files, although this may result in minor data loss in extreme circumstances [emphasis added]" always gives me pause, making me want to know exactly what changes are being proposed.
Thank you
#10 | |
Guru
Posts: 808
Karma: 2416112
Join Date: Jan 2017
Location: Poland
Device: Various
|
Quote:
If you want to see exactly what Sigil changes when it applies these fixes:
1. Open the EPUB file
2. If you see the message, choose [No]
3. Save the checkpoint (Checkpoints > Create Checkpoint for Epub or [🡅] icon)
4. Close the EPUB file (without saving!)
5. Open the same EPUB file again
6. Select [Yes] when you see the message
7. Check what has changed (Checkpoints > Compare Epub against Checkpoint or [±] icon)
#11 |
Sigil Developer
Posts: 8,443
Karma: 5703082
Join Date: Nov 2009
Device: many
|
That message was mostly leftover from the old days when Sigil used HTML Tidy and it would occasionally mess up.
Modern versions of Sigil use the gumbo parser, which auto-repairs following the exact same rules as major browsers like Safari, Edge, Chrome and Firefox. Sigil has git checkpointing built in. So before making any change, simply run Checkpoint so that you can see the diffs of what changed and even revert to an earlier checkpoint if you so desire.
#12 |
Sigil Developer
Posts: 8,443
Karma: 5703082
Join Date: Nov 2009
Device: many
|
FWIW, any version of Sigil in the 0.9.x range long used gumbo Mend to silently fix things like missing doctypes when moving files to old "standard" layouts. Gumbo did that literally for years with no problems. Running gumbo (Mend) is very safe in general.
#13 | |
Grand Sorcerer
Posts: 28,341
Karma: 203719646
Join Date: Jan 2010
Device: Nexus 7, Kindle Fire HD
|
Quote:
#14 |
Resident Curmudgeon
Posts: 79,005
Karma: 144284074
Join Date: Nov 2006
Location: Roslindale, Massachusetts
Device: Kobo Libra 2, Kobo Aura H2O, PRS-650, PRS-T1, nook STR, PW3
|
Windows 10 Home 64-bit and Sigil 1.4.3 64-bit.
I followed the directions just posted. When I click the icon to compare the ePub against the checkpoint, I get "Diff Failed: No checkpoints found". But when I go to Manage Checkpoint Repositories, I see the checkpoint I created. Am I doing anything wrong?
#15 |
Sigil Developer
Posts: 8,443
Karma: 5703082
Join Date: Nov 2009
Device: many
|
The uuid of the opf is used as the checkpoint repo identifier, and one is created if none exists. But if you did not save after creating the checkpoint, then the next time you load that epub yet another new uuid will be created and no match will be found.
So load your epub and do not allow Mend. Create the checkpoint. That will add a uuid dc:identifier automatically. Save that file under a new name (to prevent confusion). Now you can either run Mend right away and then do the compare against the checkpoint, or load the newly saved file, run Mend, and compare that to its checkpoint. This is only an issue for epubs that do not have any uuid dc:identifier to begin with.
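To find out ahead of time whether a book falls into this case, you can inspect its OPF for a uuid identifier. The Python sketch below is one way to do that; it assumes the identifier is written in the common `urn:uuid:` form, which may not be the only form Sigil recognizes, and the function name `uuid_identifiers` is made up for illustration.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"


def uuid_identifiers(opf_xml: str) -> list[str]:
    """Return every dc:identifier value in the OPF that starts with urn:uuid:.

    Assumption: a uuid identifier is stored as 'urn:uuid:...'. An empty
    result suggests the book has no uuid dc:identifier of that form.
    """
    root = ET.fromstring(opf_xml)
    found = []
    for ident in root.iter(f"{{{DC_NS}}}identifier"):
        text = (ident.text or "").strip()
        if text.lower().startswith("urn:uuid:"):
            found.append(text)
    return found
```

If the list comes back empty, the save-under-a-new-name step described above is the one to follow before comparing against a checkpoint.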