#1 |
Connoisseur
Posts: 60
Karma: 46
Join Date: Mar 2017
Device: None
|
"Entity 'nbsp' not defined" error
Hello,
I got this error while importing an HTML file into Sigil: "Entity 'nbsp' not defined". The error was triggered by this line of code: <p>&nbsp;</p>. It's no big deal to fix the code so it will import without the error, but the way Sigil treated the error seemed kind of goofy, so I did a little investigating. I found that a particular "toolbars" string in sigil.ini was the proximate cause of the error.
Will someone kindly look at what I did and try to reproduce it? I think most people will not be able to, but please bear with me.
Environment:
Sigil 0.9.12
Windows 10
Settings:
Create New or Empty Epubs as: Version 3
Mend XHTML Source Code On: Open and Save
Preserve Entities: &nbsp;
1. Run Sigil.
2. In the Code View pane you should see:
Code:
<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE html>

<html xmlns="http://www.w3.org/1999/xhtml" xmlns:epub="http://www.idpf.org/2007/ops">
<head>
<title></title>
</head>
<body>
<p>*</p>
</body>
</html>
So far, so good.
3. Toggle to Book View, click on the Book View pane [IMPORTANT!] and the resulting error is:
This page contains the following errors:
error on line 6 at column 10: Entity 'nbsp' not defined
4. The error message disappears when I toggle back to Code View, and reappears when I repeat step 3 above. I am guessing that this is normal behavior.
5. Mending the code doesn't make the error go away, and I can save to an epub with no warning about nonconforming XHTML.
6. Next I deleted sigil.ini, ran Sigil, and set the prefs as above. No more errors.
7. I can reproduce the error by pasting the toolbars string from the original sigil.ini file into the new sigil.ini file.
Here is the suspect string: toolbars=@ByteArray(\0\0\0\xff\0\0\0\0\xfd\0\0\0\x 3\0\0\0\0\0\0\x1%\0\0\x2\x45\xfc\x2\0\0\0\x2\xfb\0 \0\0\x16\0\x62\0o\0o\0k\0\x62\0r\0o\0w\0s\0\x65\0r \x1\0\0\0W\0\0\x2\x45\0\0\0]\0\xff\xff\xff\xfb\0\0\0\x16\0\x63\0l\0i\0p\0s\0w\ 0i\0n\0\x64\0o\0w\0\0\0\0\0\xff\xff\xff\xff\0\0\0]\0\xff\xff\xff\0\0\0\x1\0\0\x1Q\0\0\x2\x45\xfc\x2\ 0\0\0\x1\xfc\0\0\0W\0\0\x2\x45\0\0\0r\x1\0\0\x14\x fa\0\0\0\x1\x1\0\0\0\x2\xfb\0\0\0\x1e\0t\0\x61\0\x 62\0l\0\x65\0o\0\x66\0\x63\0o\0n\0t\0\x65\0n\0t\0s \x1\0\0\0\0\xff\xff\xff\xff\0\0\0P\0\xff\xff\xff\x fb\0\0\0\x1a\0p\0r\0\x65\0v\0i\0\x65\0w\0w\0i\0n\0 \x64\0o\0w\x1\0\0\x2_\0\0\x1R\0\0\0P\0\xff\xff\xff \0\0\0\x3\0\0\x4\x85\0\0\0\xf3\xfc\x1\0\0\0\x1\xfb \0\0\0*\0v\0\x61\0l\0i\0\x64\0\x61\0t\0i\0o\0n\0r\ 0\x65\0s\0u\0l\0t\0s\0n\0\x61\0m\0\x65\0\0\0\0\0\0 \0\x4\x85\0\0\0P\0\xff\xff\xff\0\0\x1j\0\0\x2\x45\ 0\0\0\x4\0\0\0\x4\0\0\0\b\0\0\0\b\xfc\0\0\0\x3\0\0 \0\x2\0\0\0\a\0\0\0$\0t\0o\0o\0l\0\x42\0\x61\0r\0\ x46\0i\0l\0\x65\0\x41\0\x63\0t\0i\0o\0n\0s\x1\0\0\ 0\0\xff\xff\xff\xff\0\0\0\0\0\0\0\0\0\0\0 \0t\0o\0o\0l\0\x42\0\x61\0r\0T\0\x65\0x\0t\0M\0\x6 1\0n\0i\0p\x1\0\0\0\x84\xff\xff\xff\xff\0\0\0\0\0\ 0\0\0\0\0\0\x18\0t\0o\0o\0l\0\x42\0\x61\0r\0V\0i\0 \x65\0w\0s\x1\0\0\x1P\xff\xff\xff\xff\0\0\0\0\0\0\ 0\0\0\0\0\"\0t\0o\0o\0l\0\x42\0\x61\0r\0I\0n\0s\0\ x65\0r\0t\0i\0o\0n\0s\x1\0\0\x1\x98\xff\xff\xff\xf f\0\0\0\0\0\0\0\0\0\0\0\x16\0t\0o\0o\0l\0\x42\0\x6 1\0r\0\x42\0\x61\0\x63\0k\x1\0\0\x2@\xff\xff\xff\x ff\0\0\0\0\0\0\0\0\0\0\0\x1a\0t\0o\0o\0l\0\x42\0\x 61\0r\0\x44\0o\0n\0\x61\0t\0\x65\x1\0\0\x2j\xff\xf f\xff\xff\0\0\0\0\0\0\0\0\0\0\0\x18\0t\0o\0o\0l\0\ x42\0\x61\0r\0T\0o\0o\0l\0s\x1\0\0\x2\x94\xff\xff\ xff\xff\0\0\0\0\0\0\0\0\0\0\0\x2\0\0\0\a\0\0\0\x1e \0t\0o\0o\0l\0\x42\0\x61\0r\0H\0\x65\0\x61\0\x64\0 i\0n\0g\0s\x1\0\0\0\0\xff\xff\xff\xff\0\0\0\0\0\0\ 0\0\0\0\0$\0t\0o\0o\0l\0\x42\0\x61\0r\0T\0\x65\0x\ 0t\0\x46\0o\0r\0m\0\x61\0t\0s\x1\0\0\0\xde\xff\xff \xff\xff\0\0\0\0\0\0\0\0\0\0\0 
\0t\0o\0o\0l\0\x42\0\x61\0r\0T\0\x65\0x\0t\0\x41\0 l\0i\0g\0n\x1\0\0\x1\x9e\xff\xff\xff\xff\0\0\0\0\0 \0\0\0\0\0\0\x18\0t\0o\0o\0l\0\x42\0\x61\0r\0L\0i\ 0s\0t\0s\x1\0\0\x2\"\xff\xff\xff\xff\0\0\0\0\0\0\0 \0\0\0\0\x1c\0t\0o\0o\0l\0\x42\0\x61\0r\0I\0n\0\x6 4\0\x65\0n\0t\0s\x1\0\0\x2j\xff\xff\xff\xff\0\0\0\ 0\0\0\0\0\0\0\0\"\0t\0o\0o\0l\0\x42\0\x61\0r\0\x43 \0h\0\x61\0n\0g\0\x65\0\x43\0\x61\0s\0\x65\x1\0\0\ x2\xb2\xff\xff\xff\xff\0\0\0\0\0\0\0\0\0\0\0(\0t\0 o\0o\0l\0\x42\0\x61\0r\0T\0\x65\0x\0t\0\x44\0i\0r\ 0\x65\0\x63\0t\0i\0o\0n\0\0\0\0\0\xff\xff\xff\xff\ 0\0\0\0\0\0\0\0\0\0\0\x2\0\0\0\x1\0\0\0\x18\0t\0o\ 0o\0l\0\x42\0\x61\0r\0\x43\0l\0i\0p\0s\0\0\0\0\0\x ff\xff\xff\xff\0\0\0\0\0\0\0\0)
8. I could also eliminate the error by keeping the original sigil.ini file in place and deleting &nbsp; from Preserve Entities. This probably worked because the toolbars string was modified in the process.
Thanks for reading.
- Mark
#2 |
Connoisseur
Posts: 60
Karma: 46
Join Date: Mar 2017
Device: None
|
Yeah, never mind. It's probably related to the UTF-8 issues with 0.9.12.
|
#3 |
Wizard
Posts: 2,306
Karma: 13057279
Join Date: Jul 2012
Device: Kobo Forma, Nook
|
Also,
Code:
&#160;
In EPUB2, you were able to use the more human-readable entity names. But in EPUB3, you have to use the numeric version only (decimal &#160; or hex &#xA0;).
#4 |
Grand Sorcerer
Posts: 28,340
Karma: 203719142
Join Date: Jan 2010
Device: Nexus 7, Kindle Fire HD
|
No, it's not related to the UTF-8 issue. It's related to named entities not being valid in EPUB3. If you're going to be primarily working with EPUB3, then make sure your Preserve Entities list consists only of numeric entities.
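The error from the first post can be reproduced outside Sigil with any strict XML parser. Here is a minimal sketch in Python, assuming the standard-library `xml.etree.ElementTree` parser as a stand-in for whatever parser sits behind Sigil's preview (this is not Sigil's code, and the helper name `parses_as_xml` is made up for illustration):

```python
import xml.etree.ElementTree as ET


def parses_as_xml(fragment: str) -> bool:
    """Return True if the fragment is well-formed XML with no DTD loaded."""
    try:
        ET.fromstring(fragment)
        return True
    except ET.ParseError:
        # An undeclared named entity such as &nbsp; is a well-formedness error.
        return False


if __name__ == "__main__":
    print(parses_as_xml("<p>&nbsp;</p>"))  # named entity, nothing declares it
    print(parses_as_xml("<p>&#160;</p>"))  # numeric character reference
    print(parses_as_xml("<p>&amp;</p>"))   # one of XML's predefined entities
```

Only the first fragment is rejected, which is the same situation as an EPUB3 page containing a named nbsp entity.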
|
#5 |
Bibliophagist
Posts: 44,512
Karma: 167912829
Join Date: Jul 2010
Location: Vancouver
Device: Kobo Sage, Libra Colour, Lenovo M8 FHD, Paperwhite 4, Tolino epos
|
Epub3 content documents are XHTML5 (the XML syntax of HTML5), and XML predefines only 5 character entities. If you use other character entities, they are flagged as errors, though they will work in most cases.
The 5 permitted entities are: &quot; &apos; &amp; &lt; &gt; So say goodbye to the human-readable &nbsp; and say hello to &#160; or &#xa0;. Last edited by DNSB; 03-16-2019 at 08:58 PM.
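For files that already contain named entities, the swap to numeric references can be scripted before import. This is a rough sketch, assuming Python's `html.entities.name2codepoint` table is an acceptable source of entity names; the function name `named_to_numeric` and the regular expression are illustrative, not anything Sigil provides:

```python
import re
from html.entities import name2codepoint

# The five entities XML predefines; these are left exactly as written.
XML_PREDEFINED = {"quot", "apos", "amp", "lt", "gt"}

_ENTITY = re.compile(r"&([A-Za-z][A-Za-z0-9]*);")


def named_to_numeric(text: str) -> str:
    """Replace HTML named entities with decimal character references."""
    def repl(match):
        name = match.group(1)
        if name in XML_PREDEFINED or name not in name2codepoint:
            return match.group(0)  # keep predefined or unrecognised names
        return f"&#{name2codepoint[name]};"

    return _ENTITY.sub(repl, text)


if __name__ == "__main__":
    print(named_to_numeric("<p>caf&eacute;&nbsp;&amp;&nbsp;bar</p>"))
```

Unrecognised names are deliberately left alone so that a typo shows up as a validation error later instead of being silently rewritten.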
#6 |
Sigil Developer
Posts: 8,438
Karma: 5702578
Join Date: Nov 2009
Device: many
|
I know it sounds silly, but HTML and XML allowed people to craft their own entity definitions, and people actually crafted recursive entities that were used to attack websites and browsers. There is actually a lot of code to prevent evilly crafted named entities. The move to just numeric entities has made validating and expanding entities much easier and safer, and helps restrict attack vectors.
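The best-known form of this attack (often called "billion laughs") nests entity definitions so that each one refers several times to the one before it. Rather than feeding such a file to a parser, the arithmetic alone shows the problem. A rough sketch with made-up parameter names (`levels`, `refs_per_level`, `leaf_length`):

```python
def expanded_length(levels: int, refs_per_level: int, leaf_length: int) -> int:
    """Characters produced when the top entity of a nested chain is expanded.

    Level 0 is a plain string of leaf_length characters; every higher level
    is defined as refs_per_level references to the level directly below it.
    """
    return leaf_length * refs_per_level ** levels


if __name__ == "__main__":
    # A few short definitions are enough to demand gigabytes of output.
    for levels in (1, 3, 6, 9):
        print(levels, expanded_length(levels, refs_per_level=10, leaf_length=3))
```

A numeric character reference always expands to exactly one character, so none of this bookkeeping is needed.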
|
#7 | |
A Hairy Wizard
Posts: 3,295
Karma: 20171067
Join Date: Dec 2012
Location: Charleston, SC today
Device: iPhone 15/11/X/6/iPad 1,2,Air & Air Pro/Surface Pro/Kindle PW & Fire
|
Quote:
OBTW, Sigil probably already does this... can you auto-switch from numeric entities to readable names and then back when you are done editing?
|
#8 |
Connoisseur
Posts: 60
Karma: 46
Join Date: Mar 2017
Device: None
|
I understand the issue with named character entities, but I don't see this behavior with 0.9.10 set to preserve and with my particular toolbar string. It only occurs with 0.9.12.
For this test, note that I am not importing a file. The stub epub3 that Sigil creates doesn't have an &nbsp;, so I shouldn't see: error on line 6 at column 10: Entity 'nbsp' not defined. I don't get that message under 0.9.10 with the same settings. Under 0.9.12, this goes away if I wipe the ini file, even if I again set Create New or Empty Epubs as: Version 3 and Preserve Entities: &nbsp;. Mostly I am curious as to why this particular string causes Sigil 0.9.12 to behave differently than 0.9.10.
#9 |
Sigil Developer
Posts: 8,438
Karma: 5702578
Join Date: Nov 2009
Device: many
|
A guess... differences in Mend on Open or Save settings? Sigil does treat the nbsp entity differently from any other entity. In epub2, Sigil's gumbo parser, which Mend uses, will automatically put the nbsp named entity into the page if a real nbsp character is found. In epub3, gumbo/Mend will always put in a numeric nbsp entity if the nbsp character is found.
This is to prevent the nbsp characters from being converted to normal spaces when being processed in a QPlainTextEdit widget, which happens due to a "design choice" in Qt (I would call it a huge bug, but...). That said, I am still not sure how that image you posted comes about. Perhaps your preserve entities are messed up in the sigil.ini file, or ? What happens if you right-click on that page and run Mend? When the Preview auto-updates, does it still have the same error message?
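The version-dependent rule described above can be written down in a few lines. This is only an illustration of the behaviour as stated in this post, not Sigil's or gumbo's actual code; the function name `protect_nbsp` and the integer `epub_version` parameter are assumptions:

```python
NBSP = "\u00a0"  # the literal non-breaking space character


def protect_nbsp(text: str, epub_version: int) -> str:
    """Swap literal non-breaking spaces for an entity so an editor widget
    that normalises whitespace cannot turn them into ordinary spaces."""
    # epub2 may use the named entity; epub3 gets the numeric reference.
    entity = "&nbsp;" if epub_version == 2 else "&#160;"
    return text.replace(NBSP, entity)


if __name__ == "__main__":
    sample = "<p>10\u00a0km</p>"
    print(protect_nbsp(sample, 2))
    print(protect_nbsp(sample, 3))
```

Under this rule an epub3 page should never receive a named nbsp entity from Mend, which is why the error message in this thread is puzzling.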
#10 |
Bibliophagist
Posts: 44,512
Karma: 167912829
Join Date: Jul 2010
Location: Vancouver
Device: Kobo Sage, Libra Colour, Lenovo M8 FHD, Paperwhite 4, Tolino epos
|
What I found interesting is that the &nbsp; is on line 9, but the error in the image refers to line 6, which is <title></title>.
Last edited by DNSB; 03-19-2019 at 12:02 AM.
#11 |
Connoisseur
Posts: 60
Karma: 46
Join Date: Mar 2017
Device: None
|
Same settings for all my tests. The Sigil version and the possibly-bogus toolbars string are the only variables. If I remove and restore the Preserve nbsp setting, the problem goes away, but that operation also changes the string. Mending has no effect as far as I can tell (same message). Maybe it is a 64-bit Windows thing.
Attached is the sigil.ini file associated with my error.
#12 | |
Grand Sorcerer
Posts: 28,340
Karma: 203719142
Join Date: Jan 2010
Device: Nexus 7, Kindle Fire HD
|
Quote:
EDIT: Never mind, I think I see what you're saying now. As far as your ini file goes... there are no Preserve Entities settings saved to it yet, anyway. Just stop using Sigil 0.9.11/12. It's not healthy. Generate completely new ini files with Sigil 0.9.10 and wait for the next release. Continuing to use the problem versions of Sigil just isn't worth the trouble.
Last edited by DiapDealer; 03-19-2019 at 09:13 AM.
|