Old 03-16-2019, 02:27 PM   #1
mrprobert
Member
mrprobert began at the beginning.
 
Posts: 16
Karma: 10
Join Date: Mar 2017
Device: None
"Entity 'nbsp' not defined" error

Hello,

I got this error while importing an html file into Sigil:

"Entity 'nbsp' not defined".

The error was triggered by this line of code: <p>&nbsp;</p>. It's no big deal to fix the code so it will import without the error, but the way Sigil treated the error seemed kind of goofy, so I did a little investigating.

I found that a particular "toolbars" string in sigil.ini was the proximate cause of the error. Will someone kindly look at what I did and try to reproduce it? I think most people will not, but please bear with me.

Environment:
Sigil 0.9.12
Windows 10

Settings:
Create New or Empty Epubs as: Version 3
Mend XHTML Source Code On: Open and Save
Preserve Entities: &nbsp;

1. Run Sigil.

2. In the Code View pane you should see:

<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE html>

<html xmlns="http://www.w3.org/1999/xhtml" xmlns:epub="http://www.idpf.org/2007/ops">
<head>
<title></title>
</head>
<body>
<p>*</p>
</body>
</html>

So far, so good.

3. Toggle to book view, click on the book view pane [IMPORTANT!] and the resulting error is:

This page contains the following errors:
error on line 6 at column 10: Entity 'nbsp' not defined

4. The error message disappears when I toggle back to Code View, and reappears when I repeat step 3 above. I am guessing that this is normal behavior.

5. Mending the code doesn't make the error go away and I can save to an epub with no warning about nonconforming XHTML.

6. Next I deleted sigil.ini, ran Sigil, and set the prefs as above. No more errors.

7. I can reproduce the error by pasting the toolbars string from the original sigil.ini file into the new sigil.ini file.

Here is the suspect string:

toolbars=@ByteArray(\0\0\0\xff\0\0\0\0\xfd\0\0\0\x 3\0\0\0\0\0\0\x1%\0\0\x2\x45\xfc\x2\0\0\0\x2\xfb\0 \0\0\x16\0\x62\0o\0o\0k\0\x62\0r\0o\0w\0s\0\x65\0r \x1\0\0\0W\0\0\x2\x45\0\0\0]\0\xff\xff\xff\xfb\0\0\0\x16\0\x63\0l\0i\0p\0s\0w\ 0i\0n\0\x64\0o\0w\0\0\0\0\0\xff\xff\xff\xff\0\0\0]\0\xff\xff\xff\0\0\0\x1\0\0\x1Q\0\0\x2\x45\xfc\x2\ 0\0\0\x1\xfc\0\0\0W\0\0\x2\x45\0\0\0r\x1\0\0\x14\x fa\0\0\0\x1\x1\0\0\0\x2\xfb\0\0\0\x1e\0t\0\x61\0\x 62\0l\0\x65\0o\0\x66\0\x63\0o\0n\0t\0\x65\0n\0t\0s \x1\0\0\0\0\xff\xff\xff\xff\0\0\0P\0\xff\xff\xff\x fb\0\0\0\x1a\0p\0r\0\x65\0v\0i\0\x65\0w\0w\0i\0n\0 \x64\0o\0w\x1\0\0\x2_\0\0\x1R\0\0\0P\0\xff\xff\xff \0\0\0\x3\0\0\x4\x85\0\0\0\xf3\xfc\x1\0\0\0\x1\xfb \0\0\0*\0v\0\x61\0l\0i\0\x64\0\x61\0t\0i\0o\0n\0r\ 0\x65\0s\0u\0l\0t\0s\0n\0\x61\0m\0\x65\0\0\0\0\0\0 \0\x4\x85\0\0\0P\0\xff\xff\xff\0\0\x1j\0\0\x2\x45\ 0\0\0\x4\0\0\0\x4\0\0\0\b\0\0\0\b\xfc\0\0\0\x3\0\0 \0\x2\0\0\0\a\0\0\0$\0t\0o\0o\0l\0\x42\0\x61\0r\0\ x46\0i\0l\0\x65\0\x41\0\x63\0t\0i\0o\0n\0s\x1\0\0\ 0\0\xff\xff\xff\xff\0\0\0\0\0\0\0\0\0\0\0 \0t\0o\0o\0l\0\x42\0\x61\0r\0T\0\x65\0x\0t\0M\0\x6 1\0n\0i\0p\x1\0\0\0\x84\xff\xff\xff\xff\0\0\0\0\0\ 0\0\0\0\0\0\x18\0t\0o\0o\0l\0\x42\0\x61\0r\0V\0i\0 \x65\0w\0s\x1\0\0\x1P\xff\xff\xff\xff\0\0\0\0\0\0\ 0\0\0\0\0\"\0t\0o\0o\0l\0\x42\0\x61\0r\0I\0n\0s\0\ x65\0r\0t\0i\0o\0n\0s\x1\0\0\x1\x98\xff\xff\xff\xf f\0\0\0\0\0\0\0\0\0\0\0\x16\0t\0o\0o\0l\0\x42\0\x6 1\0r\0\x42\0\x61\0\x63\0k\x1\0\0\x2@\xff\xff\xff\x ff\0\0\0\0\0\0\0\0\0\0\0\x1a\0t\0o\0o\0l\0\x42\0\x 61\0r\0\x44\0o\0n\0\x61\0t\0\x65\x1\0\0\x2j\xff\xf f\xff\xff\0\0\0\0\0\0\0\0\0\0\0\x18\0t\0o\0o\0l\0\ x42\0\x61\0r\0T\0o\0o\0l\0s\x1\0\0\x2\x94\xff\xff\ xff\xff\0\0\0\0\0\0\0\0\0\0\0\x2\0\0\0\a\0\0\0\x1e \0t\0o\0o\0l\0\x42\0\x61\0r\0H\0\x65\0\x61\0\x64\0 i\0n\0g\0s\x1\0\0\0\0\xff\xff\xff\xff\0\0\0\0\0\0\ 0\0\0\0\0$\0t\0o\0o\0l\0\x42\0\x61\0r\0T\0\x65\0x\ 0t\0\x46\0o\0r\0m\0\x61\0t\0s\x1\0\0\0\xde\xff\xff \xff\xff\0\0\0\0\0\0\0\0\0\0\0 \0t\0o\0o\0l\0\x42\0\x61\0r\0T\0\x65\0x\0t\0\x41\0 
l\0i\0g\0n\x1\0\0\x1\x9e\xff\xff\xff\xff\0\0\0\0\0 \0\0\0\0\0\0\x18\0t\0o\0o\0l\0\x42\0\x61\0r\0L\0i\ 0s\0t\0s\x1\0\0\x2\"\xff\xff\xff\xff\0\0\0\0\0\0\0 \0\0\0\0\x1c\0t\0o\0o\0l\0\x42\0\x61\0r\0I\0n\0\x6 4\0\x65\0n\0t\0s\x1\0\0\x2j\xff\xff\xff\xff\0\0\0\ 0\0\0\0\0\0\0\0\"\0t\0o\0o\0l\0\x42\0\x61\0r\0\x43 \0h\0\x61\0n\0g\0\x65\0\x43\0\x61\0s\0\x65\x1\0\0\ x2\xb2\xff\xff\xff\xff\0\0\0\0\0\0\0\0\0\0\0(\0t\0 o\0o\0l\0\x42\0\x61\0r\0T\0\x65\0x\0t\0\x44\0i\0r\ 0\x65\0\x63\0t\0i\0o\0n\0\0\0\0\0\xff\xff\xff\xff\ 0\0\0\0\0\0\0\0\0\0\0\x2\0\0\0\x1\0\0\0\x18\0t\0o\ 0o\0l\0\x42\0\x61\0r\0\x43\0l\0i\0p\0s\0\0\0\0\0\x ff\xff\xff\xff\0\0\0\0\0\0\0\0)

8. I could also eliminate the error by keeping the original sigil.ini file in place and deleting &nbsp; from Preserve Entities. This probably worked because the toolbars string was modified in the process.

Thanks for reading.

- Mark
Attached Files
File Type: rar string.rar (621 Bytes, 164 views)
Old 03-16-2019, 02:32 PM   #2
mrprobert
Yeah, never mind. It's probably related to the UTF-8 issues with 0.9.12.
Old 03-16-2019, 03:28 PM   #3
Tex2002ans
Wizard
Tex2002ans ought to be getting tired of karma fortunes by now.
 
Posts: 2,297
Karma: 12126329
Join Date: Jul 2012
Device: Kobo Forma, Nook
Also,
Code:
&nbsp;
isn't valid in EPUB3.

Code:
&#160;
is how no-break spaces are represented in EPUB3.

In EPUB2, you were able to use the more human-readable entity names. But in EPUB3, you have to use the numeric version only (decimal &#160; or hex &#xA0;).
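
For anyone cleaning up an HTML file before importing it, the named-to-numeric swap described above can be sketched in a few lines of Python. This is only an illustration and not anything Sigil runs: the function name `named_to_numeric` is made up here, and the lookup table is the HTML 4 entity list from Python's standard `html.entities` module.

```python
import re
from html.entities import name2codepoint

# The five entities predefined by XML itself; these are left untouched.
XML_PREDEFINED = {"amp", "lt", "gt", "quot", "apos"}

def named_to_numeric(text: str) -> str:
    """Replace named entities such as &nbsp; with decimal numeric references."""
    def repl(match: re.Match) -> str:
        name = match.group(1)
        if name in XML_PREDEFINED or name not in name2codepoint:
            return match.group(0)  # keep predefined or unknown names as-is
        return f"&#{name2codepoint[name]};"
    return re.sub(r"&([A-Za-z][A-Za-z0-9]*);", repl, text)

print(named_to_numeric("<p>&nbsp;</p>"))  # prints <p>&#160;</p>
```

Unknown names are deliberately passed through unchanged so that a typo in the source stays visible instead of being silently dropped.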
Old 03-16-2019, 03:32 PM   #4
DiapDealer
Grand Sorcerer
DiapDealer ought to be getting tired of karma fortunes by now.
Posts: 27,548
Karma: 193191846
Join Date: Jan 2010
Device: Nexus 7, Kindle Fire HD
No, it's not related to the UTF-8 issue. It's related to named entities not being valid in EPUB3. If you're going to be primarily working with EPUB3, then make sure your Preserve Entities list consists only of numeric entities.
Old 03-16-2019, 08:55 PM   #5
DNSB
Bibliophagist
DNSB ought to be getting tired of karma fortunes by now.
Posts: 35,377
Karma: 145435140
Join Date: Jul 2010
Location: Vancouver
Device: Kobo Sage, Forma, Clara HD, Lenovo M8 FHD, Paperwhite 4, Tolino epos
EPUB3 uses the XHTML (XML) syntax of HTML5, and without a DTD only the 5 character entities predefined by XML are available. If you use other named character entities, they are flagged as errors, though they will work in most cases.

The 5 permitted entities with added spaces are:
& quot ;
& apos ;
& amp ;
& lt ;
& gt ;

So say goodbye to the human readable nbsp and say hello to #160 or #xa0.
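
A quick way to see the same rule outside Sigil is to feed small fragments to a strict XML parser. The sketch below uses Python's standard `xml.etree.ElementTree`, which is an assumption of convenience: it is not the parser behind Sigil's Preview, but it rejects undefined entities in the same spirit. The helper name `parses_as_xml` is invented for this example.

```python
import xml.etree.ElementTree as ET

def parses_as_xml(fragment: str) -> bool:
    """Return True if the fragment is well-formed XML, False otherwise."""
    try:
        ET.fromstring(fragment)
        return True
    except ET.ParseError:
        # Raised, among other things, for entities with no definition in scope.
        return False

for sample in ("<p>&nbsp;</p>", "<p>&#160;</p>", "<p>&#xa0;</p>", "<p>&amp;</p>"):
    print(sample, parses_as_xml(sample))
```

Only the first sample fails; the numeric references and the predefined `&amp;` parse without any entity declarations.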

Last edited by DNSB; 03-16-2019 at 08:58 PM.
Old 03-16-2019, 10:33 PM   #6
KevinH
Sigil Developer
KevinH ought to be getting tired of karma fortunes by now.
 
Posts: 7,644
Karma: 5433388
Join Date: Nov 2009
Device: many
I know it sounds silly, but html and xml allowed people to craft their own entity definitions, and people actually crafted recursive entities that were used to attack websites and browsers. There is actually a lot of code to prevent evilly crafted named entities. The move to just numeric entities has made validating and expanding entities much easier and safer, and helps to restrict attack vectors.
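
To give a feel for why crafted entity definitions are dangerous, here is a small arithmetic sketch. It assumes the post is alluding to the well-known nested-entity pattern often called "billion laughs", where each entity refers to the previous one several times. Nothing is actually expanded; the function `expanded_length` and its parameters are hypothetical and only compute the size of the result.

```python
def expanded_length(levels: int, refs_per_level: int = 10, base_text: str = "lol") -> int:
    """Length of the text produced by fully expanding the top-level entity.

    Level 0 is the plain base text. Each higher level is an entity whose
    definition references the level below it refs_per_level times.
    """
    return len(base_text) * refs_per_level ** levels

for levels in (1, 3, 9):
    print(levels, expanded_length(levels))
```

With nine levels of ten references each, a definition of a few hundred bytes asks the parser for three billion characters of output, which is why parsers carry extra code to limit or refuse such expansions.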
Old 03-18-2019, 07:12 PM   #7
Turtle91
A Hairy Wizard
Turtle91 ought to be getting tired of karma fortunes by now.
Posts: 3,094
Karma: 18727053
Join Date: Dec 2012
Location: Charleston, SC today
Device: iPhone 11/X/6/iPad 1,2,Air & Air Pro/Surface Pro/Kindle PW & Fire
Quote:
Originally Posted by KevinH View Post
I know it sounds silly, but html and xml allowed people to craft their own entity definitions, and people actually crafted recursive entities that were used to attack websites and browsers. There is actually a lot of code to prevent evilly crafted named entities. The move to just numeric entities has made validating and expanding entities much easier and safer, and helps to restrict attack vectors.
I was wondering why on earth they would go to a number system instead of a readable one... Thanks!

OBTW - Sigil probably already does this...can you auto switch from numbers to readable and then back when you are done editing?
Old 03-18-2019, 10:04 PM   #8
mrprobert
I understand the issue with named character entities, but I don't see this behavior with 0.9.10 set to preserve &nbsp; and with my particular toolbar string. It only occurs with 0.9.12.

For this test, note that I am not importing a file. The stub EPUB3 that Sigil creates doesn't have an &nbsp;, so I shouldn't see: error on line 6 at column 10: Entity 'nbsp' not defined. I don't get that message under 0.9.10 with the same settings.

Under 0.9.12, this goes away if I wipe the ini file, even if I again set Create New or Empty Epubs as: Version 3 and Preserve Entities: &nbsp;. Mostly I am curious as to why this particular string causes Sigil 0.9.12 to behave differently than 0.9.10.
Attached Thumbnails
File Type: jpg nbsp.0.9.12.jpg (88.8 KB, 1201 views)
Old 03-18-2019, 11:20 PM   #9
KevinH
A guess... differences in the Mend On Open or Save settings? Sigil does treat the nbsp entity differently from any other entity. The gumbo parser Sigil uses for Mend will automatically put the nbsp named entity into the page if a real nbsp char is found in an EPUB2. In EPUB3, gumbo/Mend will always put the numeric nbsp entity if the nbsp char is found.

This is to prevent the nbsp characters from being converted to normal spaces when being processed in a QPlainTextEdit widget which happens due to a "design choice" in Qt (I would call it a huge bug but).
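
The behaviour described above can be pictured roughly as follows. This is not Sigil's gumbo code, only a hedged Python sketch of the rule as stated in this post; the function name `serialize_nbsp` and the `epub_version` parameter are invented for the illustration.

```python
NBSP = "\u00a0"  # the real no-break space character

def serialize_nbsp(text: str, epub_version: int) -> str:
    """Write every no-break space as an entity so it stays visible in an editor.

    EPUB2 gets the named entity, anything else gets the numeric reference.
    """
    entity = "&nbsp;" if epub_version == 2 else "&#160;"
    return text.replace(NBSP, entity)

print(serialize_nbsp("<p>\u00a0</p>", 2))  # prints <p>&nbsp;</p>
print(serialize_nbsp("<p>\u00a0</p>", 3))  # prints <p>&#160;</p>
```

Keeping the character as an explicit entity means a plain-text widget has nothing to quietly turn into an ordinary space.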

That said, I am still not sure how that image you posted comes about. Perhaps your preserve entities are messed up in the sigil.ini file, or ?

What happens if you right click on that page and run Mend? When the Preview auto updates, does it still have the same error message?
Old 03-18-2019, 11:59 PM   #10
DNSB
What I found interesting is that the & nbsp ; is on line 9 but the error in the image refers to line 6 which is <title></title>.

Last edited by DNSB; 03-19-2019 at 12:02 AM.
Old 03-19-2019, 12:03 AM   #11
mrprobert
Same settings for all my tests. The Sigil version and the possibly-bogus toolbars string are the only variables. If I remove and restore the Preserve nbsp setting, the problem goes away, but that operation also changes the string. Mending has no effect as far as I can tell (same message). Maybe it is a 64-bit Windows thing.

Attached is the sigil.ini file associated with my error.
Attached Files
File Type: rar sigil.ini.rar (4.1 KB, 147 views)
Old 03-19-2019, 09:01 AM   #12
DiapDealer
Quote:
Originally Posted by mrprobert View Post
I understand the issue with named character entities, but I don't see this behavior with 0.9.10 set to preserve &nbsp; and with my particular toolbar string.
You should be seeing it. If you have &nbsp; in your Preserve Entities list and you start a brand new EPUB3 (or open an existing EPUB3) and enter &nbsp; somewhere in Code View, you should immediately see the pink error window in Preview. That should be the same with just about any version of Sigil that supports creating both EPUB2 and EPUB3.

EDIT: Never mind, I think I see what you're saying now. As far as your ini file goes ... there are no Preserve Entities settings saved to it yet, anyway. Just stop using Sigil 0.9.11/12. It's not healthy. Generate completely new ini files with Sigil 0.9.10 and wait for the next release. Continuing to use the problem versions of Sigil just isn't worth the trouble.

Last edited by DiapDealer; 03-19-2019 at 09:13 AM.