Quote:
Originally Posted by DiapDealer
So far as I understand, the DOCTYPE is not a requirement of the ePub spec (but it certainly needs to be correct if it IS present). So since Bob's attached, sample ePub is entirely compliant before 0.7.3 touches it, shouldn't the question be; "Why is Sigil inserting entities (by converting nbsp characters) into a document when doing so will make said document become 'not well formed?'"
I understand the reasoning behind wanting to preserve the intent of the non-breaking space character (which 0.7.2 did not do at all), but the fix shouldn't really come at the cost of making otherwise valid epubs, invalid. Should it?
I don't know. This is a bit of weird one. I always have the DOCTYPE in my files, myself, so this issue doesn't really affect me, but still... it seems like it's a bit of a catch-22.
|
Ideally the "fix" shouldn't cause the epub to become invalid, but unfortunately we seem to be stuck with that effect in this case. It comes down to a limitation of the Qt code Sigil uses to edit the HTML files: it replaces nbsp characters with normal spaces.
The alternative, before 0.7.3, was that if you had an nbsp character in a UTF-8 file, Sigil would remove it and replace it with a normal space, regardless of cleaning settings. So you were definitely losing information (enough that at least some people wanted it fixed).
Now Sigil preserves the nbsp character, but to do so it has to convert it from a character to an entity so it doesn't get lost. For files with a DOCTYPE already defined this isn't an issue. But for files without a DOCTYPE, the &nbsp; entity is undefined (it is declared by the XHTML DTD that the DOCTYPE references), so you either need to add the DOCTYPE manually or allow Sigil to clean the file. There is still the issue that if you manually insert an nbsp character (not the entity) in Sigil, it will immediately become a normal space. I think the biggest problem was not knowing why it was suddenly giving the error; at least now it's a little clearer why the error message is shown.
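To illustrate why the entity form trips the well-formedness check, here is a rough sketch using Python's standard XML parser. It is not Sigil's code. The helper name is_well_formed and the tiny sample documents are made up for the example, and the inline DOCTYPE only stands in for the real XHTML one (it declares nbsp itself instead of pointing at the XHTML DTD).

```python
import xml.etree.ElementTree as ET


def is_well_formed(doc):
    """Return True if the XML parser accepts doc (hypothetical helper)."""
    try:
        ET.fromstring(doc)
        return True
    except ET.ParseError:
        return False


# Named entity with no DOCTYPE: the parser has no declaration for nbsp.
no_doctype_entity = "<p>a&nbsp;b</p>"

# Numeric character reference: needs no declaration at all.
numeric_ref = "<p>a&#160;b</p>"

# Stand-in DOCTYPE (assumption: an internal subset declaring nbsp,
# in place of the external XHTML DTD a real epub file would reference).
declared_entity = '<!DOCTYPE p [<!ENTITY nbsp "&#160;">]><p>a&nbsp;b</p>'

for name, doc in [
    ("entity, no DOCTYPE", no_doctype_entity),
    ("numeric reference", numeric_ref),
    ("entity, declared", declared_entity),
]:
    print(name, "->", is_well_formed(doc))
```

Only the first case should be rejected. The numeric reference form is shown just for comparison: it would not depend on the DOCTYPE at all, but as far as this thread goes that is not what Sigil writes out.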