Here is the problem, which I had to navigate myself while making Edit Book for calibre:
1) Qt internally converts the Unicode nbsp character (U+00A0) into a normal space when extracting text from QPlainTextEdit. This can be worked around by overriding the appropriate methods in a QPlainTextEdit subclass
2) Instead of doing the overriding, Sigil chose to convert Unicode nbsp characters into &nbsp; to work around the bug
3) Named entities like &nbsp; are invalid in XHTML without a proper doctype
4) Therefore Sigil before 0.7.3 could take a valid epub and, by converting the nbsp characters into named entities, render it invalid (if it did not declare a doctype, which is optional and which many epubs omit). Of course this "invalidity" was only with respect to the useless epubcheck. The epubs continued to work fine in all actual epub readers.
5) This was fixed by converting the nbsp characters into *numeric* instead of named entities in Sigil 0.7.3
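To illustrate points 3 to 5, here is a minimal Python sketch (not Sigil's or calibre's actual code; the helper names nbsp_to_named, nbsp_to_numeric, is_well_formed and wrap are made up for this example). It uses the standard library XML parser as a rough stand-in for a checker: without a doctype the named entity is rejected as undefined, while the numeric reference parses fine.

```python
import xml.etree.ElementTree as ET

NBSP = "\u00a0"  # the Unicode no-break space character


def nbsp_to_named(text):
    # Roughly what is described for Sigil before 0.7.3: named entity
    return text.replace(NBSP, "&nbsp;")


def nbsp_to_numeric(text):
    # Roughly what is described for Sigil 0.7.3: numeric character reference
    return text.replace(NBSP, "&#160;")


def wrap(body):
    # Hypothetical minimal XHTML fragment with no doctype declared
    return '<p xmlns="http://www.w3.org/1999/xhtml">' + body + "</p>"


def is_well_formed(markup):
    # True if the XML parser accepts the markup, False otherwise
    try:
        ET.fromstring(markup)
    except ET.ParseError:
        return False
    return True


sample = "a" + NBSP + "b"
named_ok = is_well_formed(wrap(nbsp_to_named(sample)))
numeric_ok = is_well_formed(wrap(nbsp_to_numeric(sample)))
print("named entity accepted:", named_ok)
print("numeric entity accepted:", numeric_ok)
```

The numeric form needs no entity declarations at all, which is why it is safe regardless of whether the book declares a doctype.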
The best approach is of course to fix Qt, as I did for Edit Book, but I don't know if any Sigil developers have any interest in doing that.