Thread: Buglet?
10-06-2020, 08:50 PM   #10
KevinH
Sigil Developer
Okay, I checked the python3lib sanitycheck.py code: it treats "<<p>" as a spurious text "<" followed by a tag, and it treats "</p>>" or "<p>>" as a tag followed by a spurious text ">".

I could detect both cases by verifying that the text returned from parsing does not contain an illegal ">" or "<" character, unless that text is inside a CDATA section.
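To make the idea concrete, here is a rough sketch of that kind of check. It is not the actual sanitycheck.py code: the regex tokenizer and the function name find_stray_angle_brackets are made up for illustration, and it assumes tags never contain a bare "<" or ">" themselves (so comments or attribute values containing ">" would confuse it).

```python
import re

# Hypothetical tokenizer, NOT Sigil's parser: CDATA sections are matched
# first so their contents are never inspected; otherwise a tag is "<",
# one or more characters that are neither "<" nor ">", then ">".
TOKEN_RE = re.compile(r"(<!\[CDATA\[.*?\]\]>|<[^<>]+>)", re.DOTALL)


def find_stray_angle_brackets(markup):
    """Return (offset, char) for each bare '<' or '>' found in text."""
    problems = []
    offset = 0
    # re.split with one capturing group alternates: text, token, text, ...
    for index, piece in enumerate(TOKEN_RE.split(markup)):
        if index % 2 == 0:  # even slots hold the text between tokens
            for i, ch in enumerate(piece):
                if ch in "<>":
                    problems.append((offset + i, ch))
        offset += len(piece)
    return problems


print(find_stray_angle_brackets("<<p>text</p>"))
print(find_stray_angle_brackets("<p>>text</p>"))
print(find_stray_angle_brackets("<p>ok</p>>"))
```

Each reported offset points at the stray character in the original string, which is what a sanity check would need in order to report a position to the user.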

So making sanity check detect these cases is doable. I will look into doing that.

FWIW, HTML5 parsing rules only require xml escaping a ">" in text when leaving it bare would make parsing ambiguous, whereas the "<" character should always be xml escaped when used in attribute values and text. Under XHTML, both characters should always be xml escaped when used inside attribute values and text.
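As a small illustration of the XHTML-safe approach (escape both characters everywhere), Python's standard library already provides the helpers. The wrapper names escape_text and escape_attr below are only for this example and are not part of Sigil.

```python
from xml.sax.saxutils import escape, quoteattr


def escape_text(text):
    """Escape &, < and > so the text is safe in an XHTML text node."""
    return escape(text)


def escape_attr(value):
    """Escape the value and wrap it in quotes for use as an attribute."""
    return quoteattr(value)


print(escape_text("1 < 2 > 0"))
print('<p title=' + escape_attr("a < b") + '>ok</p>')
```

Escaping both characters unconditionally is stricter than HTML5 needs, but the same output is then valid under both HTML5 and XHTML.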

Last edited by KevinH; 10-06-2020 at 08:59 PM. Reason: updating