You use gumbo_bs4_adapter.py rather than soup directly. I will run a test to see how it handles this.
Here is an example of using it:
Code:
# examples for using the bs4/gumbo parser to process xhtml
import sigil_bs4
import sigil_gumbo_bs4_adapter as gumbo_bs4
samp = """
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN"
"http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
<html xmlns="http://www.w3.org/1999/xhtml/" xml:lang="en" lang="en-US">
<head><title>testing & entities</title></head>
<body>
<p class="first second">this is the <i><b>copyright</i></b> symbol "©"</p>
<p xmlns:xlink="http://www.w3.org/xlink" class="second" xlink:href="http://www.ggogle.com">this used to test atribute namespaces</p>
</body>
</html>
"""
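# parse the sample through the gumbo adapter instead of building the soup directly
soup = gumbo_bs4.parse(samp)
# print every node whose class attribute includes "second"
for node in soup.find_all(attrs={'class':'second'}):
    print(node)
# serialize the parsed tree back to xhtml
print(soup.serialize_xhtml())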
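In stock bs4, class is treated as a multi-valued attribute, so a search for 'second' should match both paragraphs in the sample (the one with class="first second" as well as the one with class="second"). I am assuming sigil_bs4 behaves the same way. Here is a rough standard-library-only sketch of that token matching, so it can be tried outside Sigil. ClassFinder is a made-up helper for illustration, not part of sigil_bs4 or the gumbo adapter:

```python
# Sketch only: mimics token-based class matching with the standard library.
# ClassFinder is a hypothetical helper, not part of sigil_bs4 or the adapter.
from html.parser import HTMLParser


class ClassFinder(HTMLParser):
    def __init__(self, wanted):
        super().__init__()
        self.wanted = wanted
        self.matches = []

    def handle_starttag(self, tag, attrs):
        # split the class attribute on whitespace into separate tokens
        classes = (dict(attrs).get("class") or "").split()
        if self.wanted in classes:
            self.matches.append((tag, classes))


finder = ClassFinder("second")
finder.feed('<p class="first second">a</p>'
            '<p class="second">b</p>'
            '<p class="third">c</p>')
for tag, classes in finder.matches:
    print(tag, classes)
```

The first two paragraphs are reported and the third is skipped, which is what I would expect find_all(attrs={'class':'second'}) to do with the sample above.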