Problems with Beautifulsoup with custom tags
Hi! I'm having trouble adding a custom tag from my plugin using BeautifulSoup:
The code:
Code:
html = '<p id="nt3"><sup>[3]</sup> Note 1. <a href="../Text/Section0001.xhtml#nt3">&lt;&lt;</a></p>'
Code:
$ python test.py
Code:
OUT:
Thanks! PS: Using Python 3.8 and Sigil 1.4.3 |
What are the two XML-escaped "<" characters in the text for?
How are you getting the OUT? If you print it from the plugin, it goes through an XML encode and decode pass when it is returned from the plugin process over stdout as XML. So instead of printing to see this value, simply write it to a log file from the plugin so you can see exactly what BeautifulSoup is generating. My guess is that it is identical to what you see outside the plugin; it is just being decoded on its way back in the stdout XML from the plugin. |
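To illustrate the point about the encode/decode pass, here is a minimal sketch that uses only the Python standard library. It is not Sigil's actual transport code: `escape` and `unescape` merely stand in for an XML encode and decode pass, and the log path is a hypothetical choice made for this example.

```python
import os
import tempfile
from xml.sax.saxutils import escape, unescape

# Markup as it should be serialized: the two "<" characters stay escaped.
serialized = '<a href="../Text/Section0001.xhtml#nt3">&lt;&lt;</a>'

# A matched encode + decode pass returns the text unchanged.
round_trip = unescape(escape(serialized))

# A decode pass with no matching encode turns "&lt;&lt;" into a raw "<<".
decoded_once = unescape(serialized)

# Hypothetical log location (assumption); a real plugin would pick its own path.
log_path = os.path.join(tempfile.gettempdir(), "plugin_debug.log")
with open(log_path, "w", encoding="utf-8") as log:
    log.write(serialized + "\n")

# Reading the file back shows the text exactly as it was generated.
with open(log_path, encoding="utf-8") as log:
    logged = log.read().rstrip("\n")
```

Writing to a file sidesteps any transformation applied to stdout, so what lands in the log is what the parser library actually produced.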
Quote:
Here is the exact code:
Code:
#!/usr/bin/env python |
If you compare that to your first post you will see they are not the same. The printed output shows the "<<" decoded, when it has to stay escaped to be safe to use.
The issue is that you are trying to assign an attribute as a dict. It gets converted to what is needed when run outside the plugin environment, but not inside it. My guess is that the default dict type is different: one may be an ordered dict collection while the other is not. Have you tried assigning that attribute in a different way? Sigil's internal bs4 version has many modifications so that it works on older Python 3 versions back to 3.4, so it may be using different types than a recent BS4 version that only runs on a limited set of Python 3 versions. |
I did notice this:
Quote:
|
Here are alternative ways to add an attribute ...
Quote:
|
I took a peek at the latest BS4 source on Launchpad, and they have changed how they handle passing the attrs attribute.
So doing it in two steps will be more compatible with other versions of both bs4 and Python 3 implementations. |
Quote:
|
There is a fully HTML5-compliant gumbo parser already there, as well as a very simple serial parser called quickparser, and there is also an html5lib parser that is guaranteed to be available for use by Sigil plugins.
Surely one of those will do what you need. As for using bs4, as long as you split the new_tag creation from the attribute addition in that piece, it does work on all versions of BS4 and back to Python 3.4. |
It (the colon) should just be part of a string when used in an attribute name.
Use tag["xml:lang"] = "la" to be more compatible with all versions of BeautifulSoup. |
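The advice in the posts above (create the tag first, then set each attribute by string key) can be sketched as follows. This is only a minimal sketch, assuming a stock `bs4` install and the built-in `html.parser`; inside a Sigil plugin the bundled module and the parser you pick may differ. The `span` element and its text are invented for illustration and are not taken from the original plugin.

```python
from bs4 import BeautifulSoup

html = '<p id="nt3"><sup>[3]</sup> Note 1.</p>'
soup = BeautifulSoup(html, "html.parser")

# Step 1: create the bare tag, without passing any attributes to new_tag().
tag = soup.new_tag("span")  # "span" is a placeholder element (assumption)

# Step 2: add attributes one at a time; a name with a colon is just a string key.
tag["xml:lang"] = "la"
tag.string = "nota"  # placeholder text (assumption)

# Attach the new tag to the existing paragraph and serialize the result.
soup.p.append(tag)
result = str(soup)
print(result)
```

Keeping creation and attribute assignment separate avoids depending on how a particular bs4 version handles the attrs argument.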
Quote:
Thanks KevinH for your help. |