MobileRead Forums - View Single Post

DiapDealer · 08-08-2019, 01:09 PM

xmlprocessor.py also has examples of passing optional lists of relevant void tags to LXMLTreeBuilderForXML. Those lists are specific to each xml file type and help when processing entire opf, ncx, and other xml files.

And the LXMLTreeBuilderForXML approach is probably overkill for simple epub metadata work. You can accomplish the same thing with:

Code:

from sigil_bs4 import BeautifulSoup

# bk is the book container object Sigil hands to the plugin's run(bk)
metadata = bk.getmetadataxml()
metadata_soup = BeautifulSoup(metadata, "lxml-xml")

# ... stir the xml soup (edit metadata_soup as needed) ...

new_metadata = metadata_soup.decodexml(indent_level=0, formatter='minimal', indent_chars="  ")
# or new_metadata = metadata_soup.decodexml() if you don't care about prettying.

The point is to avoid html parsers and (x)html serializers.
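
For anyone who wants to try the same idea outside of Sigil, here is a rough sketch that uses Python's standard-library xml.etree.ElementTree as a stand-in for sigil_bs4 (which only ships with Sigil). The sample metadata string and the retitle() helper are made up for illustration; they are not taken from xmlprocessor.py, and the fragment returned by bk.getmetadataxml() may well look different. It shows an EPUB3-style meta element with text content surviving the round trip, which is the kind of thing an HTML parser would typically trip over because it treats meta as a void tag.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"

# Hypothetical metadata fragment, used only for this illustration.
SAMPLE_METADATA = (
    '<metadata xmlns:dc="http://purl.org/dc/elements/1.1/">\n'
    "  <dc:title>Old Title</dc:title>\n"
    "  <dc:language>en</dc:language>\n"
    '  <meta property="dcterms:modified">2019-08-08T13:09:00Z</meta>\n'
    "</metadata>"
)


def retitle(metadata_xml, new_title):
    """Parse the fragment as XML, replace the dc:title text, serialize as XML."""
    # Register the prefix so the serializer writes dc: instead of ns0:
    ET.register_namespace("dc", DC_NS)
    root = ET.fromstring(metadata_xml)

    # "Stir the soup": find dc:title by its namespaced name and change it.
    title = root.find("{%s}title" % DC_NS)
    if title is not None:
        title.text = new_title

    # XML serializer, so no (x)html rules are applied on the way out.
    return ET.tostring(root, encoding="unicode")


new_metadata = retitle(SAMPLE_METADATA, "New Title")
print(new_metadata)
```

Inside an actual Sigil plugin the sigil_bs4 version above is still the one to use; this is only meant to make the XML-in, XML-out round trip easy to poke at from a plain Python prompt.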

Last edited by DiapDealer; 08-08-2019 at 04:34 PM.