10-01-2025, 08:20 PM   #2
kovidgoyal
creator of calibre
Posts: 45,558
Karma: 28548962
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
RECOVER_PARSER is gone because of bugs in lxml on Windows: https://bugs.launchpad.net/lxml/+bug/2125756. I don't know why any plugins would have been using it; the correct way to parse HTML is to use the parse_html function from calibre.ebooks.oeb.parse_utils. But if plugins want to parse HTML or XML using lxml directly, the relevant functions are safe_xml_fromstring and safe_html_fromstring from the calibre.utils.xml_parse module. And if they really, really want to use RECOVER_PARSER, they can simply define it themselves as

Code:
from lxml import etree
RECOVER_PARSER = etree.XMLParser(recover=True, no_network=True, resolve_entities=False)
Note that I strongly recommend against using RECOVER_PARSER as it is fundamentally broken thanks to the bug in lxml linked to above.
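
For plugin authors who want to follow the recommendation instead, here is a minimal sketch of a wrapper that prefers calibre's safe_xml_fromstring. The wrapper name parse_xml_for_plugin is hypothetical, and the fallback branch (a plain lxml parser without recover=True, used only when calibre is not importable) is an assumption for running the code outside calibre, not something calibre itself provides.

```python
def parse_xml_for_plugin(raw):
    """Parse XML bytes, preferring calibre's helper when it is importable.

    Hypothetical wrapper for illustration; not part of calibre.
    """
    try:
        # Function and module named in the post above; only importable
        # inside calibre's own Python environment.
        from calibre.utils.xml_parse import safe_xml_fromstring
    except ImportError:
        # Assumption: outside calibre, fall back to lxml directly, keeping
        # the same safety flags but deliberately omitting recover=True.
        from lxml import etree
        parser = etree.XMLParser(no_network=True, resolve_entities=False)
        return etree.fromstring(raw, parser=parser)
    return safe_xml_fromstring(raw)
```

Calling parse_xml_for_plugin(b'<a><b>x</b></a>') should return the root element in either branch, so the rest of the plugin code would not need to care which parser was used.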

Last edited by kovidgoyal; 10-01-2025 at 08:27 PM.