Yesterday, 03:53 PM   #15
Terisa de morgan
Grand Sorcerer
Quote:
Originally Posted by kovidgoyal
RECOVER_PARSER is gone because of bugs in lxml on Windows (https://bugs.launchpad.net/lxml/+bug/2125756). I don't know why any plugins would have been using it; the correct way to parse HTML is to use the parse_html function from calibre.oeb.parse_utils. But if plugins want to parse HTML or XML using lxml directly, the relevant functions are safe_xml_fromstring and safe_html_fromstring from the calibre.utils.xml_parse module. And if they really, really want to use RECOVER_PARSER, they can simply define it themselves as

Code:
from lxml import etree
RECOVER_PARSER = etree.XMLParser(recover=True, no_network=True, resolve_entities=False)
Note that I strongly recommend against using RECOVER_PARSER as it is fundamentally broken thanks to the bug in lxml linked to above.
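For plugin authors, the XML case described above might look roughly like the sketch below. It assumes safe_xml_fromstring accepts raw bytes and returns the root element, as its name and the description above suggest. The helper name parse_plugin_xml is hypothetical, and the standard-library fallback is only there so the snippet also runs outside calibre; it is not something calibre itself provides.

```python
def parse_plugin_xml(raw):
    """Parse XML bytes and return the root element.

    parse_plugin_xml is a hypothetical helper name used for illustration.
    """
    try:
        # Only importable inside calibre (a plugin, or calibre-debug).
        from calibre.utils.xml_parse import safe_xml_fromstring
    except ImportError:
        # Assumption for illustration: outside calibre, fall back to the
        # standard library so the example still runs. This fallback does
        # not recover from malformed input the way a recover parser would.
        from xml.etree.ElementTree import fromstring
        return fromstring(raw)
    return safe_xml_fromstring(raw)


root = parse_plugin_xml(b"<package><title>Example</title></package>")
print(root.tag, root.find("title").text)
```

Inside calibre this goes through the safe_xml_fromstring route, so a plugin would not need to define RECOVER_PARSER itself.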
I checked one plugin: it does not use RECOVER_PARSER for parsing HTML (it uses parse_html for that) but for parsing XML. Is there a calibre function for that?