I updated to the calibre 7.18.100 preview, and Modify ePub threw errors when I tried to use it on several ePubs. I traced it to enabling Smarten Punctuation, which runs into an error importing substitute_entites from chardet.py. That function seems to have been removed from calibre's chardet.py.
I compared the 7.18 chardet.py with the 7.18.100 chardet.py and added back the code for substitute_entites (before _CHARSET_ALIASES) and ENTITY_PATTERN (after lazy_encoding_pats):
lazy_encoding_pats = LazyEncodingPats()
==> added: ENTITY_PATTERN = re.compile(r'&(\S+?);')
and
==> added: def substitute_entites(raw):
==> added:     from calibre import xml_entity_to_unicode
==> added:     return ENTITY_PATTERN.sub(xml_entity_to_unicode, raw)
_CHARSET_ALIASES = {"macintosh" : "mac-roman", "x-sjis" : "shift-jis"}
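For anyone who wants to see what the restored pieces do without opening calibre, here is a rough standalone sketch. It keeps the same ENTITY_PATTERN and the same substitute_entites name, but since calibre's xml_entity_to_unicode can't be imported outside calibre, I've swapped in a hypothetical stand-in (entity_to_unicode_standin) built on the standard library's html.unescape. That stand-in is my assumption and may not convert every entity the same way calibre's function does.

```python
import re
from html import unescape

# Same pattern as the line added back after lazy_encoding_pats
ENTITY_PATTERN = re.compile(r'&(\S+?);')


def entity_to_unicode_standin(match):
    # Hypothetical stand-in for calibre's xml_entity_to_unicode (an assumption,
    # not calibre's code): decode the matched entity with the standard library.
    # Entities html.unescape does not recognise are returned unchanged.
    return unescape(match.group(0))


def substitute_entites(raw):
    # Same shape as the restored function: replace every entity match in raw
    return ENTITY_PATTERN.sub(entity_to_unicode_standin, raw)


if __name__ == '__main__':
    print(substitute_entites('caf&eacute; &#8217;quoted&#8217;'))
```

Inside calibre itself the function body should stay as in the lines marked "added" above, with the import of xml_entity_to_unicode from calibre.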
Last edited by DNSB; 09-20-2024 at 12:55 AM.
Reason: Modified chardet.py to allow importing substitute_entites