01-08-2020, 03:26 AM | #1 |
Junior Member
Posts: 3
Karma: 10
Join Date: Jan 2020
Device: none
|
Bug Fix 1857800
Hi all,
Regarding the latest changelog:
***************
Viewer: Fix a bug that could allow maliciously crafted EPUB files to read data from files on the computer. Thanks to dozernz for discovering this attack vector. Closes tickets: 1857800
***************
Can anyone clarify this? I was under the impression that EPUB files, etc. were just benign files with formatting tags. How could an EPUB (via a reader) begin reading unrelated files? |
01-08-2020, 03:42 AM | #2 |
Evangelist
Posts: 482
Karma: 2267928
Join Date: Nov 2015
Device: none
|
Via external entities, but I fail to see what danger it can pose.
|
01-08-2020, 04:12 AM | #3 |
Junior Member
Posts: 3
Karma: 10
Join Date: Jan 2020
Device: none
|
Someone here suggested (quite rightly) that if there were links to 'bad actor' sites then it is these that pose the risk. Another suggested that EPUB files can contain javascript code and this might also be an issue, although I cannot find anything that hints at EPUB files containing such code.
|
01-08-2020, 04:18 AM | #4 |
creator of calibre
Posts: 43,866
Karma: 22666666
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
The problem was in lxml https://bugs.launchpad.net/lxml/+bug/1742885
which calibre uses to parse EPUB files. It allows injection of arbitrary file content into the parsed EPUB. The parsed EPUB can in turn contain JavaScript, which is run in the viewer (in a sandbox), and that JavaScript can access the injected content despite being sandboxed, because the content is part of the parsed book. |
01-08-2020, 10:40 AM | #5 |
Sigil Developer
Posts: 7,651
Karma: 5433388
Join Date: Nov 2009
Device: many
|
Hi Kovid,
I understand the best way to work around this with lxml is to always pass resolve_entities=False. I can do that where Sigil uses lxml and in the _lxml builder code in BS4. Under html5, all named entities are illegal except for the basic xml ones, so no issues there, just error out?
But under epub2 xhtml, what is the consensus best way to handle this:
1. Do we just NOT try to resolve the entities and allow the problem case through? or
2. Remove the custom entity definition?
If we decide to remove the custom entity definition itself, then how do we best deal with the entity references to that custom entity?
- Should they be removed?
- Left in place but with no definition?
- Should they be replaced by a placeholder string?
- Should they be disarmed by removing their & and ;?
Perhaps simply wrapping the custom entity declaration in an xhtml comment is enough, leaving everything else alone?
Thanks,
KevinH |
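For readers following along, here is a minimal sketch of the resolve_entities=False workaround mentioned above. The temporary "secret" file, the payload, and the variable names are made up for illustration, and the file URI assumes a local path; only `etree.XMLParser`, `etree.fromstring` and `etree.tostring` are actual lxml calls.

```python
import os
import tempfile
from pathlib import Path

from lxml import etree

# Hypothetical stand-in for any local file an attacker might want to read.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as fh:
    fh.write("TOP-SECRET")
    secret_path = fh.name

# Crafted XHTML: a custom entity whose replacement text is a local file.
payload = (
    '<?xml version="1.0"?>\n'
    '<!DOCTYPE html [<!ENTITY xxe SYSTEM "%s">]>\n'
    '<html xmlns="http://www.w3.org/1999/xhtml"><body><p>&xxe;</p></body></html>'
    % Path(secret_path).as_uri()
).encode("utf-8")

# resolve_entities=False asks lxml to keep entity references as reference
# nodes instead of substituting their replacement text.
parser = etree.XMLParser(resolve_entities=False, no_network=True)
root = etree.fromstring(payload, parser)
serialized = etree.tostring(root)

os.unlink(secret_path)
print(serialized.decode("utf-8"))
```

With this parser the reference should come back out as a literal `&xxe;` and the file content should never enter the tree. The cost is that legitimate custom entities are left unexpanded too, which is the trade-off the questions above are about.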
01-08-2020, 10:48 AM | #6 |
creator of calibre
Posts: 43,866
Karma: 22666666
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
Well, the proper solution is not resolve_entities=False but setting up a custom resolver as described here: https://lxml.de/resolvers.html
I didn't have time to do that for the previous release, something for my TODO list. |
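A rough sketch of what such a custom resolver could look like, following the pattern documented at lxml.de/resolvers.html. This is an illustration under assumptions, not necessarily what calibre ended up doing: the class name `BlockingResolver`, the helper `make_parser`, and the sample payload are all hypothetical. `resolve_entities=True` is passed explicitly so that the resolver, rather than a parser default, is what blocks the file read.

```python
import os
import tempfile
from pathlib import Path

from lxml import etree


class BlockingResolver(etree.Resolver):
    """Hypothetical resolver: answer every external lookup with empty content."""

    def resolve(self, system_url, public_id, context):
        # Returning an empty string means the external resource is never opened.
        return self.resolve_string("", context)


def make_parser():
    # Hypothetical helper: entity expansion stays on, external loads are blocked.
    parser = etree.XMLParser(resolve_entities=True, no_network=True)
    parser.resolvers.add(BlockingResolver())
    return parser


# Hypothetical stand-in for a sensitive local file.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as fh:
    fh.write("TOP-SECRET")
    secret_path = fh.name

# One internal entity (harmless) and one external entity (the attack).
payload = (
    '<!DOCTYPE r [<!ENTITY greet "hello">'
    '<!ENTITY xxe SYSTEM "%s">]>'
    '<r>&greet;|&xxe;</r>' % Path(secret_path).as_uri()
).encode("utf-8")

root = etree.fromstring(payload, make_parser())
serialized = etree.tostring(root)

os.unlink(secret_path)
print(serialized.decode("utf-8"))
```

The point of this design over resolve_entities=False is that internal entities such as `&greet;` still expand normally, while anything that would require opening an external resource resolves to nothing.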
01-08-2020, 11:03 AM | #7 |
Sigil Developer
Posts: 7,651
Karma: 5433388
Join Date: Nov 2009
Device: many
|
Sigil has in the past had its own resolve routine in xhtml source for epub2 which assumed a simple macro-like substitution approach with no other expansion.
I will most probably handle this in epub2 xhtml by commenting out the custom entity definition and leaving the resulting named entity references in place, so the user can decide whether to resolve them manually. That seems like the simplest and safest approach. I have seen very few epub2 files in the wild with custom entity declarations beyond the DTD-supplied ones. Thanks |
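The comment-out approach described above could be sketched roughly as follows. This is not Sigil's code: the regex, the function name `comment_out_entity_decls`, and the sample document are assumptions for illustration. It also assumes the declarations only appear in the DOCTYPE internal subset and are not already inside a comment.

```python
import re

# Matches an <!ENTITY ...> declaration, allowing '>' inside quoted values.
ENTITY_DECL = re.compile(r"""<!ENTITY\s(?:"[^"]*"|'[^']*'|[^>"'])*>""")


def comment_out_entity_decls(xhtml: str) -> str:
    """Wrap each entity declaration in a comment; leave references untouched."""

    def _wrap(match: re.Match) -> str:
        # '--' is not allowed inside an XML comment, so split runs of hyphens.
        body = re.sub(r"-(?=-)", "- ", match.group(0))
        return "<!-- " + body + " -->"

    return ENTITY_DECL.sub(_wrap, xhtml)


sample = (
    "<!DOCTYPE html [\n"
    '  <!ENTITY xxe SYSTEM "file:///etc/passwd">\n'
    "]>\n"
    "<html><body><p>&xxe;</p></body></html>"
)

result = comment_out_entity_decls(sample)
print(result)
```

Note that once the declaration is commented out, `&xxe;` refers to an undefined entity, so a strict XML parser will likely reject the file until the user resolves or removes the reference by hand.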
01-08-2020, 11:34 AM | #8 |
creator of calibre
Posts: 43,866
Karma: 22666666
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
That should do it: https://github.com/kovidgoyal/calibr...e8b18a13187ded
|
01-08-2020, 02:59 PM | #9 |
Sigil Developer
Posts: 7,651
Karma: 5433388
Join Date: Nov 2009
Device: many
|
Thanks, that will help a lot.
|