#1 |
Member
Posts: 20
Karma: 10
Join Date: Jun 2017
Device: kindle
|
strange interaction with .so files and __file__ in plugins
I encountered a strange problem while working on a plugin and I'm not sure if it indicates a bug in Calibre or just something to look out for or do differently.
I had two library packages embedded in the plugin zip. One of them uses os.path.dirname(__file__) to get the path of a resource file in the package's directory that it reads at runtime. Debug logging showed me that in normal operation __file__ has the path to the plugin's zip file as its prefix, and the package has no problem with the I/O routines reading from it that way.
When I added the second library package to the plugin's tree, I found that in the first package __file__ now contained a path in a temporary directory on disk, one which did not exist at runtime. As a result, the attempt to load the resource file failed.
Experimenting, I found that the second package happens to contain a file with the extension .so. Even without that library, if I simply create any file with a .so extension in its name, anywhere in the plugin zip directory tree, with any contents (even zero length), then Calibre loads the plugin in such a way that __file__ points into a temporary directory that does not exist at runtime instead of being a path into the zip file.
I saw this on Calibre 5.8.1 and still see it on 5.9, on macOS Catalina. I haven't tried Windows or Linux.
Does Calibre have a heuristic in which it unzips a plugin if it sees that the plugin contains certain file types, such as *.so, which maybe can't be used from a zip? Are there some uses of __file__ that a Python library might rely on but that could break when the library is used in a plugin?
Last edited by bugstomper; 01-10-2021 at 06:06 AM. |
#2 |
creator of calibre
Posts: 45,345
Karma: 27182818
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
Yes, native code can't be loaded from zip files, so such files have to be unzipped. And to load resources from plugin zip files, use the calibre-provided APIs: https://manual.calibre-ebook.com/cre...lugin-zip-file
|
#3 |
Member
Posts: 20
Karma: 10
Join Date: Jun 2017
Device: kindle
|
I can see why Calibre would unzip such a plugin, and it might not be worth having it recognize that a .so file is only a binary resource on Linux, a .dylib on macOS, a .dll on Windows, etc. But why would the directory that __file__ points to not exist at runtime, if that is the directory that Calibre unzips the plugin to?
The problem that results from this is that these are two third party libraries. Perhaps I would be stuck patching one or the other of them to use the Calibre plugin API, but is there a reason why the use of __file__ should not work if Calibre does unzip to a directory and __file__ does point to that directory? If the .so file does need to be on the disk, how does that work if the directory no longer exists? |
#4 |
creator of calibre
Posts: 45,345
Karma: 27182818
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
Actually, I take it back; I was misremembering. calibre doesn't auto-unzip plugin zip files. The plugin has to choose to do that for itself, if it wants to.
And the third-party libraries should really be fixed to use the importlib.resources stdlib module to load their resources. Indeed, I don't see how your third-party library could possibly be using __file__ from a zip file successfully. This discussion would be a lot easier with some actual code. |
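For readers following along, here is a minimal sketch of that difference in plain Python, outside calibre. It builds a throwaway zip containing a hypothetical package `demo_pkg_res` with a `browsers.json` resource (the names and contents are invented for illustration), imports the package from the zip, and compares the `__file__` approach with `importlib.resources.files` (assumes Python 3.9 or later).

```python
import importlib
import importlib.resources
import os
import sys
import tempfile
import zipfile

PKG = "demo_pkg_res"  # hypothetical package name, used only in this sketch


def build_demo_zip(directory):
    """Create a zip holding a tiny package plus one data file."""
    zip_path = os.path.join(directory, "demo_plugin.zip")
    with zipfile.ZipFile(zip_path, "w") as zf:
        zf.writestr(PKG + "/__init__.py", "")
        zf.writestr(PKG + "/browsers.json", '{"chrome": "ExampleAgent/1.0"}')
    return zip_path


def read_via_file_attr(module):
    """The pattern discussed in the thread: build a disk path from __file__."""
    path = os.path.join(os.path.dirname(module.__file__), "browsers.json")
    with open(path, encoding="utf-8") as f:
        return f.read()


def read_via_importlib(package_name):
    """The stdlib resource API, which also works for zip imports."""
    resource = importlib.resources.files(package_name).joinpath("browsers.json")
    return resource.read_text(encoding="utf-8")


zip_path = build_demo_zip(tempfile.mkdtemp())
sys.path.insert(0, zip_path)
importlib.invalidate_caches()
demo_pkg = importlib.import_module(PKG)

try:
    read_via_file_attr(demo_pkg)
    file_attr_worked = True
except OSError:
    # __file__ points *inside* the zip, which is not a real directory on disk.
    file_attr_worked = False

resource_text = read_via_importlib(PKG)
```

In this sketch the `__file__` path only becomes usable if something has extracted the package to disk first, which matches what turns out to be happening later in the thread.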
#5 |
Member
Posts: 20
Karma: 10
Join Date: Jun 2017
Device: kindle
|
The easiest way to reproduce this, before I write an artificial test case, is to use the plugin where I saw it in the wild. The plugin FanFicFare currently includes the open source version of the Python library cloudscraper. You would need to get a copy of FanFicFare now because, for reasons about to become obvious, it is likely about to remove cloudscraper.
If you install FanFicFare in Calibre, it will add a UI element to the menu toolbar. Use its menu "Download from URLs", giving it the URL of any story from fanfiction.net. That will probably fail with an error that mentions seeing a version 2 challenge, which means the attempt was blocked by an advanced level of protection from Cloudflare. That error comes from cloudscraper, is now the expected result, and is part of why cloudscraper and access to fanfiction.net will likely be removed from FanFicFare shortly. You can see the error in the job's details or by running calibre-debug for it. If it doesn't fail on that attempt, that is OK too; it is just that lately the error is the more common result.
The current release version of FanFicFare.zip is always in a link in the first post of the thread https://www.mobileread.com/forums/sh...d.php?t=259221
(The next part is something I only tried on my machine, which runs macOS Catalina. I have not checked whether it reproduces on Windows or Linux.)
Unzip FanFicFare.zip and create a foo.so file in its root directory. The contents don't matter: the file can be zero length (created with touch), and it can even sit in a subdirectory you create. Make a zip file with it and reinstall that in Calibre. This time, when you try to download a story from fanfiction.net, it should fail with a different error:
Code:
No such file or directory: '/var/folders/j_/l6_c445j0v7gy0dw7y35jq240000gn/C/calibre_5.9.0_tmp_dsx7opbm/i0zmj1_1plugin_unzip/cloudscraper/user_agent/browsers.json'
I tried patching the code to replace cloudscraper/user_agent/browsers.json with a browsers.py I wrote that has a def browsers() that returns a string, and calling json.loads on that string instead of json.load on the file. When I do that, trying to get a story from fanfiction.net gets past that point to a new error, which is raised in the included requests library when it tries to read a root certificates file while accessing an https URL:
Code:
Could not find a suitable TLS CA certificate bundle, invalid path: /var/folders/j_/l6_c445j0v7gy0dw7y35jq240000gn/C/calibre_5.9.0_tmp_727zsud6/w19l8s4xplugin_unzip/certifi/cacert.pem
I found that without the foo.so, __file__ shows FanFicFare.zip as the root of the path, and with foo.so present it shows that /var/folders/... path on the disk. Debug logging also showed that at the time of the error the /var/folders/... path given by __file__ does not exist.
It might be easy to dismiss this as something that cloudscraper/user_agent should not be doing with __file__, but the fact that a more common package like requests fails when accessing https concerns me. |
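The browsers.py workaround described in this post could look roughly like the sketch below. The JSON payload is an invented placeholder, not the real contents of cloudscraper's browsers.json, and `load_browsers` is a hypothetical helper name.

```python
import json

# Placeholder payload: the real browsers.json content is not reproduced here.
_BROWSERS_JSON = '{"user_agents": {"desktop": {"chrome": ["ExampleAgent/1.0"]}}}'


def browsers():
    """Return the JSON document as a string, so no file on disk is needed."""
    return _BROWSERS_JSON


def load_browsers():
    # json.loads parses a string; it stands in for json.load on an open file.
    return json.loads(browsers())


data = load_browsers()
```

Because the data now lives in a Python module, it is loaded through the normal import machinery and no longer depends on where __file__ points.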
#6 |
creator of calibre
Posts: 45,345
Karma: 27182818
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
Ah, that gives me some more information. The plugin is using
with self: pass
to unzip itself. Relevant code is here: https://github.com/kovidgoyal/calibr...init__.py#L279
As for requests, it is a pretty old and poorly designed library, so it doesn't surprise me that it does stupid things like using __file__ instead of the correct stdlib facilities for loading resources. That said, this should indeed not break in calibre, and it was probably broken by the changes to the plugin loader system to support Python 3.
And just FYI, if you want browser user agents, a good way to get them is from calibre import random_user_agent, and if you want https certificates, the way to get a path to them is P('mozilla-ca-certs.pem'). No idea if these libraries allow you to pass such things into them, however. |
#7 |
creator of calibre
Posts: 45,345
Karma: 27182818
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
And this error is likely caused by FanFicFare doing
Code:
with self:
    import whatever
and then using whatever outside the with statement. |
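A minimal sketch of that failure mode, outside calibre. `UnzippedPlugin` is a hypothetical stand-in for the plugin object, written on the assumption (based on the symptoms reported in this thread) that entering the context extracts files to a temporary directory and leaving it removes them. It is not calibre's actual implementation.

```python
import os
import shutil
import tempfile


class UnzippedPlugin:
    """Hypothetical stand-in: extract on enter, delete on exit."""

    def __enter__(self):
        self.path = tempfile.mkdtemp(suffix="plugin_unzip")
        # Pretend this file was extracted from the plugin zip.
        with open(os.path.join(self.path, "cacert.pem"), "w") as f:
            f.write("dummy certificate bundle\n")
        return self

    def __exit__(self, *exc_info):
        shutil.rmtree(self.path, ignore_errors=True)
        return False  # do not swallow exceptions


plugin = UnzippedPlugin()
with plugin:
    # Anything imported or computed here sees a real directory on disk.
    cert_path = os.path.join(plugin.path, "cacert.pem")
    existed_inside = os.path.exists(cert_path)

# The path was remembered, but the directory behind it is gone.
exists_outside = os.path.exists(cert_path)
```

A module imported inside the block keeps a __file__ pointing at the temporary directory, so any later use of that __file__ hits the same missing path.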
#8 |
Member
Posts: 20
Karma: 10
Join Date: Jun 2017
Device: kindle
|
The two places where FanFicFare uses with self: are in def load_actual_plugin and def cli_main in its __init__.py. Is it wrong for load_actual_plugin to return a result calculated inside with self: when that result will be used outside the scope of the with self:? If it should use self.__enter__() and self.__exit__() instead, where would they go to make sure it works?
Also, if that is wrong, what is going on with the problem only showing up when a .so file exists? I looked for all uses of __file__ in files in the plugin and found that the one for certificates is in file certifi/core.py. That is very short and simple. It looks like it first tries to use importlib.resources and falls back to __file__ if it gets an ImportError exception. Is there something wrong with the call to get_path or __enter__ which would interact with whatever happens when there is a .so file in the plugin zip? |
#9 |
Member
Posts: 20
Karma: 10
Join Date: Jun 2017
Device: kindle
|
Oh, I see how certifi/core.py could interact with this problem. If its where() method is called for the first time in a state where get_path returns the temporary directory, and then the temporary directory is deleted, subsequent calls to where() will fail in just this way. It does not have to have anything to do with the code in it that uses __file__ as long as it is called for the first time in a context in which get_path really is supposed to return the temporary directory.
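The caching effect described here can be sketched as follows. This `where()` is a simplified, hypothetical function that remembers its first answer; it is not certifi's actual code, and passing the directory in as an argument is an assumption made to keep the example self-contained.

```python
import os
import tempfile

_CACHED = None  # module-level cache, filled on the first call only


def where(resource_dir):
    """Return the bundle path, computing it only on the first call."""
    global _CACHED
    if _CACHED is None:
        _CACHED = os.path.join(resource_dir, "cacert.pem")
    return _CACHED


# First call happens while a temporary directory exists.
with tempfile.TemporaryDirectory() as tmp:
    open(os.path.join(tmp, "cacert.pem"), "w").close()
    first = where(tmp)
    valid_at_first_call = os.path.exists(first)

# Later call: a perfectly good directory is available, but the cache wins.
permanent_dir = tempfile.mkdtemp()
open(os.path.join(permanent_dir, "cacert.pem"), "w").close()
second = where(permanent_dir)
stale = not os.path.exists(second)
```

The second call returns the path from the first call even though that directory has been deleted, which is the shape of the "invalid path" error quoted earlier.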
|
#10 |
creator of calibre
Posts: 45,345
Karma: 27182818
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
I suggest you simply monkey patch either certifi or requests to use the certs file bundled with calibre, the path to which is simply P('mozilla-ca-certs.pem')
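A rough sketch of what such a monkey patch might look like. Everything here is a stand-in so the example runs outside calibre: `fake_certifi` plays the role of the bundled certifi module, and `calibre_cert_path` is a hypothetical placeholder for calling P('mozilla-ca-certs.pem') inside calibre. Whether patching `where` is enough for a given requests version is an assumption to verify.

```python
import types

# Stand-in for the bundled third-party module (would be `certifi` in the plugin).
fake_certifi = types.ModuleType("fake_certifi")


def _original_where():
    # Mimics a path into a temporary directory that no longer exists.
    return "/nonexistent/plugin_unzip/certifi/cacert.pem"


fake_certifi.where = _original_where


def calibre_cert_path():
    # Hypothetical placeholder: inside calibre this would return
    # P('mozilla-ca-certs.pem'); a fixed string is used for the sketch.
    return "/path/from/calibre/mozilla-ca-certs.pem"


def patch_where(module, replacement):
    """Replace module.where and hand back the original for later restore."""
    original = module.where
    module.where = replacement
    return original


original_where = patch_where(fake_certifi, calibre_cert_path)
patched_result = fake_certifi.where()
```

One thing to watch for: code that ran `from certifi import where` before the patch keeps its reference to the old function, so the patch should be applied before the libraries that use it are imported.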
|
#11 |
Member
Posts: 20
Karma: 10
Join Date: Jun 2017
Device: kindle
|
Patching certifi or requests like that makes sense. Sorry if this is a newbie question, but Googling didn't help... What does P('mozilla-ca-certs.pem') mean? It looks like a function named P but I can't find documentation for it and it is hard to search for.
Also, do you have an answer for my question about load_actual_plugin? The entire definition for it in FanFicFare's __init__.py is simply Code:
def load_actual_plugin(self, gui):
    with self:
        # so the sys.path was modified while loading the
        # plug impl.
        return InterfaceActionBase.load_actual_plugin(self,gui)
|
#12 |
creator of calibre
Posts: 45,345
Karma: 27182818
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
It's a calibre-specific function defined globally. Just use it as is; it will always return the path to the specified file. If you want to see its implementation, look in utils/resources.py in the calibre code.
Yes, if you return from inside with, __exit__ is called before the function returns. As for where to call __enter__: call it in load_actual_plugin, and there's no need to call __exit__, since it will be cleaned up automatically on process exit. However, I recommend monkey-patching over this, since this approach affects global state, and therefore all other code running in the same process. |
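The ordering described here is plain Python behaviour and can be checked with a small sketch. `Tracker` is a hypothetical context manager that only records events; the second function illustrates the "call __enter__ and never __exit__" alternative.

```python
events = []


class Tracker:
    """Hypothetical context manager that records when it is entered/exited."""

    def __enter__(self):
        events.append("enter")
        return self

    def __exit__(self, *exc_info):
        events.append("exit")
        return False


def load_inside_with():
    with Tracker():
        events.append("compute")
        return "result"  # __exit__ runs before the caller receives this


def load_keep_open(tracker):
    tracker.__enter__()  # never exited; cleanup is left to process exit
    events.append("compute")
    return "result"


value = load_inside_with()
events.append("caller uses " + value)
first_run = list(events)

events.clear()
value = load_keep_open(Tracker())
events.append("caller uses " + value)
second_run = list(events)
```

In the first run the exit event is recorded before the caller touches the value; in the second run no exit event is recorded at all, so whatever __enter__ set up stays in place for the rest of the process.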
#13 |
Member
Posts: 20
Karma: 10
Join Date: Jun 2017
Device: kindle
|
Thank you so much for being so responsive to questions from a newbie to the Calibre code! I passed along the information to the actual dev of FanFicFare and they have fixed it in certifi and requests.
|
#14 |
Plugin Developer
Posts: 6,969
Karma: 4604635
Join Date: Dec 2011
Location: Midwest USA
Device: Kobo Clara Colour running KOReader
|
I did not really think about plugins containing binary libraries. If it's reasonably doable while still being cross-platform, I might want to.
Are there any plugins that would be good examples that contain bundled compiled python modules? The Alf plugin is the only one I could find with binary libs but I don't think it's representative. |
#15 |
creator of calibre
Posts: 45,345
Karma: 27182818
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
It's doable, but you have to bundle versions of the libraries for each platform. I don't think anyone other than alf has ever bothered.
|
Tags |
plugin __file__ |
|