09-06-2021, 04:04 AM | #1 |
Member
Posts: 13
Karma: 10
Join Date: Jun 2008
Device: PRS-505
|
Plugins with large, native library dependencies
I'm interested in extending the duplicate-finding plugin to do approximate text/cover matches. I've done both successfully outside of calibre in the past, in Python.
However, doing so brings in substantial library dependencies. For text similarity (in a reasonable timeframe), I need a MinHashLSH implementation. I've previously used datasketch (https://github.com/ekzhu/datasketch), which has a hard dependency on numpy (and I'd really like scipy too, as MinHashLSH can use scipy to speed up its initialization). For image similarity, a DCT-based p-hash works very well. I can use a pure-Python DCT implementation, but scipy provides convenient DCT functions, and I need it anyway for MinHashLSH.
What's the correct way to work on plugins like this? From what I've read, there's no way to specify that a plugin has external dependencies (requirements.txt, etc.). Vendoring (and packaging) versions of all the packages for every platform is prohibitive. Last edited by fake-name; 09-06-2021 at 04:07 AM. |
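For reference, the DCT-based p-hash mentioned in this post can be sketched in pure Python, with no numpy or scipy. This is only an illustration, not the plugin's code: the function names (`dct_1d`, `dct_2d`, `phash`, `hamming`) are made up for the example, and it assumes the cover has already been reduced to a small square grid (say 32x32) of grayscale values by some other means.

```python
import math


def dct_1d(vec):
    """Unnormalised DCT-II of a sequence of numbers (naive O(n^2) version)."""
    n = len(vec)
    return [
        sum(v * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
            for i, v in enumerate(vec))
        for k in range(n)
    ]


def dct_2d(matrix):
    """2-D DCT-II: transform every row, then every column."""
    rows = [dct_1d(row) for row in matrix]
    cols = [dct_1d(col) for col in zip(*rows)]
    return [list(row) for row in zip(*cols)]  # transpose back to row-major


def phash(pixels, hash_size=8):
    """Hash a square grayscale grid into hash_size * hash_size bits."""
    freq = dct_2d(pixels)
    # Keep only the low-frequency corner of the spectrum.
    low = [freq[y][x] for y in range(hash_size) for x in range(hash_size)]
    # Take the median over the AC terms only; the DC term (overall
    # brightness) would otherwise dominate it.
    ac = sorted(low[1:])
    median = ac[len(ac) // 2]
    bits = 0
    for value in low:
        bits = (bits << 1) | int(value > median)
    return bits


def hamming(a, b):
    """Number of differing bits between two hashes."""
    return bin(a ^ b).count("1")
```

Two covers would then be treated as likely duplicates when `hamming(phash(a), phash(b))` falls below some threshold chosen by experiment. scipy's DCT routines would replace `dct_2d` with something much faster, which is the convenience the post is referring to.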
09-06-2021, 07:21 AM | #2 |
creator of calibre
Posts: 43,912
Karma: 22669818
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
There isn't any good way, I'm afraid. I am really not going to get into the business of managing Python package trees on users' computers, which is what would be needed to let plugins specify external dependencies.
You basically will have to bundle the native code versions in your plugin zip file for all three platforms. |
09-07-2021, 01:37 AM | #3 |
Member
Posts: 13
Karma: 10
Join Date: Jun 2008
Device: PRS-505
|
Is there any reason that dependency management can't just be passed off to an existing tool like pip?
It seems like you could just have plugins define a requirements.txt (or something like setup.py) and have pip solve the managing issue. It might be a good idea to require pinning a specific package version, but I don't see a reason it wouldn't work. |
09-07-2021, 02:44 AM | #4 |
creator of calibre
Posts: 43,912
Karma: 22669818
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
Feel free to bundle pip with your plugin and use it to install your dependencies.
Just add the install location you give it to sys.path and you should be fine, assuming of course that incompatibilities between the compiler versions used to build the native dependencies and the Python that calibre uses don't cause issues. In my experience Python package management tools are all very poorly designed and extremely fragile, so I am not going to take on the responsibility for them in calibre. |
09-09-2021, 09:11 AM | #5 |
Ex-Helpdesk Junkie
Posts: 19,421
Karma: 85397180
Join Date: Nov 2012
Location: The Beaten Path, USA, Roundworld, This Side of Infinity
Device: Kindle Touch fw5.3.7 (Wifi only)
|
Recent versions of calibre support (in distro-based system installs using the system Python) a system directory for plugins. You could package your plugin as a .deb, .rpm or .pkg.tar.zst and have it depend on both the distro calibre package and the distro scipy and datasketch Python packages.
I'm afraid I don't have any ideas for non-distro handling... |
09-09-2021, 09:30 AM | #6 |
Evangelist
Posts: 415
Karma: 2666666
Join Date: Nov 2020
Device: none
|
I write the information for each package to a JSON file, then call pip to install them, and finally add the folder to sys.path. Please check out the code at https://github.com/xxyzz/WordDumb/bl...2/unzip.py#L51
FYI, macOS can't load unsigned compiled libraries. |
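The approach described in this post (record the packages, have pip install them into a private folder, then put that folder on sys.path) might look roughly like the sketch below. It is not the WordDumb code: the JSON layout, the function names and the target folder are hypothetical, and it assumes an interpreter with a working pip is available to run the install. `--target`, `--no-deps` and `--only-binary` are existing pip options.

```python
import json
import subprocess
import sys
from pathlib import Path


def load_requirements(json_text):
    """Turn a JSON object {"package": "version"} into pinned requirement strings."""
    return [f"{name}=={version}" for name, version in json.loads(json_text).items()]


def pip_command(python_exe, requirements, target):
    """Build the pip invocation; nothing is executed here."""
    return [
        python_exe, "-m", "pip", "install",
        "--target", str(target),    # install into the plugin's private folder
        "--no-deps",                # every package is listed explicitly in the JSON
        "--only-binary", ":all:",   # wheels only, never compile on the user's machine
        *requirements,
    ]


def ensure_on_path(folder):
    """Put the folder at the front of sys.path exactly once."""
    folder = str(folder)
    if folder not in sys.path:
        sys.path.insert(0, folder)


def install_deps(python_exe, json_text, target):
    """Install the pinned packages into target and make them importable."""
    target = Path(target)
    target.mkdir(parents=True, exist_ok=True)
    command = pip_command(python_exe, load_requirements(json_text), target)
    subprocess.run(command, check=True)
    ensure_on_path(target)
```

Pinning exact versions and passing `--no-deps` keeps the result reproducible, at the cost of having to list every transitive dependency in the JSON file by hand.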
09-09-2021, 09:50 AM | #7 |
Evangelist
Posts: 415
Karma: 2666666
Join Date: Nov 2020
Device: none
|
Hi eschwartz. Could you explain why Arch Linux's calibre includes ".local/lib/python3.9/site-packages" in "sys.path", and places it before "/usr/lib/python3.9/site-packages"? Won't this break dependencies if a user installs an incompatible version of some package in their user directory?
I used to append the deps folder to the end of sys.path, and then an Arch user with an incompatible package installed got an import error. Inserting the deps folder at the start of sys.path fixed the error. |
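The effect described in this post is ordinary import resolution: Python walks sys.path in order and the first match wins, so a copy in the user's site-packages shadows the plugin's bundled copy when the deps folder is appended rather than prepended. A small self-contained demonstration follows, using a made-up module name (`fakedep`) and temporary folders standing in for the user site-packages and the plugin's deps folder.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def make_fake_package(folder, version):
    """Write a one-line module named fakedep into the given folder."""
    Path(folder, "fakedep.py").write_text(f"VERSION = {version!r}\n")


def import_fresh(name):
    """Import a module, ignoring any copy that is already loaded."""
    sys.modules.pop(name, None)
    importlib.invalidate_caches()
    return importlib.import_module(name)


def demo():
    """Return the version seen with deps appended, then with deps prepended."""
    with tempfile.TemporaryDirectory() as user_site, \
            tempfile.TemporaryDirectory() as deps:
        make_fake_package(user_site, "0.1-incompatible")
        make_fake_package(deps, "1.0-bundled")
        saved = list(sys.path)
        try:
            sys.path.insert(0, user_site)   # stands in for ~/.local/.../site-packages
            sys.path.append(deps)           # deps folder last: the user copy wins
            appended = import_fresh("fakedep").VERSION
            sys.path.remove(deps)
            sys.path.insert(0, deps)        # deps folder first: the bundled copy wins
            prepended = import_fresh("fakedep").VERSION
        finally:
            sys.path[:] = saved             # leave the interpreter as we found it
            sys.modules.pop("fakedep", None)
        return appended, prepended
```

Calling `demo()` returns the version string picked up in each configuration, which makes it easy to confirm which copy an import actually resolves to.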
09-09-2021, 02:29 PM | #8 |
Ex-Helpdesk Junkie
Posts: 19,421
Karma: 85397180
Join Date: Nov 2012
Location: The Beaten Path, USA, Roundworld, This Side of Infinity
Device: Kindle Touch fw5.3.7 (Wifi only)
|
That's how python always behaves. It is a system python, how else is it supposed to support your custom scripts in $HOME/bin ??? This is no different from any other python.
This is why virtualenvs are generally recommended if you're going to install incompatible packages, which I assume means "old versions". |
10-10-2021, 03:58 AM | #9 | |
Member
Posts: 13
Karma: 10
Join Date: Jun 2008
Device: PRS-505
|
Quote:
I'm futzing with trying to get pip installed, but the fact that calibre ships an extremely cut-down `site` package is currently breaking it. Also, the fact that building calibre with tweaks in the bypy layer is completely undocumented (it's actually worse: the documentation is /wrong/, not missing) doesn't help. |
|
10-11-2021, 12:22 AM | #10 | |
Evangelist
Posts: 415
Karma: 2666666
Join Date: Nov 2020
Device: none
|
Quote:
|
|
10-11-2021, 03:03 AM | #11 | |
creator of calibre
Posts: 43,912
Karma: 22669818
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
Quote:
|
|
10-15-2021, 11:53 PM | #12 | ||
Member
Posts: 13
Karma: 10
Join Date: Jun 2008
Device: PRS-505
|
Quote:
Quote:
Code:
durr@calibvm ~> calibre-debug -m ensurepip
Inspecting: ensurepip
Traceback (most recent call last):
  File "runpy.py", line 194, in _run_module_as_main
  File "runpy.py", line 87, in _run_code
  File "site.py", line 45, in <module>
  File "site.py", line 41, in main
  File "calibre/debug.py", line 285, in main
  File "calibre/debug.py", line 249, in inspect_mobi
  File "calibre/ebooks/mobi/debug/main.py", line 18, in inspect_mobi
FileNotFoundError: [Errno 2] No such file or directory: 'ensurepip'
durr@calibvm ~ [1]> wget https://bootstrap.pypa.io/get-pip.py
--2021-10-16 03:50:54-- https://bootstrap.pypa.io/get-pip.py
Resolving bootstrap.pypa.io (bootstrap.pypa.io)... 2a04:4e42:600::175, 2a04:4e42:400::175, 2a04:4e42:200::175, ...
Connecting to bootstrap.pypa.io (bootstrap.pypa.io)|2a04:4e42:600::175|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 2158605 (2.1M) [text/x-python]
Saving to: ‘get-pip.py’
get-pip.py 100%[==========================================================================================================================>] 2.06M 11.9MB/s in 0.2s
2021-10-16 03:50:54 (11.9 MB/s) - ‘get-pip.py’ saved [2158605/2158605]
durr@calibvm ~> calibre-debug get-pip.py
Traceback (most recent call last):
  File "runpy.py", line 194, in _run_module_as_main
  File "runpy.py", line 87, in _run_code
  File "site.py", line 45, in <module>
  File "site.py", line 41, in main
  File "calibre/debug.py", line 336, in main
  File "calibre/debug.py", line 243, in run_script
  File "polyglot/builtins.py", line 110, in exec_path
  File "/home/durr/get-pip.py", line 27071, in <module>
    main()
  File "/home/durr/get-pip.py", line 139, in main
    bootstrap(tmpdir=tmpdir)
  File "/home/durr/get-pip.py", line 115, in bootstrap
    monkeypatch_for_cert(tmpdir)
  File "/home/durr/get-pip.py", line 96, in monkeypatch_for_cert
    from pip._internal.commands.install import InstallCommand
  File "zipimport.py", line 259, in load_module
  File "/tmp/tmpab00a5it/pip.zip/pip/_internal/commands/__init__.py", line 9, in <module>
  File "zipimport.py", line 259, in load_module
  File "/tmp/tmpab00a5it/pip.zip/pip/_internal/cli/base_command.py", line 13, in <module>
  File "zipimport.py", line 259, in load_module
  File "/tmp/tmpab00a5it/pip.zip/pip/_internal/cli/cmdoptions.py", line 23, in <module>
  File "zipimport.py", line 259, in load_module
  File "/tmp/tmpab00a5it/pip.zip/pip/_internal/cli/parser.py", line 12, in <module>
  File "zipimport.py", line 259, in load_module
  File "/tmp/tmpab00a5it/pip.zip/pip/_internal/configuration.py", line 26, in <module>
  File "zipimport.py", line 259, in load_module
  File "/tmp/tmpab00a5it/pip.zip/pip/_internal/utils/logging.py", line 13, in <module>
  File "zipimport.py", line 259, in load_module
  File "/tmp/tmpab00a5it/pip.zip/pip/_internal/utils/misc.py", line 40, in <module>
  File "zipimport.py", line 259, in load_module
  File "/tmp/tmpab00a5it/pip.zip/pip/_internal/locations/__init__.py", line 14, in <module>
  File "zipimport.py", line 259, in load_module
  File "/tmp/tmpab00a5it/pip.zip/pip/_internal/locations/_distutils.py", line 10, in <module>
  File "bypy-importer.py", line 203, in exec_module
  File "distutils/command/install.py", line 18, in <module>
ImportError: cannot import name 'USER_BASE' from 'site' (/opt/calibre/lib/calibre-extensions/python-lib.bypy.frozen/site.pyc)
The documentation at https://github.com/kovidgoyal/calibr...ypy/README.rst seems to be out of date, and running the commands as specified does not work. Last edited by fake-name; 10-15-2021 at 11:55 PM. |
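The final ImportError in this post says that calibre's frozen `site` module does not define `USER_BASE`, a name that the standard library's `site` module normally provides and that `distutils/command/install.py` tries to import. A quick way to see which of the usual `site` names a given interpreter lacks is sketched below. The helper names are made up, and the list only covers a handful of names from the standard `site` module; it is not a complete account of what pip needs.

```python
import importlib

# Names the standard library's site module normally defines.
EXPECTED_SITE_NAMES = (
    "USER_BASE",
    "USER_SITE",
    "ENABLE_USER_SITE",
    "getuserbase",
    "getusersitepackages",
    "getsitepackages",
)


def missing_names(module, names=EXPECTED_SITE_NAMES):
    """Return the names from `names` that `module` does not define."""
    return [name for name in names if not hasattr(module, name)]


def check_site():
    """Report which expected names the running interpreter's site module lacks."""
    return missing_names(importlib.import_module("site"))
```

Saved to a file with a final `print(check_site())` and run through calibre-debug the same way get-pip.py was run in the log, this should list whatever the cut-down module is missing; on a regular CPython install the list is expected to be empty.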
||
10-16-2021, 05:35 AM | #13 | |||
creator of calibre
Posts: 43,912
Karma: 22669818
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
Quote:
Quote:
Quote:
|
|||
10-22-2021, 02:20 AM | #14 | |
Member
Posts: 13
Karma: 10
Join Date: Jun 2008
Device: PRS-505
|
If I could get calibre going from source, I'd give it a shot.
Quote:
On a clean Ubuntu 20.04 install (it's the same on 18.04; 16.04 is too old and it explodes for different reasons):
Code:
durr@ubuntu ~> mkdir tmp
durr@ubuntu ~> cd tmp
durr@ubuntu ~/tmp> git clone https://github.com/kovidgoyal/bypy.git
Cloning into 'bypy'...
remote: Enumerating objects: 1831, done.
remote: Counting objects: 100% (268/268), done.
remote: Compressing objects: 100% (178/178), done.
remote: Total 1831 (delta 180), reused 169 (delta 90), pack-reused 1563
Receiving objects: 100% (1831/1831), 559.23 KiB | 2.55 MiB/s, done.
Resolving deltas: 100% (1184/1184), done.
durr@ubuntu ~/tmp> git clone https://github.com/kovidgoyal/calibre.git
Cloning into 'calibre'...
remote: Enumerating objects: 356412, done.
remote: Counting objects: 100% (3306/3306), done.
remote: Compressing objects: 100% (1765/1765), done.
remote: Total 356412 (delta 1918), reused 2273 (delta 1540), pack-reused 353106
Receiving objects: 100% (356412/356412), 270.89 MiB | 16.14 MiB/s, done.
Resolving deltas: 100% (286504/286504), done.
durr@ubuntu ~/tmp> cd calibre
durr@ubuntu ~/t/calibre (master)> ./setup.py bootstrap
/usr/bin/env: ‘python’: No such file or directory
durr@ubuntu ~/t/calibre (master) [127]>
Code:
durr@ubuntu ~/t/calibre (master) [127]> python3 setup.py bootstrap
Cloning into 'translations'...
remote: Enumerating objects: 173431, done.
remote: Counting objects: 100% (5704/5704), done.
remote: Compressing objects: 100% (1493/1493), done.
remote: Total 173431 (delta 5316), reused 4447 (delta 4211), pack-reused 167727
Receiving objects: 100% (173431/173431), 835.29 MiB | 18.02 MiB/s, done.
Resolving deltas: 100% (169792/169792), done.
Updating files: 100% (4098/4098), done.
*
* Running build
*
Traceback (most recent call last):
  File "setup.py", line 119, in <module>
    sys.exit(main())
  File "setup.py", line 104, in main
    command.run_all(opts)
  File "/home/durr/tmp/calibre/setup/__init__.py", line 233, in run_all
    self.run_cmd(self, opts)
  File "/home/durr/tmp/calibre/setup/__init__.py", line 223, in run_cmd
    self.run_cmd(scmd, opts)
  File "/home/durr/tmp/calibre/setup/__init__.py", line 227, in run_cmd
    cmd.run(opts)
  File "/home/durr/tmp/calibre/setup/build.py", line 296, in run
    self.env = init_env(debug=opts.debug)
  File "/home/durr/tmp/calibre/setup/build.py", line 171, in init_env
    from setup.build_environment import win_ld, is64bit, win_inc, win_lib, NMAKE, win_cc
  File "/home/durr/tmp/calibre/setup/build_environment.py", line 108, in <module>
    qraw = subprocess.check_output([QMAKE, '-query']).decode('utf-8')
  File "/usr/lib/python3.8/subprocess.py", line 415, in check_output
    return run(*popenargs, stdout=PIPE, timeout=timeout, check=True,
  File "/usr/lib/python3.8/subprocess.py", line 493, in run
    with Popen(*popenargs, **kwargs) as process:
  File "/usr/lib/python3.8/subprocess.py", line 858, in __init__
    self._execute_child(args, executable, preexec_fn, close_fds,
  File "/usr/lib/python3.8/subprocess.py", line 1704, in _execute_child
    raise child_exception_type(errno_num, err_msg, err_filename)
FileNotFoundError: [Errno 2] No such file or directory: 'qmake'
In any event, following the exact steps as documented doesn't work. It's either out of date, or was never correct in the first place. It's possible you don't need to run the "bootstrap" step first, but in that case I don't know what "bootstrap" is supposed to mean; it generally indicates the first thing that needs to be done. |
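Both failures in the logs above are the same kind of problem: an executable the build expects (`python` in the first run, `qmake` in the second) is not on PATH. A small preflight check along the following lines would surface that before the build starts. This is only a sketch: the helper names are hypothetical, and the default tool list contains just the two names that appear in the errors above, not calibre's full build requirements.

```python
import shutil


def missing_tools(tools):
    """Return the tools that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


def preflight(tools=("python", "qmake")):
    """Summarise which required executables are unavailable."""
    missing = missing_tools(tools)
    if missing:
        return "Missing from PATH: " + ", ".join(missing)
    return "All required tools found"
```

Running `print(preflight())` on the machine from the logs would be expected to name both tools, since neither `python` nor `qmake` could be executed there.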
|
10-22-2021, 02:54 AM | #15 |
creator of calibre
Posts: 43,912
Karma: 22669818
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
Read the error message: you are missing qmake.
FileNotFoundError: [Errno 2] No such file or directory: 'qmake'
Those instructions are not meant for spoon-feeding. |
|