The problem is that the BeautifulSoup 4 module is now essentially a Python package, i.e. a folder with an "__init__.py" file in it. The files within make certain assumptions about the environment, in this case specifically that "bs4" is directly importable as a top-level name, i.e. that the directory containing the package is on the Python path.
For example, an import like this would fail:
Code:
import packages.bs4
While I could try to modify the code to use relative imports, I doubt this is the only time I will run into the problem, and patching third-party code strikes me as a bit unclean. So I decided to modify the Python path by inserting values into "sys.path" at runtime, which also allows for nicer code in the project's "adapter" modules, which take care of the site-specific scraping.
I read up a bit and saw that Calibre's Plugin class implements the "__enter__" and "__exit__" magic methods, where Python path modifications are made and removed respectively. If the module is "zipsafe", the path is simply modified to point directly to the *.zip plugin file. If not, the plugin is unpacked into a temporary directory and that directory is added to the Python path. Using the "sys_insertion_path" is unfortunately not an option, since the module that contains the Plugin subclass imports modules that already expect the Python path to be modified by that point, and they raise ImportErrors if it is not. So I make this modification myself in the InterfaceActionBase subclass:
Code:
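# Requires "import os" and "import sys" at the top of the module.
def initialize(self):
    # Make the bundled packages importable by their top-level names.
    packages_path = os.path.join(self.plugin_path, 'packages')
    sys.path.insert(0, packages_path)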
This allows the plugin to work both via Calibre's command-line interface and via the GUI, since the InterfaceActionBase subclass is the main entry point of the Calibre plugin. My concern now is that, because I am modifying the path directly and bundling packages that Calibre also uses in its own code, Calibre might accidentally import my versions of those packages instead of its own. I would really prefer to use the "sys_insertion_path" attribute, since it already takes care of an important bit of logic (zip file versus unpacked directory), but I'm not sure how that would be possible.
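To make the shadowing concern concrete, here is a minimal sketch using only the standard library. It assumes plain path-based importing; Calibre's own plugin loader may behave differently, so treat it as an illustration of the general Python behaviour rather than of Calibre itself. The module name "shadow_demo" and both directories are made up for the demonstration.

```python
import importlib
import os
import sys
import tempfile


def write_module(directory, name, version):
    """Create a one-line module <name>.py in `directory` exposing VERSION."""
    with open(os.path.join(directory, name + '.py'), 'w') as f:
        f.write('VERSION = %r\n' % version)


# Two directories stand in for "the host's copy" and "the plugin's copy"
# of the same package. Both names are hypothetical.
host_dir = tempfile.mkdtemp()
plugin_dir = tempfile.mkdtemp()
write_module(host_dir, 'shadow_demo', 'host')
write_module(plugin_dir, 'shadow_demo', 'plugin')

sys.path.append(host_dir)       # already on the path, like the host's packages
sys.path.insert(0, plugin_dir)  # same kind of change as in initialize()

# Make sure the import system notices the freshly written files.
importlib.invalidate_caches()
shadow_demo = importlib.import_module('shadow_demo')

# __file__ tells us which copy actually got loaded.
loaded_from = os.path.dirname(os.path.abspath(shadow_demo.__file__))
```

Because "sys.path" is searched in order, the copy in the directory inserted at index 0 is the one that gets imported, and every later "import shadow_demo" anywhere in the process receives that same module object from "sys.modules". Checking "__file__" like this is a cheap way to see which copy is in use. The reverse also holds: if the host had already imported its own copy before the path change, the plugin would get the host's version.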
Basically, I am looking for advice on how best to include third-party Python modules and packages and make them directly importable in subsequently called code, without having to resort to horrible Python path hacks that might change Calibre's internal state unexpectedly.
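One option I can think of is to keep the path change scoped, in the same spirit as the "__enter__"/"__exit__" approach described above. The following is only a rough sketch with the standard library. The helper name "temporary_sys_path" is my own and not part of Calibre, and I am assuming the imports that need the extra path can all happen inside the "with" block.

```python
import contextlib
import sys


@contextlib.contextmanager
def temporary_sys_path(path):
    """Prepend `path` to sys.path for the duration of the block, then remove it."""
    sys.path.insert(0, path)
    try:
        yield
    finally:
        # Remove the first matching entry, which is normally the one added
        # above; tolerate it having been removed by someone else already.
        try:
            sys.path.remove(path)
        except ValueError:
            pass
```

It would be used as "with temporary_sys_path(packages_path): import bs4". Note that this only limits how long the extra entry stays on the path. Anything imported inside the block remains in "sys.modules" afterwards, so it does not by itself stop other code from picking up the bundled version of a package.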