MobileRead Forums - View Single Post - I need to know the value of worker_limit from within a metadata source plugin

lrpirlet · 04-23-2023, 04:00 PM

While fetching metadata from Babelio, I sometimes trigger their denial-of-service (DoS) detection, which ends up banning my IP address for a few days...

I find it very annoying.

I have come up with various schemes, and I think I have found one that spreads the requests out in time so that no DoS detection is triggered... provided that those requests are made from the same thread.

When I try to get metadata for several books with the same authors and part of the title, I may get many workers running in parallel... all started at the same time, which sends many batches of simultaneous requests to Babelio.

A solution would be to put, at the beginning of worker.py, a time.sleep() proportional to the worker number and the worker_limit (set at calibre's top level in Preferences - Miscellaneous).

Code:

(worker number)%(worker_limit/2)
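A minimal sketch of the staggered start described above, assuming the worker number and worker_limit are already known to the plugin (how to obtain them is exactly the open question). The names stagger_delay, staggered_start and step_seconds are hypothetical and not part of calibre.

```python
import time


def stagger_delay(worker_number, worker_limit, step_seconds=1.0):
    """Return a start-up delay in seconds for one worker.

    The number of slots is worker_limit // 2 (at least 1), and each worker
    is mapped to the slot worker_number % slots, as in the formula above.
    step_seconds is a hypothetical spacing between two slots.
    """
    slots = max(1, worker_limit // 2)  # integer division, avoid modulo by zero
    return (worker_number % slots) * step_seconds


def staggered_start(worker_number, worker_limit, step_seconds=1.0, sleep=time.sleep):
    """Sleep for this worker's delay before its first request, then return the delay."""
    delay = stagger_delay(worker_number, worker_limit, step_seconds)
    sleep(delay)
    return delay
```

With worker_limit set to 6 there are 3 slots, so workers 0, 3, 6... start immediately, workers 1, 4, 7... wait one step, and so on. Workers that share a slot still start together, so this only thins out the burst rather than fully serializing the requests.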

I think the concept works: I used cpu_count instead of worker_limit/2, and it works if worker_limit is >= cpu_count...
I think I would use num, as computed in ui.py in

Code:

def create_spare_pool(self, *args):
    if self._spare_pool is None:
        num = min(detect_ncpus(), config['worker_limit']//2)
        self._spare_pool = Pool(max_workers=num, name='GUIPool')
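For illustration only, the same calculation can be written as a standalone function. This is a sketch under assumptions: os.cpu_count() stands in for calibre's detect_ncpus(), worker_limit is passed in as a plain integer instead of being read from the config, and the floor of 1 is an addition that the quoted snippet does not have. The name effective_workers is hypothetical.

```python
import os


def effective_workers(worker_limit, ncpus=None):
    """Mirror num = min(detect_ncpus(), worker_limit // 2) outside calibre.

    ncpus defaults to os.cpu_count(), used here as a stand-in for
    calibre's detect_ncpus(). The result is clamped to at least 1
    (an assumption added for safety, not in the quoted code).
    """
    if ncpus is None:
        ncpus = os.cpu_count() or 1  # os.cpu_count() may return None
    return max(1, min(ncpus, worker_limit // 2))
```

For example, with worker_limit set to 6 on an 8-core machine this gives 3, while with worker_limit set to 16 on a 4-core machine the CPU count wins and it gives 4.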

Of course, access to max_workers would be even better...

How do I proceed?

I fear I will be forced to read the config... BTW, if that is the only solution, where is it logically located from inside calibre?

Thanks in advance.
