Old 04-23-2023, 04:00 PM   #1
lrpirlet
Connoisseur
lrpirlet began at the beginning.
 
Posts: 93
Karma: 40
Join Date: Mar 2020
Location: Belgium (sorry, I am from the Walloon side of the country and I speak french only)
Device: PW3, Kobo Libra H2O
I need to know the value of worker_limit from within a metadata source plugin

While getting metadata from Babelio, I sometimes hit their denial-of-service (DoS) detection, which ends up banning my IP address for a few days.

I find it very annoying.

I have come up with various schemes, and I think I found a way to keep the requests spread out in time so that no DoS detection is triggered, provided that those requests come from the same thread.

When I try to get metadata for several books with the same authors and part of the title, I may get many workers running in parallel, all started at the same time, which leads to many batches of simultaneous requests being sent to Babelio.

The solution would be to add, at the beginning of worker.py, a time.sleep() proportional to the worker number and the worker_limit (set at calibre's top level in Preferences - Miscellaneous).
Code:
(worker number)%(worker_limit/2)
I think the concept works: I used cpu_count instead of worker_limit/2, and it works if worker_limit is >= cpu_count.
I think I would use num, as computed in ui.py:
Code:
def create_spare_pool(self, *args):
    if self._spare_pool is None:
        num = min(detect_ncpus(), config['worker_limit']//2)
        self._spare_pool = Pool(max_workers=num, name='GUIPool')
Of course, direct access to max_workers would be even better.
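
To make the idea concrete, here is a minimal sketch of the staggered start, assuming the plugin somehow knows its worker number (that is the open question of this post). The names pool_size, stagger_delay and step_seconds are hypothetical, and os.cpu_count() only stands in for calibre's detect_ncpus(). It uses integer division like the ui.py snippet, and it is not calibre code.

```python
import os


def pool_size(worker_limit, ncpus=None):
    # Mirrors the ui.py expression min(detect_ncpus(), worker_limit // 2).
    # os.cpu_count() is an assumption standing in for detect_ncpus().
    if ncpus is None:
        ncpus = os.cpu_count() or 1
    # Never return 0, otherwise the modulo below would fail.
    return max(1, min(ncpus, worker_limit // 2))


def stagger_delay(worker_number, worker_limit, step_seconds=1.0, ncpus=None):
    # Workers that share a slot (same remainder) get the same delay;
    # workers in different slots start step_seconds apart.
    return (worker_number % pool_size(worker_limit, ncpus)) * step_seconds
```

A worker would then call time.sleep(stagger_delay(n, worker_limit)) before its first request. This only spreads out the first request of each worker; it does nothing for later requests, and it does not coordinate separate processes.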

How do I proceed?

I fear I will be forced to read the config myself. By the way, if that is the only solution, where can it be accessed from inside calibre?

Thanks in advance.
Old 04-23-2023, 10:31 PM   #2
kovidgoyal
creator of calibre
kovidgoyal ought to be getting tired of karma fortunes by now.
 
Posts: 44,000
Karma: 22669822
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
Use rate_limit() from sources/search_engines.py
Old 04-24-2023, 05:57 AM   #3
lrpirlet
Quote:
Originally Posted by kovidgoyal View Post
Use rate_limit() from sources/search_engines.py
Thanks a lot. This looks even better than what I asked.

If I understand correctly after a quick look, I can enforce a minimum time between visits for each and every Babelio access, whether it comes from the same process or not, and on any OS. Superb.

Now it is time to implement and carefully test (to avoid getting banned).
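
For readers who want to see the general idea, here is a minimal, self-contained sketch of a cross-process "time between visits" limiter. It is not calibre's rate_limit(); the names simple_rate_limit, state_dir and poll are hypothetical. It assumes that a lock file created atomically with O_EXCL is good enough to serialize processes, and it stores the time of the last visit in a small stamp file next to it.

```python
import os
import tempfile
import time
from contextlib import contextmanager


@contextmanager
def simple_rate_limit(name, time_between_visits=2.0, state_dir=None, poll=0.05):
    # All paths and names here are illustrative assumptions.
    state_dir = state_dir or tempfile.gettempdir()
    lock_path = os.path.join(state_dir, name + '.lock')
    stamp_path = os.path.join(state_dir, name + '.stamp')
    # O_CREAT | O_EXCL makes creation atomic: only one process can succeed,
    # the others poll until the lock file has been removed.
    while True:
        try:
            fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
            break
        except FileExistsError:
            time.sleep(poll)
    try:
        try:
            with open(stamp_path) as f:
                last_visit = float(f.read().strip())
        except (OSError, ValueError):
            last_visit = 0.0  # no previous visit recorded
        # Wall-clock time, because the stamp is shared between processes.
        delta = time.time() - last_visit
        if delta < time_between_visits:
            time.sleep(time_between_visits - delta)
        try:
            yield
        finally:
            # Record when this visit ended, for the next caller.
            with open(stamp_path, 'w') as f:
                f.write(repr(time.time()))
    finally:
        os.close(fd)
        os.remove(lock_path)
```

Each request to the site would be wrapped in "with simple_rate_limit('babelio'):". In this sketch, a lock file left behind by a crashed process would block every later caller, so it is only an illustration. Inside a plugin, the helper Kovid points to in sources/search_engines.py is the one to use; check its actual parameters in the calibre source before relying on them.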