#1
Calibre Plugins Developer
Posts: 4,782
Karma: 2209206
Join Date: Oct 2010
Location: Australia
Device: Kindle Oasis
Battling with jobs and ebooks\oeb\iterator
Hi - the Count Pages plugin has gone into a meltdown ever since I foolishly thought I could trivially split its execution into multiple jobs instead of one big job.
The point of running them in smaller jobs was that if a user started counting pages on 1000 books and then realised they had to turn their computer off, then with a default batch size of 50 they would at least have got "some" of their books updated by that point, rather than losing all the results so far.
The problem I am seeing in hindsight is that Calibre runs several of those jobs in parallel. It all starts off fine, until the jobs start spitting out errors like this:
Code:
Traceback (most recent call last):
File "calibre_plugins.count_pages.jobs", line 191, in do_statistics_for_book
File "calibre_plugins.count_pages.statistics", line 72, in get_page_count
File "calibre_plugins.count_pages.statistics", line 114, in _open_epub_file
File "calibre\ebooks\oeb\iterator\book.py", line 147, in __enter__
File "calibre\ebooks\oeb\iterator\book.py", line 87, in run_extract_book
File "calibre\utils\ipc\simple_worker.py", line 251, in fork_job
File "calibre\utils\ipc\simple_worker.py", line 176, in run_job
File "calibre\utils\ipc\simple_worker.py", line 119, in communicate
calibre.utils.ipc.simple_worker.WorkerError: Worker failed
Code:
server = Server(pool_size=cpus)
Does Kovid, or anyone else who has used the jobs stuff, have any suggestions? The code is right there if you want to experiment with the latest release. If I get no joy with it I will just revert the feature. I am so out of touch with the calibre code, with no time to learn it properly; this code was most definitely copied from somewhere in calibre about 15 years ago!
#2
creator of calibre
Posts: 46,158
Karma: 29626604
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
Definitely looks like resource exhaustion. Presumably you are forking workers inside each worker, and since there are themselves N copies of those workers running in parallel, you end up with N * N processes. Many OSes have ridiculously low default limits on things like the number of open file handles. I suggest you run one job per worker and let the jobs system handle queueing the work for you.
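As a rough illustration of "one job per worker, let the queue do the rest", here is a minimal sketch using only the Python standard library. It is not calibre's jobs API: `ThreadPoolExecutor` stands in for the job server, and `make_batches`, `count_pages_for_batch` and `run_all` are hypothetical names invented for this example. The point is only that there is a single pool of N workers and each batch does its work in place, without starting a pool of its own.
Code:
```python
import os
from concurrent.futures import ThreadPoolExecutor


def make_batches(book_ids, batch_size=50):
    """Split the ids into consecutive batches of at most batch_size."""
    return [book_ids[i:i + batch_size]
            for i in range(0, len(book_ids), batch_size)]


def count_pages_for_batch(batch):
    # Hypothetical placeholder for the real per-book work. It must NOT
    # create another pool here, or you are back to N * N workers.
    return {book_id: None for book_id in batch}


def run_all(book_ids, batch_size=50, max_workers=None):
    # One pool only: the executor queues the batches and hands each
    # one to whichever worker becomes free next.
    max_workers = max_workers or os.cpu_count() or 1
    results = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        batches = make_batches(book_ids, batch_size)
        for batch_result in pool.map(count_pages_for_batch, batches):
            # Results arrive batch by batch, so partial progress can be
            # saved as each batch completes.
            results.update(batch_result)
    return results
```
With 1000 books and the default batch size of 50 this queues 20 batches, but never runs more than `max_workers` of them at once.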
#3
Calibre Plugins Developer
Posts: 4,782
Karma: 2209206
Join Date: Oct 2010
Location: Australia
Device: Kindle Oasis
Thanks Kovid!
Actually @sgmoore on the Count Pages thread has put me onto the root cause, which AI couldn't figure out (and which I was too lazy, with limited time last night, to look for myself). The code was only creating one temporary directory for all the books, and that directory was getting removed when the first job completed. So the iterator is actually failing because the folder no longer exists: it has been deleted, rather than resource handles having run out... One of those slap-forehead moments. Maybe Claude Code could have figured that out, but the "freebie" AI agents continue to be underwhelming.
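For anyone who hits the same thing, one way to avoid the shared-folder problem is to give every job its own temporary directory and have that job clean up only what it created. This is a minimal stdlib sketch under that assumption, not the actual Count Pages fix; `run_batch` and the scratch file names are hypothetical.
Code:
```python
import os
import shutil
import tempfile


def run_batch(batch_number, book_ids):
    # Each job creates its own directory, so a job that finishes early
    # cannot delete files that another job is still reading.
    job_dir = tempfile.mkdtemp(prefix=f"count_pages_{batch_number}_")
    try:
        for book_id in book_ids:
            # Stand-in for extracting the book into the working folder.
            work_path = os.path.join(job_dir, f"{book_id}.txt")
            with open(work_path, "w") as f:
                f.write("scratch data")
        return len(os.listdir(job_dir))
    finally:
        # Remove only this job's directory, even if the work failed.
        shutil.rmtree(job_dir, ignore_errors=True)
```
Because the directory name comes from `tempfile.mkdtemp`, two batches running at the same time never share a path.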