Looking at all that, I suspect that reducing the number of threads in the pool and increasing the timeout would do it. But the documentation shows that the default number of threads changed in Python 3.8 to "min(32, os.cpu_count() + 4)". And I don't know why @jgoguen chose 10 seconds. An actual runtime of 10 seconds for any one of these threads would mean a large file was being processed, but, as you say, a book with a lot of files could see some of them time out.
Looking at the code, I agree that the loop over the futures should be inside the with. But I'm not sure how much that would help.
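For reference, here's a minimal sketch of the shape being discussed: fewer workers, a longer timeout, and the futures loop inside the `with` block. The `process_file` function and `file_list` are hypothetical stand-ins for the plugin's real worker and inputs, and the specific numbers are illustrative only.

```python
import concurrent.futures

# Hypothetical stand-ins for the plugin's actual worker and inputs.
def process_file(name):
    return name.upper()

file_list = ["a.xhtml", "b.xhtml", "c.xhtml"]

results = []
# Reduced worker count and a longer timeout than the current 10 seconds.
with concurrent.futures.ThreadPoolExecutor(max_workers=4) as executor:
    futures = {executor.submit(process_file, f): f for f in file_list}
    # Collect results inside the `with` so the timeout is applied while
    # the executor is still live.
    for future in concurrent.futures.as_completed(futures, timeout=60):
        results.append(future.result())
```

One reason the loop's placement matters: exiting the `with` block calls `shutdown(wait=True)`, which blocks until every future has finished, so an `as_completed` timeout applied after the block has effectively already been absorbed by the shutdown wait.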
And I agree that the sample you have shown would be better. I'll have a look at it. I need to get the code in the beta out of the way first. Not that the two interfere, it's just that I haven't gotten around to committing it and raising the pull request.