A single metadata download has to happen in two stages: the first stage is identify, and the second stage downloads the covers based on the identify results. On my rather slow internet connection, neither stage takes more than four seconds. I don't think making it background-able is useful, given that you would have to switch contexts every few seconds. You will find that the cognitive overhead swamps any efficiency gains.
As for bulk download followed by review, I can see how that might be useful, but it isn't worth the effort, to me at least. Patches are welcome.