@Kovid: this library pushes at least three boundaries. The first is the number of books: 27,100. That isn't excessive in itself, but it exacerbates the second and third problems. The second is that it contains composite columns that use other composite columns, perhaps 3 or 4 levels deep. The third is that it uses the virtual_libraries() function and contains 40 virtual libraries, which overflows the search LRU cache.
After some analysis I found that three changes together let the library open in 60 seconds on my machine: using get_proxy_metadata() instead of get_metadata() in categories.py (line 143), raising the search LRU cache size to 100 (it could be less, but must be more than 40), and streamlining the search cache path in search._do_search. Without those changes the library still hadn't opened after 5 minutes, at which point I ran out of patience. All of the changes were necessary to make the performance acceptable.
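To illustrate why the cache size matters so much here, below is a toy simulation, not calibre's actual code. The LRUCache class, the simulate() helper, the "vl:N" keys and the limit of 30 are all made up for the example; the only numbers taken from this library are the 40 virtual libraries and the proposed limit of 100. It assumes each virtual library triggers one cached search and that the libraries are visited in the same order on every pass.

```python
from collections import OrderedDict


class LRUCache:
    """Toy LRU cache for illustration only; not calibre's implementation."""

    def __init__(self, limit):
        self.limit = limit
        self.items = OrderedDict()
        self.hits = 0
        self.misses = 0

    def get_or_compute(self, key, compute):
        if key in self.items:
            # Mark the entry as most recently used.
            self.items.move_to_end(key)
            self.hits += 1
            return self.items[key]
        self.misses += 1
        value = self.items[key] = compute(key)
        if len(self.items) > self.limit:
            # Evict the least recently used entry.
            self.items.popitem(last=False)
        return value


def simulate(limit, num_libraries=40, passes=5):
    """Visit every (hypothetical) virtual library search in order, several times."""
    cache = LRUCache(limit)
    for _ in range(passes):
        for n in range(num_libraries):
            # key.upper() stands in for an expensive search over all books.
            cache.get_or_compute(f'vl:{n}', lambda key: key.upper())
    return cache.hits, cache.misses


if __name__ == '__main__':
    print(simulate(30))   # (0, 200): every lookup is a miss
    print(simulate(100))  # (160, 40): only the first pass misses
```

With a limit below 40, cycling through the searches evicts each entry just before it is needed again, so the cache never hits and every search is recomputed over all 27,100 books. Once the limit is above 40, only the first pass pays for the searches.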
I have pushed a branch "perf_test" with my changes and opened a pull request. It isn't a "real" pull request; it is only there to show what I did. I will abandon that branch once you decide what to do, if anything.