|
|
#1 |
|
Connoisseur
Posts: 73
Karma: 10
Join Date: Jun 2020
Device: Kobo Aura HD
|
Another directory structure ....
Good evening. When Calibre has to manage a large number of books (several thousand or more), the library's root directory ends up containing a very large number of subdirectories, which slows down the operating system.
What do you think of a slightly different organization with one entry per ID, grouped at, for example, 1000 IDs per directory (1-999, then 1000-1999, etc.)? This structure makes additions much easier, since there is no need to check the entire library, and it significantly speeds up OS access. Given that I use a read-only copy of the database via my own scripts, is it possible to update the metadata with this structure? For future updates, I would only need to add the current directory, the one holding the largest IDs. What do you think?
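To make the proposal concrete, here is a minimal sketch of the ID-to-directory mapping described above. The function names and the choice to name each bucket after the lower bound of its range are assumptions for illustration only; nothing here is part of calibre.

```python
from pathlib import PurePosixPath

BUCKET_SIZE = 1000  # IDs per top-level directory, as suggested in the post


def bucket_for_id(book_id: int, bucket_size: int = BUCKET_SIZE) -> str:
    """Return the lower bound of the ID range that contains book_id.

    IDs 1-999 map to "0", IDs 1000-1999 map to "1000", and so on.
    Book IDs are assumed to be positive integers.
    """
    if book_id < 1:
        raise ValueError("book IDs are assumed to be positive integers")
    return str((book_id // bucket_size) * bucket_size)


def bucketed_path(root: str, book_id: int, relative_book_dir: str) -> PurePosixPath:
    """Insert the bucket directory between the library root and the book folder."""
    return PurePosixPath(root) / bucket_for_id(book_id) / relative_book_dir
```

With this mapping, a sync script would only need to look at the bucket holding the highest IDs to pick up newly added books.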
|
|
|
|
|
#2 |
|
Bibliophagist
Posts: 50,734
Karma: 178402706
Join Date: Jul 2010
Location: Vancouver
Device: Kobo Sage, Libra Colour, Lenovo M8 FHD, Paperwhite 4, Tolino epos
|
Interesting. I did some testing on my desktop and did not see a noticeable slowdown until I went above 80,000 subdirectories. I tested on Windows 11 OS Build 26220.7934 (NTFS), OpenSUSE Tumbleweed (btrfs), Ubuntu 25.10 (ext4) and Arch (ZFS, mounted using ZFSBootMenu). For comparison, my main calibre library, which currently has ~17,210 books, contains 51,624 files and 21,514 folders (I do not make heavy use of data folders, otherwise those numbers would be higher).
Note that all those tests were done on virtual machines, just to keep the playing field more or less level. What you want would require a major rewrite of calibre and would make new libraries incompatible with other versions of calibre.
Last edited by DNSB; 03-02-2026 at 04:37 PM.
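For anyone who wants to repeat this kind of measurement, a rough sketch follows. It is not the test procedure used above; the function name is hypothetical, and because the listing runs right after the directories are created, it mostly measures a warm cache rather than a cold NAS.

```python
import os
import tempfile
import time


def time_directory_listing(num_subdirs: int) -> float:
    """Create num_subdirs empty directories, then time one full listing.

    Returns the elapsed listing time in seconds. Everything is created
    inside a temporary directory that is removed afterwards.
    """
    with tempfile.TemporaryDirectory() as root:
        for i in range(num_subdirs):
            os.mkdir(os.path.join(root, f"dir_{i:07d}"))
        start = time.perf_counter()
        with os.scandir(root) as entries:
            count = sum(1 for entry in entries if entry.is_dir())
        elapsed = time.perf_counter() - start
    # Sanity check: every directory we created was seen by the listing.
    assert count == num_subdirs
    return elapsed
```

Calling `time_directory_listing(n)` for increasing `n` (for example 1,000, 10,000 and 80,000) gives a first impression of how listing time scales on a given filesystem.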
|
|
|
|
|
#3 |
|
Connoisseur
Posts: 73
Karma: 10
Join Date: Jun 2020
Device: Kobo Aura HD
|
I have more books than you, and I have two issues:
- Updating new books between my Windows source and my NAS backup.
- OS search is very slow.
With this structure I can easily find and update new IDs, and re-sync older entries whose timestamp has changed. I will test it and change the metadata for a read-only backup, but I am also thinking of a trigger on each new update or change.
I don't know why nobody has issues with very large libraries. Imagine:
- With 30,000 books you have only 30 directories at the first level.
- You can place your old books on a read-only tablespace/filesystem.
- Fast synchronisation, and no more chmod/chown on a lot of files.
- No more issues with inode counts.
And with a VM you often have a large memory/cache. But on a simple NAS it will be very slow, so if I cannot change my old stuff ...
And you already have the ID in the directory name (the number between parentheses), so it will be very easy to change/update/add. For better sorting, I will pad the ID with leading '0's:
root_biblio/0002000/same_directories_than_before_for_id_1001_to_2000
Last edited by Doum; 03-02-2026 at 05:40 PM.
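The scheme in this post can be sketched as follows: read the ID from the trailing "(number)" of a book folder, then map it to a zero-padded bucket named after the upper bound of its range, so that IDs 1001 to 2000 land in 0002000. The helper names, the 7-digit width and the `last_synced_id` parameter are assumptions made for this illustration.

```python
import re

# A book folder ends with its ID in parentheses, e.g. "Robot dreams (1234)".
ID_SUFFIX = re.compile(r"\((\d+)\)$")


def book_id_from_dirname(dirname: str) -> int:
    """Extract the numeric ID from the trailing "(id)" of a book folder name."""
    match = ID_SUFFIX.search(dirname.rstrip("/"))
    if match is None:
        raise ValueError(f"no trailing (id) in {dirname!r}")
    return int(match.group(1))


def padded_bucket(book_id: int, bucket_size: int = 1000, width: int = 7) -> str:
    """Return the zero-padded upper bound of the bucket containing book_id."""
    upper = ((book_id + bucket_size - 1) // bucket_size) * bucket_size
    return f"{upper:0{width}d}"


def buckets_with_new_ids(dirnames, last_synced_id: int):
    """List the buckets that hold at least one ID newer than last_synced_id."""
    new_ids = (book_id_from_dirname(d) for d in dirnames)
    return sorted({padded_bucket(i) for i in new_ids if i > last_synced_id})
```

Because the bucket names are zero-padded, a plain alphabetical sort also orders them numerically, which is the sorting benefit mentioned above.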
|
|
|
|
|
#4 |
|
Bibliophagist
Posts: 50,734
Karma: 178402706
Join Date: Jul 2010
Location: Vancouver
Device: Kobo Sage, Libra Colour, Lenovo M8 FHD, Paperwhite 4, Tolino epos
|
And how many subdirectories do you have under that first-level directory? In most of my testing, building the directory tree slowed down regardless of the subdirectory nesting level, since the entire tree was scanned.
As for OS search? I seldom use that with calibre since, in general, I've found that using the OS to access/modify a calibre library is a bad idea. I find searching from within calibre to be more flexible and faster. YMMV.
Yes, I have used a structure with directories labelled 00-FF, nested three levels deep, with the full path and filename stored in a database, back when I was still employed (one of my managers liked trying multiple approaches). My test build with all 16,843,008 directories (256 + 65,536 + 16,777,216) created was extremely slow to access from the OS but fast within the application. OTOH, my Kobo ereaders store images in a two-level structure numbered from 000 to 255 at each level, though that path is, I think, derived from a hashing algorithm rather than from names stored in a database.
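The directory count quoted above can be checked with a few lines, and the same sketch shows one generic way to derive a two-level 000-255 path from a hash. The use of SHA-256 is an assumption chosen for illustration; it is not claimed to be the algorithm Kobo firmware uses, and the function names are hypothetical.

```python
import hashlib


def total_directories(fanout: int = 256, levels: int = 3) -> int:
    """Count every directory in a tree with `fanout` children per level.

    For 256 children and 3 levels: 256 + 256**2 + 256**3.
    """
    return sum(fanout ** level for level in range(1, levels + 1))


def sharded_path(key: str, levels: int = 2) -> str:
    """Derive a path such as "017/203" from the first bytes of a hash of key.

    Each byte is 0-255, so each level has at most 256 directories.
    """
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return "/".join(f"{byte:03d}" for byte in digest[:levels])
```

A hash-derived path needs no lookup table, since the location can be recomputed from the key, whereas the 00-FF layout described above relied on the database to store each full path.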
|
|
|
|
|
#5 |
|
Chalut o/
Posts: 663
Karma: 720026
Join Date: Dec 2017
Device: Kobo
|
If we needed to add an additional level of subdirectories (which is very unlikely to happen), I don't think using IDs is a good idea, because it would needlessly scatter authors and make the library difficult to navigate.
Example:
root_biblio/Isaac Asimov/Fondation (987)/
root_biblio/Isaac Asimov/Robot dreams (1234)/
The two books would be moved, respectively, to:
root_biblio/000000/Isaac Asimov/Fondation (987)/
root_biblio/001000/Isaac Asimov/Robot dreams (1234)/
To browse your library, you would have to open each top-level folder to see whether it contains the author you are looking for, and when you look in an author folder, you may not find the book you want because it sits under another top-level subdirectory. Nah.
Better to split on the first letter of the author value, which would result in:
root_biblio/I/Isaac Asimov/Fondation (987)/
root_biblio/I/Isaac Asimov/Robot dreams (1234)/
This is much more intuitive to navigate. Yes, it limits you to ~24 top-level folders, but that should simplify the directory structure enough. And at worst, we could use the first two letters instead of just one to create the top-level folders.
@DNSB, honestly, I think Calibre is resilient and well built enough that it could handle a change of directory structure format. It wouldn't even be a "major" change: I think only one function would need to be edited, and it would go smoothly. Calibre would keep the old structure for old books and update it gradually as a book's metadata is modified, without breaking anything (almost).
The real problem would be restoring the library in the event of DB corruption. That operation uses the directory structure to recreate the database, and ensuring a good restoration when a library could contain two directory structures at the same time would be real hell.
Anyway, regardless of this fun intellectual exercise, the current directory structure is good enough. Unless you have a lot of authors with a single book, the current format does a pretty decent job of reducing the number of top-level directories: in my library of 80,000 books, there are only 14,000 top-level/author folders, and it still loads pretty quickly.
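The first-letter alternative can be sketched like this. The function names, the upper-casing of the shard and the "_" fallback for empty author values are assumptions for illustration; this is not how calibre builds its paths today.

```python
from collections import Counter


def author_shard(author: str, letters: int = 1) -> str:
    """Return the top-level folder for an author: its first letter(s), upper-cased."""
    cleaned = author.strip()
    if not cleaned:
        return "_"  # hypothetical fallback for an empty author value
    return cleaned[:letters].upper()


def sharded_book_path(root: str, author: str, book_dir: str, letters: int = 1) -> str:
    """Build root/<shard>/<author>/<book_dir>, keeping an author's books together."""
    return f"{root}/{author_shard(author, letters)}/{author}/{book_dir}"


def shard_counts(authors, letters: int = 1) -> Counter:
    """Count how many author folders would land in each top-level shard."""
    return Counter(author_shard(a, letters) for a in authors)
```

Running `shard_counts` over a library's author list shows how evenly the authors would spread across the top-level folders, and whether switching to two letters would be worth it.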
|
|
|