06-21-2020, 11:05 PM   #1
kjdavies
Zealot
Posts: 109
Karma: 53342
Join Date: Jun 2013
Device: Sony PRS-600
calibredb very slow after large metadata edit

Hi All,

I've found lately that when I edit a large number of titles in my library (thousands, possibly only hundreds), the next add I do via calibredb can take a long time.

Usually adding a new title (which consists of adding the book with one calibredb call and updating a bunch of metadata fields with a second) takes about 10-12 seconds on a 30,000-title library.
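
The two-call workflow described above might look roughly like the sketch below. It assumes calibredb is on the PATH and that `calibredb add` reports new ids on a line of the form "Added book ids: 123" (worth checking against your calibre version). The helper names, library path, and field values are hypothetical.

```python
import re
import subprocess


def build_add_cmd(library, book_path):
    """argv for the first call: add one book file to the library."""
    return ["calibredb", "add", "--with-library", library, book_path]


def build_set_metadata_cmd(library, book_id, fields):
    """argv for the second call: set several fields on one book id."""
    cmd = ["calibredb", "set_metadata", "--with-library", library]
    for name, value in fields.items():
        cmd += ["--field", f"{name}:{value}"]
    cmd.append(str(book_id))
    return cmd


def parse_added_ids(output):
    """Extract ids from an 'Added book ids: 1, 2' line (assumed format)."""
    match = re.search(r"Added book ids:\s*([\d,\s]+)", output)
    if not match:
        return []
    return [int(tok) for tok in match.group(1).replace(",", " ").split()]


def add_with_metadata(library, book_path, fields):
    """Run both calls in sequence; returns the ids that were updated."""
    result = subprocess.run(build_add_cmd(library, book_path),
                            capture_output=True, text=True, check=True)
    ids = parse_added_ids(result.stdout)
    for book_id in ids:
        subprocess.run(build_set_metadata_cmd(library, book_id, fields),
                       check=True)
    return ids
```

Keeping the argv builders separate from the subprocess calls makes it easy to print or log the exact commands when one of them stalls.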

After a large edit, I've seen it take about five minutes to add the first title, then go back to 10-12 seconds per title.

Today I updated a few thousand titles: I removed my 'new book' tag, and used the regex engine to copy part of the title into a custom field. No change to author or title... and calibredb ran for two hours on the first title I tried to add after that, before I killed the job.

I can add a title to another library using calibredb and it behaves normally (10-12 seconds... it's another 30,000-title library). I can add a title via the calibre GUI and it takes just a few seconds (but of course doesn't have all the metadata the script is setting). It's only when trying to use calibredb to add a new title to a library that has had a large metadata change that I see this.

Addition: if I move the target library (rename the folder) and create a new library with the same name and custom fields, I can load into that one normally (each title takes about 8-9 seconds, with the two calibredb calls). Moving the titles from one library to the other takes very little time (2-3 seconds?).

When I switch the libraries back, the load time becomes immeasurable again. I do hear a lot of disk activity while calibredb is running, but it seems it never gets past the first file. (And yes, I did try removing that file -- in fact, it loaded via the GUI just fine, and the next file could not be loaded.)

Any thoughts?

Last edited by kjdavies; 06-21-2020 at 11:18 PM. Reason: added more information
06-22-2020, 06:23 AM   #2
BetterRed
null operator (he/him)
Posts: 20,587
Karma: 26954694
Join Date: Mar 2012
Location: Sydney Australia
Device: none
The following is: a) speculation, and b) is from the perspective of the GUI, but as you no doubt realise there's a lot of code sharing between the GUI library manager and the command line tools.

When a book's metadata is created/updated the new data is written to the relevant database tables and to the metadata.opf file in the book folder. If the changes are made via the metadata edit dialogues, the writes to the database tables and the write of the metadata.opf file** are done when the OK button is pressed.

But if you edit in the book list using the F2 key and Tab, I'm pretty sure the writes are done for each cell edited - so if you press F2 on Rating, enter a value and press Tab, the book's rating value is written to the database and a fresh metadata.opf file is written; then if you enter some Tags and press Tab, the Tag values are written to the database and a fresh metadata.opf file is written… and so on and so forth. I'm wondering if the calibredb set-metadata command works in a similar way.

** metadata.opf writes

Metadata value alterations can cascade through many books, e.g. a spelling correction of 'Vapmires' to 'Vampires' will generate one write to the database, but every book tagged with 'Vapmires' will get a fresh metadata.opf file with the correct tag value - i.e. 'Vampires'.

To avoid long delays waiting for many fresh metadata.opf files to be written, fresh metadata.opf files are always written in a 'slow' background thread (some call them 'lazy writes'). On my rig they chug along at about one a second.
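
One way to see whether a large edit has left many OPF writes queued is to count the flagged books directly. This is only a sketch: it assumes the library's metadata.db keeps that queue in a table named metadata_dirtied with one row per book, so check your own schema before relying on it. The function name is made up for illustration, and the file is opened read-only.

```python
import sqlite3
from pathlib import Path


def count_pending_opf_writes(db_path):
    """Count books flagged as needing a fresh metadata.opf.

    Assumption: the queue lives in a `metadata_dirtied` table.
    The database is opened read-only, so nothing is modified.
    """
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    conn = sqlite3.connect(uri, uri=True)
    try:
        (count,) = conn.execute(
            "SELECT COUNT(*) FROM metadata_dirtied").fetchone()
        return count
    finally:
        conn.close()
```

Running this before and after a bulk edit (with calibre closed) would show whether the number of flagged books tracks the slow first add.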

IIRC the calibredb backup_metadata command (without the '--all' option) will flush through any outstanding writes in the queue - might be worth giving it a try.

BR

Last edited by BetterRed; 06-22-2020 at 06:42 AM. Reason: correct the command name
06-22-2020, 06:33 AM   #3
kovidgoyal
creator of calibre
Posts: 43,866
Karma: 22666666
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
Metadata backups to OPFs are not done by calibredb, only by the GUI, unless you run

calibredb backup_metadata
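
A small sketch of driving that command from a script, assuming calibredb is on the PATH. The library path and helper names are hypothetical. Without --all, the command only operates on books whose OPF file is out of date.

```python
import subprocess


def build_backup_cmd(library, all_books=False):
    """argv for `calibredb backup_metadata` against one library."""
    cmd = ["calibredb", "backup_metadata", "--with-library", library]
    if all_books:
        # Rewrite every OPF, not only the out-of-date ones.
        cmd.append("--all")
    return cmd


def flush_opf_backups(library):
    """Run the backup and return the process exit code."""
    return subprocess.run(build_backup_cmd(library), check=True).returncode
```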
06-22-2020, 06:35 AM   #4
mbovenka
Wizard
Posts: 2,018
Karma: 13471689
Join Date: Oct 2007
Location: Almere, The Netherlands
Device: Kobo Sage
Quote:
Originally Posted by BetterRed
IIRC the calibredb embed_metadata command (without the '--all' option) will flush through any outstanding writes in the queue - might be worth giving it a try.
That's 'backup_metadata'

Edit: ninja-ed by the man himself
06-22-2020, 07:01 AM   #5
BetterRed
I swiped the wrong command name; I've corrected it now. But if the metadata.opf files are not being progressively updated, what else could be slowing down the OP's updates?

@kjdavies - have you tried compressing/vacuuming the library databases? There's a tool in Jobspy/Utilities that will do all libraries that calibre knows about except the current one, so I run it from an empty library.
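
To illustrate what vacuuming does, here is a sketch on a throwaway SQLite file rather than a real library: deleting rows leaves free pages inside the file, and VACUUM rebuilds the file without them. The table and function names are invented for the demo. For an actual calibre library it is probably safer to use calibre's own maintenance tools than to run this against metadata.db.

```python
import os
import sqlite3
import tempfile


def vacuum_demo(rows=20000):
    """Build a scratch database, delete most rows, VACUUM it.

    Returns (size_before, size_after) in bytes.
    """
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "scratch.db")
        conn = sqlite3.connect(path)
        conn.execute("CREATE TABLE books(id INTEGER PRIMARY KEY, title TEXT)")
        conn.executemany("INSERT INTO books(title) VALUES (?)",
                         ((f"title {i} " * 5,) for i in range(rows)))
        conn.commit()
        # Simulate a large edit: most rows go away, but the file keeps its size.
        conn.execute("DELETE FROM books WHERE id % 10 != 0")
        conn.commit()
        before = os.path.getsize(path)
        conn.execute("VACUUM")
        conn.close()
        after = os.path.getsize(path)
    return before, after


if __name__ == "__main__":
    before, after = vacuum_demo()
    print(f"before VACUUM: {before} bytes, after: {after} bytes")
```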

BR
06-22-2020, 08:00 AM   #6
kovidgoyal
The OP has various weird performance things going on in general that I can't replicate. See his other thread.
06-22-2020, 10:39 AM   #7
DaltonST
Deviser
Posts: 2,265
Karma: 2090983
Join Date: Aug 2013
Location: Texas
Device: none
I believe that differences in RAM utilization and metadata.db file size account for much of what you describe, making it fruitless for others to try to replicate your issues.

Monitor your RAM usage while you do your various tasks. Also monitor the uncompressed size of your metadata.db files.
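
For the metadata.db part of that advice, a minimal sketch using only standard SQLite pragmas: it reports the file size together with the total and free page counts, which gives a rough idea of how much slack a vacuum could reclaim. The function name and the returned keys are hypothetical, and the database is opened read-only.

```python
import os
import sqlite3
from pathlib import Path


def db_stats(db_path):
    """Return file size and SQLite page statistics for a database file."""
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    conn = sqlite3.connect(uri, uri=True)
    try:
        page_size = conn.execute("PRAGMA page_size").fetchone()[0]
        page_count = conn.execute("PRAGMA page_count").fetchone()[0]
        free_pages = conn.execute("PRAGMA freelist_count").fetchone()[0]
    finally:
        conn.close()
    return {
        "file_bytes": os.path.getsize(db_path),
        "page_size": page_size,
        "page_count": page_count,
        "free_pages": free_pages,
        # Share of the file occupied by unused pages.
        "free_fraction": free_pages / page_count if page_count else 0.0,
    }
```

Comparing these numbers between the slow library and the one that behaves normally would show whether file size or free-page slack actually differs between them.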

Optimize the size of your RAM page-file, and defrag your HD.

When you quick-switch from Library A to Library B, Calibre closes A and deletes its cache from RAM. It then loads B into its cache in RAM.

Depending on the size of metadata.db of A compared to B, you could easily notice a difference in initial loading times. Quick-switching to a Library of 40,000 books will always take longer than to a Library of 100 books.

Using calibredb when the GUI is closed obviously uses much less RAM than doing the identical function when the full GUI is open.

When you create a new Library by copying the current Library's structure, and then switch to it, that first switch will be very fast since the new Library is initially empty.

If you are switching among your Libraries just to "look" to see what changed, and not to do any edits or add new books while in the GUI, you might consider running CalibreSpy via a CLI command file (not the Calibre GUI) using calibredb, especially the command file that uses a command-line parameter that invokes a "pre-filter" option so it loads exactly what you are interested in, and not everything. Super-fast. And you can simultaneously view as many Libraries as you wish, always read-only. CS's OP has several flavors of .bat files attached as CLI templates for non-Windows systems.




DaltonST

Last edited by DaltonST; 06-22-2020 at 10:41 AM.
06-23-2020, 01:33 AM   #8
kjdavies
Quote:
Originally Posted by BetterRed
(post #2, quoted in full)
I've seen that, regarding individual cell edits. I did wonder that it took so long to commit changes, but having to rewrite the OPF... well, I would expect it wouldn't take long, but it would be non-zero time. It's clearly not as efficient as opening the metadata editor window and updating a bunch of fields at once.

What I'm referring to, though... I've got calibre configured to add a 'new book' tag to each title as it's added. I use this during batch loads so I can see the new entries easily, without having to remember the last time I cleansed my input. I clean the new books up, in this case assigning a 'characters' custom field. I update a few hundred books in a session (because I'm like that), remove the 'new book' tag, and close calibre.

I can reopen calibre and add new books, and it behaves normally.

When I try to add a new book via calibredb, it can take several minutes (in one case a couple hours) to get the first one into that library, and then it behaves normally. If I use calibredb to add the new title to a library where I have not done this, it behaves normally.

As best I can tell it seems to be trying to adapt to the large volume of changes. I have no explanation as to why this is so, and it baffles me. I can work around it, of course, but if there's something simple I can do to prevent it I'd be happy to learn it.
06-23-2020, 01:39 AM   #9
kjdavies
Quote:
Originally Posted by kovidgoyal
The OP has various weird performance things going on in general that I can't replicate. See his other thread.
I am the king of weird questions. In my day job I called one of my service providers and he told me he loves my calls. "Either it's something simple that you know about and don't have access to[1], or I get to learn something because you find the weirdest cases."

[1] in this case, online credit card processing. I know (knew; I left that job three years ago) about each of the merchant account settings, from years of working with them, getting them set up, and testing them... but I don't have access to all of the configuration settings. So I'd call and ask for specific values for settings that most people don't know about.

One time I called up and asked for a specific configuration change that the support tech didn't even know about. "Okay, ask [your supervisor] about it and he can show you how." I got a call back from the supervisor a short time later. "Dude, can you not do that? You're freaking out my staff."
06-23-2020, 01:41 AM   #10
kjdavies
Quote:
Originally Posted by BetterRed
I swiped the wrong command name; I've corrected it now. But if the metadata.opf files are not being progressively updated, what else could be slowing down the OP's updates?

@kjdavies - have you tried compressing/vacuuming the library databases? There's a tool in Jobspy/Utilities that will do all libraries that calibre knows about except the current one, so I run it from an empty library.

BR
I have not tried compressing/vacuuming the databases. I had the thought that calibredb might have been doing something like that -- first run on a library that just had a big change realizes there is a lot of changed data, and cleans it up. I didn't know that was so, or how to find out.

I'll give it a try. It's probably a good practice anyway.
06-23-2020, 01:48 AM   #11
kjdavies
Quote:
Originally Posted by DaltonST
(post #7, quoted in full)
Most of these suggestions would make for general improvement in performance, and I'll take them under advisement.

I expect some degree of slow processing because of how I'm using calibredb. When I get the cycles I'm tempted to see if I can create more specific tools to make this faster (learn OPF and let calibre import the content and metadata via GUI, a plugin that can recognize my metadata files, a modified calibredb that can take the additional metadata fields I want and thus let me import multiple titles at once... many options, all of which would change my workflow).

What I'm looking at right here is a single use case where it takes an unusually long time to process, and then returns to normal. It's nothing I can't live with, but I would like to understand it.
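
To pin down exactly which call stalls and by how much, a timing wrapper along these lines could be put around each calibredb invocation. This is a sketch only: the function names and the "five times the median" threshold are arbitrary choices for illustration, not anything calibre provides.

```python
import statistics
import subprocess
import time


def timed_run(argv):
    """Run a command and return (exit_code, elapsed_seconds)."""
    start = time.perf_counter()
    completed = subprocess.run(argv)
    return completed.returncode, time.perf_counter() - start


def flag_slow(timings, factor=5.0):
    """Return the labels whose time exceeds `factor` times the median.

    `timings` maps a label (e.g. a file name) to elapsed seconds.
    """
    if not timings:
        return []
    median = statistics.median(timings.values())
    return [label for label, secs in timings.items()
            if secs > factor * median]
```

Logging the elapsed time for the add call and the metadata call separately would show whether the delay sits in opening the library or in the update itself.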