Old 08-29-2011, 06:10 AM   #162
kiwidude
Calibre Plugins Developer
@unboggling - re the metadata as you go thing. I will share one of my "failure" experiences here, which only applies to people who have a very large collection of books to begin with.

Like a lot of Calibre users, I am sure, my initial approach was to just throw all my years of books into my first Calibre library and then "clean as I go". However, apart from the performance issues you get with an enormous library (which subsequent Calibre releases have improved but not eliminated), it is a case of garbage in, garbage out.

If your initial load of Calibre is from a clean and structured source (say a Kindle using only books bought from Amazon) then there is no problem. However, if like me you have numerous duplicate formats and editions from varying sources, the "I will just clean it up later as I go" approach becomes unmanageable. Adding to the library became just too hard and "dangerous" in terms of what to do about duplicates. If you have multiple copies of the same format for the same book, they could all be of varying quality, made from different conversions of different sources, etc. And that is all assuming the file was correctly named as the right book in the first place! So either you have to decide at import time which is the best format to keep, or you import them all and end up with a mammoth (slow) library with a huge duplicate problem to resolve.

So what I ended up doing was almost starting again by having multiple libraries: the initial one that is the "partially clean mess", and my "clean" one, which books get migrated into and which is intended to contain only books that have been cleaned up. Note that if I were installing Calibre for the first time today I would not have two libraries for this purpose. It is only because I had already invested so many months of effort into partially cleaning the initial one that I have kept it.

So now what I do is just load books into my "real" Calibre library on an author by author basis. I do an indexed search in Windows Explorer across all my mish-mash source book folders, which sit under a common root folder, to find all the books I have for an author, sorted by type. If I have already set up books for that author in my old library, I use Copy to Library (with Delete) to move them into my "clean" one. Then for each book I update the metadata/cover and identify which is the best format of it I have, and if it isn't an ePub then (with some PDF exceptions) I convert it to ePub and do any final cleanup. Then I delete every other format, both in my source directories and from Calibre (my Calibre is of course backed up).
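For anyone who would rather script the "find everything for one author, sorted by type" step than use the Explorer search, here is a rough sketch of the idea in Python. It is not what I actually run and it does not touch Calibre at all. The function names, the list of extensions and the preference order are all assumptions made up for the example, and matching on the author name in the file path only works if your files are named or foldered that way. Picking by extension is only a first guess: the real "best format" decision still needs you to open the files and compare quality.

```python
from collections import defaultdict
from pathlib import Path

# Assumed preference order, best first. Purely illustrative; adjust to taste.
FORMAT_PREFERENCE = ["epub", "mobi", "lit", "pdf", "rtf", "txt"]


def extension(path):
    """Lower-case file extension without the leading dot."""
    return path.suffix.lower().lstrip(".")


def find_author_files(root, author):
    """Find ebook files under root whose relative path mentions the author.

    Results are sorted by file type, then by file name.
    """
    root = Path(root)
    needle = author.lower()
    matches = [
        p for p in root.rglob("*")
        if p.is_file()
        and extension(p) in FORMAT_PREFERENCE
        and needle in str(p.relative_to(root)).lower()
    ]
    return sorted(matches, key=lambda p: (extension(p), p.name.lower()))


def group_by_title(files):
    """Group files that share a file name (ignoring case and extension)."""
    groups = defaultdict(list)
    for p in files:
        groups[p.stem.lower()].append(p)
    return dict(groups)


def pick_candidate(paths):
    """Pick the file whose extension ranks highest in FORMAT_PREFERENCE."""
    return min(paths, key=lambda p: FORMAT_PREFERENCE.index(extension(p)))
```

You would call `find_author_files` with your common root folder and an author name, feed the result to `group_by_title`, and then look at `pick_candidate` for each group as a starting point before checking the files by hand. Note it only groups files with identical names, so differently named copies of the same book will still show up as separate entries.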

Now it didn't take very long to get enough books into my new library to ensure I would not be running out of things to read any time soon. Is every book I own in there? No. However, I have started with my favourite/most wanted authors, and am gradually chipping away at them every week. At some point I will get bored with doing this (or run out of time), as given the years of "to read" books I already have, the likelihood of reading the rest is slim to none. However, when an author's name gets mentioned by family or a friend I can just repeat the exercise for that author to extract them from the source mess and add them to the clean library.

My so-called "clean" library is not perfect, however; it does have a backlog built up in it. For a prolific author where the quality is so-so it takes a long time to do all the editing, and it is easy to get distracted into adding another "small" author you know you have the retail versions for. So I have a #done yes/no column I use with each book, which in combination with #retail gives me an easy way to see which books I still need to put some effort into. However, at least I know that the format in there is the best I have at that point for that book/author.

That was the balance that ended up working for me anyway. Like I said, if I was starting from scratch today I would just use the "author by author" gradual approach to add from Windows Explorer to Calibre. The trap for people who have multiple conflicting formats of a book is trying to add them all to Calibre. My advice is to find your best format and only add that one. You can associate the Calibre ebook viewer with the various file types like ePub, MOBI, etc. so it is just a double click to open them from the Explorer search results.

Last edited by kiwidude; 08-29-2011 at 06:13 AM.