Old 08-04-2013, 08:49 AM   #4
marvin_2
Enthusiast
Posts: 25
Karma: 4472
Join Date: Jan 2011
Device: Kindle
Step-by-Step: The big steps are 1. Getting your books into Calibre and 2. Getting them cleaned up within Calibre.

1. Getting books into Calibre
Importing them in small batches might work best, starting with authors that have a lot of books in your collection (e.g. anything resembling Tolkien or King). If you use a temporary library for importing, curating them will be a lot easier.
1. Fire up Calibre and create a new library (e.g. "imports")
2. Experiment with metadata reading: Preferences - Adding books - read metadata from file (or not), and check which setting works best for your collection
3. Add books (use the correct option - one book per directory or every file a different book)
4. Review author names and titles. Search for "isbn:false and comments:false" in the search bar to find books that have neither an ISBN nor comments, and extract ISBNs for those books with the ISBN add-in
(Optional: 4a. Remove/merge duplicates with the plugin as described by speakingtohe above)
5. Select all books with incorrect or missing metadata, press Ctrl+D, and choose "download metadata only"
6. Review author names and titles again (the covers give a clue as to whether the downloaded metadata match the correct author/title)
7. Create a new library (My Calibre Library)
8. Check that Preferences - Adding Books - Automerge is checked, with your preferred option selected
9. Select all books, then select "copy to library, then delete" from the context menu

All books will now be copied from your import-library into your Calibre-Library, with duplicates merged into one record.
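
If you would rather plan the batches up front, here is a rough Python sketch. It is not part of Calibre and makes a few assumptions: your old archive has one top-level folder per author, the extension list matches your collection, and the function names are just made up for illustration. It counts the books per author folder so you can start with the biggest ones, and it builds (without running) a calibredb command line for a batch. calibredb is Calibre's command-line tool, used here as an alternative to the "Add books" button; check "calibredb add --help" on your version before relying on the options.

```python
from pathlib import Path

# Assumption: adjust this set to the formats in your own collection
BOOK_EXTENSIONS = {".epub", ".mobi", ".azw3", ".pdf"}


def author_batches(archive_root):
    """Return (folder, book_count) pairs for each top-level folder, largest first."""
    batches = []
    for folder in Path(archive_root).iterdir():
        if not folder.is_dir():
            continue
        count = sum(
            1
            for f in folder.rglob("*")
            if f.is_file() and f.suffix.lower() in BOOK_EXTENSIONS
        )
        if count:
            batches.append((folder, count))
    # Biggest authors first; folder name as tie-breaker for a stable order
    batches.sort(key=lambda item: (-item[1], item[0].name))
    return batches


def calibredb_add_command(folder, library_path, one_book_per_directory=False):
    """Build (but do not run) a calibredb command line for one batch."""
    cmd = ["calibredb", "add", "--recurse", "--library-path", str(library_path)]
    if one_book_per_directory:
        # Mirrors the "one book per directory" choice in step 3
        cmd.append("--one-book-per-directory")
    cmd.append(str(folder))
    return cmd
```

Calling author_batches on your archive folder gives you the import order, and you can print calibredb_add_command(folder, "path/to/imports") for each batch and run it by hand once you are happy with it. Importing through the GUI as described above works just as well; this only helps with deciding where to start.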

2. Cleaning up book data (optional)

What you could do:
- Re-download better covers where needed (Select books, Ctrl+D, download covers)
- Clean up metadata (Try the various "find duplicates" options)
- Clean up tags (right-click on the tags-tree on the left menu)
- Polish epub, azw3 (Polish books-plugin)
- Quality check (epub: manually with Sigil via the Open With plugin, automatically with the Quality Check plugin)

Two more points:
- The above procedure will keep one copy of each format for each of your books. If you start with two epub versions of the same book, it will keep only one, but not always the one with the best quality. Best to keep your old archive for a while, and manually search for a better version if needed.
- Non-fiction or comic book files can be very large. If a small number of those make up a large part of your 20 GB archive, you might consider keeping them in a separate library, to make backups etc. easier.
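
To see whether a handful of files really dominates your archive, something like the following Python sketch can help. It is only an illustration (the function name and the default of ten files are my own choices): it walks a folder, totals the file sizes, and reports the largest files along with their share of the total.

```python
from pathlib import Path


def largest_files(archive_root, top_n=10):
    """Return (total_bytes, [(path, size_bytes, share_of_total), ...])."""
    sizes = [
        (f, f.stat().st_size)
        for f in Path(archive_root).rglob("*")
        if f.is_file()
    ]
    total = sum(size for _, size in sizes)
    # Largest files first
    sizes.sort(key=lambda item: item[1], reverse=True)
    top = [
        (path, size, size / total if total else 0.0)
        for path, size in sizes[:top_n]
    ]
    return total, top
```

If the top few entries add up to a large share of the total, those are the candidates for a separate library.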

Finally, take your time, and make a lot of backups - some of the above can do a lot of damage accidentally.
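
One simple way to do that before each risky step is to zip up the whole library folder with a timestamp in the name. The Python sketch below is just one possible approach; the function name and the naming scheme are my own assumptions. Close Calibre first so the library's metadata.db is not being written to, and keep the backup folder outside the library folder.

```python
import shutil
from datetime import datetime
from pathlib import Path


def backup_library(library_dir, backup_dir):
    """Zip the whole library folder into backup_dir and return the archive path."""
    library_dir = Path(library_dir)
    backup_dir = Path(backup_dir)
    # backup_dir should live outside library_dir, or the zip would include itself
    backup_dir.mkdir(parents=True, exist_ok=True)
    stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
    base = backup_dir / f"{library_dir.name}-{stamp}"
    # make_archive appends ".zip" and returns the full archive file name
    archive = shutil.make_archive(str(base), "zip", root_dir=str(library_dir))
    return Path(archive)
```

Restoring is then just a matter of unzipping the archive into an empty folder and pointing Calibre at it.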