12-18-2015, 12:26 PM   #6
BookJunkieLI
There is no reason to get rid of the originals, especially if you haven't had a chance to check the converted copies. As others have pointed out, conversions don't always go smoothly, and the originals don't really take up a lot of space.

A few plugins I highly recommend installing if you haven't already:
Find Duplicates
Manage Series
Count Pages
Goodreads Metadata
EpubSplit - if you have a lot of box sets/omnibuses and prefer individual books

Virtual Libraries are your best friend.

Everyone has their preferred process for going through and updating/fixing their library - this is the one I've been using to fix my library of about 10k books. YMMV.

First thing - Visually scan through the books to make sure Titles are in the title field, Authors are in the author field, and that the Title is an actual title and not a random alphanumeric code.

Run Find Duplicates.
-I check the results first for duplicate records that are just different formats of the same book, i.e. EPUB vs MOBI vs PDF etc. Those I merge into one record.
-For the remaining duplicates I like to run Count Pages to get both word and page counts. This makes it easy to weed out book samples vs complete books.
-After that I start opening and comparing each set of duplicates, checking which copy has the better formatting/table of contents/cover, etc.
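
If it helps to see the sample-vs-complete check spelled out, here is a rough Python sketch. It is not calibre or Count Pages code; the `Copy` class, the `likely_samples` name, and the 50% threshold are all made up for illustration, and the word counts are assumed to have been produced already.

```python
from dataclasses import dataclass


@dataclass
class Copy:
    """One duplicate record; word_count is assumed to be already filled in."""
    title: str
    word_count: int


def likely_samples(copies, ratio=0.5):
    """Return copies whose word count is well below the longest copy in the group.

    The 0.5 ratio is an arbitrary illustrative threshold, not a recommended value.
    """
    if not copies:
        return []
    longest = max(c.word_count for c in copies)
    return [c for c in copies if c.word_count < longest * ratio]


group = [Copy("Example Book", 92000), Copy("Example Book (sample)", 8500)]
print([c.title for c in likely_samples(group)])  # ['Example Book (sample)']
```

Anything this flags still deserves a quick look before deleting, since an abridged edition would trip the same check.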

Next is the long, boring part.
I have a Yes/No column called Done, a Yes/No column called Fix, and a tag-like column called Original Tags. My "editing" tags are Round 1/2/3, Boxed Set - Split Up, Check Quality, Poor Quality, Poor Formatting, Needs TOC, Check Series, Update Cover, Update Tags, Complete.
I have a Virtual Library called Not Done where the search is #done:false, so any book that hasn't been marked Yes or No in the Done column shows up in that Virtual Library.
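
To make the three states of the Done column concrete, here is a small Python model of what the Not Done filter described above picks out. This is only a sketch of the behaviour, not calibre's search code; the dictionaries and the `not_done` function are hypothetical, with `None` standing in for an unset (Undecided) value.

```python
# Each book's Done value is True (Yes), False (No), or None (never marked / Undecided).
books = [
    {"title": "A", "done": None},
    {"title": "B", "done": True},
    {"title": "C", "done": False},
]


def not_done(books):
    """Keep only books whose Done column has not been set to Yes or No."""
    return [b for b in books if b["done"] is None]


print([b["title"] for b in not_done(books)])  # ['A']
```

The point of the model is that a book marked No drops out of the view just like one marked Yes; only untouched books remain.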

My personal problem children as far as Poor Quality/Formatting goes are usually books where the only format is PDF, DOC, RTF, or TXT. These are all automatically marked Fix:Yes and Done:No. A lot of these were acquired before the major publishers had started converting their backlists to ebooks, so by the time I get around to 'fixing' them I may have purchased a proper copy and can just delete them.
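
The rule for those problem formats is simple enough to sketch in Python. The function names and the dictionary of flags are invented for illustration and are not part of calibre; the sketch assumes "only format" means the book has exactly one format and it is one of the four listed.

```python
PROBLEM_FORMATS = {"PDF", "DOC", "RTF", "TXT"}


def needs_fix(formats):
    """True when the book has exactly one format and it is a problem format."""
    normalized = {f.upper() for f in formats}
    return len(normalized) == 1 and normalized <= PROBLEM_FORMATS


def initial_flags(formats):
    """Return the Fix/Done values such a book would start with (None = unset)."""
    if needs_fix(formats):
        return {"fix": True, "done": False}
    return {"fix": None, "done": None}


print(initial_flags(["pdf"]))          # {'fix': True, 'done': False}
print(initial_flags(["PDF", "EPUB"]))  # {'fix': None, 'done': None}
```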

Next I begin the process of downloading metadata. I get the best results with Goodreads, you might prefer a different source.
I try to download only 10 at a time. If I'm feeling particularly impatient I may go up to 25 at a time but I really try to limit myself as it's only polite to not inundate the online databases with requests.
Plus, however many books you download metadata for at a time, you then need to verify all of that data before accepting it. You should NEVER EVER blindly accept that the downloaded metadata is correct without verifying it first. In my experience, out of every 100 books close to a quarter are matched to the wrong book entirely, the wrong language edition, or the wrong cover. ALWAYS check your data before updating.
Once I've verified all the data I do the following:
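
For anyone scripting around their library, the "small batches, then stop and verify" habit can be sketched in Python like this. Nothing here talks to calibre or Goodreads: `batches`, `process_in_batches`, the `handle` callback, and the optional pause are all hypothetical stand-ins, and the pause between batches is my own addition rather than something the process above requires.

```python
import time
from itertools import islice


def batches(items, size=10):
    """Yield successive lists of at most `size` items."""
    it = iter(items)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def process_in_batches(book_ids, size=10, pause_seconds=0.0, handle=print):
    """Hand each batch to `handle` (a placeholder for download-and-verify).

    Returns the number of batches processed.
    """
    count = 0
    for chunk in batches(book_ids, size):
        handle(chunk)              # placeholder: download, then verify by eye
        count += 1
        time.sleep(pause_seconds)  # assumed courtesy pause between batches
    return count


print(process_in_batches(range(25), size=10, handle=lambda chunk: None))  # 3
```

With 25 books and a batch size of 10 that gives two full batches and one of five, each small enough to check by hand before accepting.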
Good Metadata Downloads:
-move tags to Original Tags
-clear tags and add Round 1/2/3, Check Quality and Boxed Set - Split Up (if applicable)
-mark Done column as No
Bad Metadata Downloads:
-depending on my mood I either start figuring out why the metadata didn't download or why it grabbed the wrong book, or I mark the Fix column as Yes and ignore it for the time being.
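
The bookkeeping after a metadata download can be written out as a small Python sketch. The dictionary keys and function names are made up for illustration and this is not calibre API code; it also assumes that "move tags to Original Tags" means appending to whatever is already in that column.

```python
def apply_good_download(book, round_tag="Round 1", boxed_set=False):
    """Return a copy of `book` updated the way a verified download is handled."""
    updated = dict(book)
    # Move the current tags into the Original Tags column (appending is an assumption).
    updated["original_tags"] = list(book.get("original_tags", [])) + list(book.get("tags", []))
    # Replace the tags with the working "editing" tags.
    new_tags = [round_tag, "Check Quality"]
    if boxed_set:
        new_tags.append("Boxed Set - Split Up")
    updated["tags"] = new_tags
    updated["done"] = False  # Done: No
    return updated


def defer_bad_download(book):
    """Return a copy of `book` flagged Fix: Yes so it can be revisited later."""
    updated = dict(book)
    updated["fix"] = True
    return updated


book = {"title": "Example Omnibus", "tags": ["Fantasy", "Dragons"]}
print(apply_good_download(book, boxed_set=True))
```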

I'll then run Find Duplicates again just to check.

Once Round 1 is complete everything with the tag Round 1 will have the Done column switched back to Undecided.
I then begin checking the book quality, looking at formatting, text quality, whether it has a table of contents, and whether it has a decent cover.
If it matches all my criteria I'll either clear Tags, put in my preferred book tags including Complete, and mark it Done:Yes or I'll change the Tags to Round 2, update the tags to what needs to be fixed and mark it Done:No.
If it doesn't match my preferred quality criteria I'll update the tags to indicate what needs to be fixed, add Round 2, then mark it Done:No.

In Round 3 I'll do the easy ones like Update Cover or Needs TOC first, then decide if I care enough about the book to fix formatting issues or see if I can find a better quality copy for a decent price.

Of course if there's a book I want to read immediately I'll bypass the entire process to fix that one right away and upload it to my reader.

As I said, others have different processes, but this is what currently works for me. I hope this at least gives you an idea of how you want to proceed.


You mentioned wanting to separate your Fiction and Non-Fiction libraries. I would probably wait until you have a batch of fixed/updated Fiction books completed before moving them into a separate library and then onto a computer other than your laptop. This way your family is only accessing the 'complete' books and not possibly interfering with your updating/processing method.