Old 06-18-2013, 10:03 AM   #1
mike_bike_kite
Digitally confused
Posts: 500
Karma: 1500000
Join Date: Mar 2010
Location: London, UK
Device: KPW, K2i, Nexus 7 32gb, Kobo Mini
Cleaning my library

Is there anything that can scan through my library and work out
  • where the author and title are swapped over
  • where there might be comments after the book title in brackets
  • where the title and author appear in the same field
  • where I have a million different copies of the same book
  • etc
Cheers

Mike
Old 06-18-2013, 10:16 AM   #2
unboggling
Wizard
Posts: 1,065
Karma: 858115
Join Date: Jan 2011
Device: Kobo Clara, Kindle Paperwhite 10
  • where the author and title are swapped over <- human eyeball, edit metadata, swap.
  • where there might be comments after the book title in brackets <- bulk search/replace
  • where the title and author appear in the same field <- human eyeball
  • where I have a million different copies of the same book <- find duplicates plugin
Old 06-18-2013, 10:24 AM   #3
Rbneader
Fanatic
Posts: 500
Karma: 2661351
Join Date: Mar 2012
Device: None
The Find Duplicates plugin is excellent and I highly recommend it.

For the swapped title and author, reading through the library is probably the best bet unfortunately. I wonder if there is some way to sort on title length (probably an indicator of title and author in the same field)?

I'm going through and cleaning up my libraries right now too so I feel your pain.
Old 06-18-2013, 10:25 AM   #4
Adoby
Handy Elephant
Posts: 1,736
Karma: 26785668
Join Date: Dec 2009
Location: Southern Sweden, far out in the quiet woods
Device: Thinkpad E595, Ubuntu Mate, Huawei Mediapad 5, Bouye Likebook Plus
You are in luck!

There is a thing called "user" that is very helpful in these circumstances. It has amazing pattern recognition and ability to distinguish whether a string of characters is the name of an author, a book title or something else. And the "user" can even fix problems! Usually you find the "user" mounted on the chair in front of your computer.

There are some plugins that can help and let the "user" work slightly more efficiently.

Quality check.
Find duplicates.
Extract ISBN.

Extract ISBN is interesting. I haven't tried it, but I assume that you can let it find ISBN and after that easily download metadata for that exact book, no guessing involved.
Old 06-18-2013, 10:32 AM   #5
mbovenka
Wizard
Posts: 2,020
Karma: 13471689
Join Date: Oct 2007
Location: Almere, The Netherlands
Device: Kobo Sage
Quote:
Originally Posted by Adoby View Post
Extract ISBN is interesting. I haven't tried it, but I assume that you can let it find ISBN and after that easily download metadata for that exact book, no guessing involved.
Yep. It digs through the ebook(s) in question for numbers that match an ISBN format and updates your metadata with them. In my experience it finds ISBNs in about 50% of the ISBN-less ebooks I've used it on.

Very useful (because, as you say, it makes 'Download Metadata' lots more accurate if you have an ISBN).
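
As a rough illustration of what "numbers that match an ISBN format" can mean, here is a minimal Python sketch that scans text for ISBN-10/ISBN-13 candidates and keeps only those with a valid check digit. It is not the Extract ISBN plugin's actual code; the regex and the function names are assumptions made for illustration.

```python
import re

# Candidate: optional 978/979 prefix, nine digits, then a digit or X,
# with optional hyphens or spaces between the characters.
CANDIDATE = re.compile(r"\b(?:97[89][- ]?)?(?:\d[- ]?){9}[\dXx]\b")


def is_valid_isbn10(chars: str) -> bool:
    """ISBN-10: the sum with weights 10..1 must be divisible by 11."""
    total = 0
    for weight, ch in zip(range(10, 0, -1), chars):
        total += weight * (10 if ch in "Xx" else int(ch))
    return total % 11 == 0


def is_valid_isbn13(digits: str) -> bool:
    """ISBN-13: alternating weights 1 and 3, sum divisible by 10."""
    total = sum(int(ch) * (3 if i % 2 else 1) for i, ch in enumerate(digits))
    return total % 10 == 0


def find_isbns(text: str) -> list:
    """Return the valid ISBNs found in text, separators removed."""
    found = []
    for match in CANDIDATE.finditer(text):
        raw = re.sub(r"[- ]", "", match.group())
        if len(raw) == 10 and is_valid_isbn10(raw):
            found.append(raw.upper())
        elif len(raw) == 13 and raw.isdigit() and is_valid_isbn13(raw):
            found.append(raw)
    return found
```

The check-digit test is what keeps ordinary ten-digit numbers (phone numbers, order numbers) from being reported as ISBNs.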
Old 06-18-2013, 10:46 AM   #6
gabby98
Wizard
Posts: 1,751
Karma: 2667650
Join Date: Jul 2012
Device: Android, Nook Simple Touch, Nook Color, ..., Glo
Quote:
Originally Posted by Rbneader View Post
The Find Duplicates plugin is excellent and I highly recommend it.

For the swapped title and author, reading through the library is probably the best bet unfortunately. I wonder if there is some way to sort on title length (probably an indicator of title and author in the same field)?

I'm going through and cleaning up my libraries right now too so I feel your pain.
You can temporarily create a custom column that contains the length of the title (or author) and maybe use that to get you started. I used this to determine which of my entries may or may not need updating with the original description.

When cleaning up my library I also added a custom check column, "checked", so that once I knew an entry was all set I could set it to yes. That way I would not waste time rechecking that entry when I came back to resume working...

I love the find duplicates plugin!
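
The "length of the title" idea can also be tried outside calibre. Below is a minimal Python sketch that lists the longest titles first, on the theory that an unusually long title may have the author jammed into it. It assumes a `books` table with `id` and `title` columns (the names used in the query later in this thread); the sample rows are made up.

```python
import sqlite3


def longest_titles(conn: sqlite3.Connection, limit: int = 20) -> list:
    """Return (id, title, length) rows, longest titles first."""
    return conn.execute(
        "SELECT id, title, length(title) AS n FROM books "
        "ORDER BY n DESC, id LIMIT ?",
        (limit,),
    ).fetchall()


# Toy data standing in for a real library (hypothetical rows).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE books (id INTEGER PRIMARY KEY, title TEXT)")
conn.executemany(
    "INSERT INTO books (title) VALUES (?)",
    [("Dune",), ("Richard Dawkins - The Selfish Gene (1976)",), ("Emma",)],
)
rows = longest_titles(conn, limit=2)
print(rows[0][1])
```

Length is only a hint, so the output is best treated as a reading list for the human eyeball rather than a list of confirmed problems.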
Old 06-18-2013, 10:48 AM   #7
mike_bike_kite
Digitally confused
Posts: 500
Karma: 1500000
Join Date: Mar 2010
Location: London, UK
Device: KPW, K2i, Nexus 7 32gb, Kobo Mini
With 100 books that might be possible, but unfortunately I have a few more than that.

I'd have thought it would be straightforward (perhaps not simple, but at least straightforward) to write something that would check whether the title field appears as a valid author field. I guess if a book has a rating then its author is a valid one. You'd also have to be careful of combinations like "Richard Dawkins", "Dawkins, Richard", "R Dawkins" etc.
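
One way to treat those name combinations as the same author is to reduce each name to a comparison key. The Python sketch below is only an assumption about how that could be done: `author_key` is a hypothetical helper, and it deliberately ignores suffixes, multi-word surnames and the like.

```python
import re


def author_key(name: str) -> tuple:
    """Reduce an author name to (surname, first initial), lower-cased.

    Handles "First Last", "Last, First" and "F Last". Two different
    authors sharing a surname and initial will collide, so treat a match
    as a hint, not proof.
    """
    name = name.strip()
    if "," in name:
        # "Dawkins, Richard": surname comes before the comma
        last, _, first = name.partition(",")
    else:
        # "Richard Dawkins" or "R Dawkins": surname is the final word
        parts = name.split()
        last = parts[-1] if parts else ""
        first = " ".join(parts[:-1])
    first = re.sub(r"[^A-Za-z ]", "", first).strip()
    return (last.strip().lower(), first[:1].lower())


print(author_key("Dawkins, Richard") == author_key("R Dawkins"))
```

Comparing keys instead of raw strings lets a swapped-field check match a title such as "Dawkins, Richard" against an author stored as "Richard Dawkins".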

I guess there are lots of common patterns that can be removed, e.g. (YYYY) etc.

It seems like this is something that could be automated. Obviously the program wouldn't have access to a complete book database but it might be enough just to look at other books the user has.

Some books also have the series name shoved in there at random positions which might be hard to spot by program.

Is there an accessible SQL database that holds the user's library? I'd be happy to write the SQL to cover these sorts of issues and then perhaps someone more knowledgeable with Calibre could add it in as a small option.

Mike

Last edited by mike_bike_kite; 06-18-2013 at 10:59 AM.
Old 06-18-2013, 11:21 AM   #8
Adoby
Handy Elephant
Posts: 1,736
Karma: 26785668
Join Date: Dec 2009
Location: Southern Sweden, far out in the quiet woods
Device: Thinkpad E595, Ubuntu Mate, Huawei Mediapad 5, Bouye Likebook Plus
Metadata.db is a sqlite 3 database.

http://www.sqlite.org/

A "cleanup" plugin would be most welcome!
Old 06-18-2013, 12:05 PM   #9
mike_bike_kite
Digitally confused
Posts: 500
Karma: 1500000
Join Date: Mar 2010
Location: London, UK
Device: KPW, K2i, Nexus 7 32gb, Kobo Mini
Well that wasn't too hard. I've installed SQLite, I found the Calibre database and I'm now trying out some queries. It might take me a week or so to learn SQLite and write the queries etc but it seems doable.

If I change the database, say perhaps swap a book title and an author name then will this show up cleanly in Calibre or do I need to do more? Can I just copy the metadata.db file to keep a backup of everything?

Mike
Old 06-18-2013, 12:17 PM   #10
mike_bike_kite
Digitally confused
Posts: 500
Karma: 1500000
Join Date: Mar 2010
Location: London, UK
Device: KPW, K2i, Nexus 7 32gb, Kobo Mini
Actually here's my first success
Code:
-- Unrated books whose title is the author name of some rated book
select  b.title, a.name
from    books b
        join books_authors_link al on al.book = b.id
        join authors a             on a.id = al.author
where   b.id not in (select book from books_ratings_link)
        and b.title in (
            -- author names attached to at least one rated book
            select distinct ra.name
            from   books_ratings_link rl
                   join books_authors_link ral on ral.book = rl.book
                   join authors ra             on ra.id = ral.author
        )
It shows all the entries where the author and title are definitely swapped over, i.e. the book title appears as the author of another entry that has a rating. This obviously doesn't do anything clever where the titles and author names are a little messed up, but it's a good first step. This found nearly 500 matches in my little database!
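
To see the query at work without touching a real library, it can be run against a toy database. The Python sketch below assumes a heavily simplified schema containing only the columns the query touches (the real metadata.db has more), and the three sample books are invented.

```python
import sqlite3

# Same logic as the query above, with explicit joins.
SWAPPED_QUERY = """
SELECT b.title, a.name
FROM books b
JOIN books_authors_link al ON al.book = b.id
JOIN authors a ON a.id = al.author
WHERE b.id NOT IN (SELECT book FROM books_ratings_link)
  AND b.title IN (
      SELECT ra.name
      FROM books_ratings_link rl
      JOIN books_authors_link ral ON ral.book = rl.book
      JOIN authors ra ON ra.id = ral.author
  )
ORDER BY b.id
"""

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE books (id INTEGER PRIMARY KEY, title TEXT);
CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE books_authors_link (book INTEGER, author INTEGER);
CREATE TABLE books_ratings_link (book INTEGER, rating INTEGER);
-- book 1 is a correct, rated entry; book 2 has title and author swapped
INSERT INTO books VALUES (1, 'The Selfish Gene'), (2, 'Richard Dawkins'), (3, 'Emma');
INSERT INTO authors VALUES (1, 'Richard Dawkins'), (2, 'The God Delusion'), (3, 'Jane Austen');
INSERT INTO books_authors_link VALUES (1, 1), (2, 2), (3, 3);
INSERT INTO books_ratings_link VALUES (1, 1);
""")
suspects = conn.execute(SWAPPED_QUERY).fetchall()
print(suspects)
```

Only book 2 is reported: it is unrated and its title matches the author of a rated book. Book 3 is unrated too, but "Emma" is not a known author name, so it is left alone.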

Mike
Old 06-18-2013, 12:36 PM   #11
Adoby
Handy Elephant
Posts: 1,736
Karma: 26785668
Join Date: Dec 2009
Location: Southern Sweden, far out in the quiet woods
Device: Thinkpad E595, Ubuntu Mate, Huawei Mediapad 5, Bouye Likebook Plus
Yes, just copy metadata.db to make a backup.
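
A minimal Python sketch of such a copy, keeping a timestamp in the file name so earlier backups are not overwritten. `backup_db` and the naming scheme are assumptions for illustration, and it is probably safest to run this while calibre is closed so the file is not mid-write.

```python
import shutil
import tempfile
import time
from pathlib import Path


def backup_db(db_path) -> Path:
    """Copy the database next to itself with a timestamp; return the copy."""
    src = Path(db_path)
    stamp = time.strftime("%Y%m%d-%H%M%S")
    dest = src.with_name(f"{src.stem}-{stamp}.bak{src.suffix}")
    shutil.copy2(src, dest)  # copy2 also preserves modification times
    return dest


# Demo on a throwaway file; pass the path of your own metadata.db instead.
with tempfile.TemporaryDirectory() as tmp:
    original = Path(tmp) / "metadata.db"
    original.write_bytes(b"not a real database")
    copy = backup_db(original)
    same_content = copy.read_bytes() == original.read_bytes()
    copy_name = copy.name
print(copy_name)
```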

The books are stored in folders with paths that are constructed from the author and book title, so it is NOT enough to just swap values. When you update authors or title for a book in calibre, the folder for that book is updated as well. That is why a plugin might be better: it would allow moving the books as well.

I think it would be better to update some custom column with suggested action. Perhaps several custom columns with suggested new values for title, author and series. Then that can be reviewed and updated in calibre.

It could also be possible to make a script that creates a list of actions like swap author/title, and then use the command line tool calibredb to execute those changes from the same script that finds the actions to perform.

http://manual.calibre-ebook.com/cli/calibredb.html
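
A script along those lines might build one calibredb call per book. The Python sketch below only constructs the command and prints it; nothing is executed. It assumes the `set_metadata` subcommand with its `--field name:value` option and the `--with-library` option as described in the calibredb manual linked above, so check `calibredb set_metadata --help` on your version first. `swap_command` and the library path are hypothetical.

```python
import shlex


def swap_command(book_id: int, title: str, author: str, library: str) -> list:
    """Build (but do not run) a calibredb call that swaps title and author."""
    return [
        "calibredb", "set_metadata",
        "--with-library", library,
        "--field", f"title:{author}",   # the old author becomes the title
        "--field", f"authors:{title}",  # the old title becomes the author
        str(book_id),
    ]


cmd = swap_command(2, "Richard Dawkins", "The God Delusion", "/path/to/library")
print(shlex.join(cmd))
```

Printing the commands first gives a dry run that can be reviewed; once the list looks right, each command could be handed to `subprocess.run(cmd, check=True)`.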

Last edited by Adoby; 06-18-2013 at 12:47 PM.
Old 06-18-2013, 12:38 PM   #12
itimpi
Wizard
Posts: 4,552
Karma: 950151
Join Date: Nov 2008
Device: Sony PRS-950, iphone/ipad (Marvin/iBooks/QuickReader)
Quote:
Originally Posted by mike_bike_kite View Post
Well that wasn't too hard. I've installed SQLite, I found the Calibre database and I'm now trying out some queries. It might take me a week or so to learn SQLite and write the queries etc but it seems doable.

If I change the database, say perhaps swap a book title and an author name then will this show up cleanly in Calibre or do I need to do more? Can I just copy the metadata.db file to keep a backup of everything?

Mike
I think you can only use this to identify the titles that need changing, but still need to do the changes through the Calibre GUI. The folder/filenames used to store the eBooks are constructed from the author/title fields so changing them only in the database is likely to mean that Calibre loses track of the actual underlying eBook files.
Old 06-19-2013, 06:21 AM   #13
BetterRed
null operator (he/him)
Posts: 20,610
Karma: 26954694
Join Date: Mar 2012
Location: Sydney Australia
Device: none
@mike_bike_kite - I've done something similar to you: interrogated the database to identify potential 'problems', then used the output to create a .csv containing author and title.

Then I pushed the csv into the Import List plugin. Amongst other things, the plugin can create a Reading List (another plugin), so I ended up with Reading Lists like SwapAuthorTitle, RubbishInTitle etc.

Then I made the changes using standard Calibre facilities, many of which can be done in bulk, e.g. swap author & title. This addresses the issue Adoby & itimpi raised of keeping the database & folders in sync.
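
The "query output to .csv" step could look something like the Python sketch below. The `author,title` header is an assumption (the Import List plugin's expected column names are not stated here), and the sample rows are invented.

```python
import csv
import io


def rows_to_csv(rows) -> str:
    """Serialise (author, title) pairs as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["author", "title"])
    writer.writerows(rows)
    return buffer.getvalue()


csv_text = rows_to_csv(
    [("The God Delusion", "Richard Dawkins"), ("Austen, Jane", "Emma")]
)
print(csv_text)
```

Using the `csv` module rather than joining strings by hand means names containing commas, such as "Austen, Jane", are quoted correctly.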

I find this technique quite useful - I do something similar with the results from Windows and Spotlight content searches

BR

Last edited by BetterRed; 06-19-2013 at 06:43 AM.
Old 06-19-2013, 06:40 AM   #14
cybmole
Wizard
Posts: 3,720
Karma: 1759970
Join Date: Sep 2010
Device: none
comments after the book title in brackets
should be easy to detect with a bit of regex from within calibre? Find title strings which contain (
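
As a sketch of the kind of regex meant here, the Python below splits a bracketed chunk off the end of a title. The pattern and `split_trailing_comment` are assumptions for illustration; it only looks at the end of the string and does not handle nested brackets.

```python
import re

# A parenthesised or square-bracketed chunk at the very end of the title.
TRAILING_BRACKET = re.compile(r"\s*[\(\[][^\)\]]*[\)\]]\s*$")


def split_trailing_comment(title: str) -> tuple:
    """Return (clean_title, comment); comment is '' when nothing was found."""
    match = TRAILING_BRACKET.search(title)
    if not match:
        return title, ""
    return title[: match.start()], match.group().strip()


print(split_trailing_comment("The Selfish Gene (1976)"))
```

Returning the removed comment alongside the cleaned title makes it possible to review what would be thrown away before changing anything.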
Old 06-19-2013, 01:25 PM   #15
mike_bike_kite
Digitally confused
Posts: 500
Karma: 1500000
Join Date: Mar 2010
Location: London, UK
Device: KPW, K2i, Nexus 7 32gb, Kobo Mini
At the moment I'm just writing a SQL script that looks for common issues and outputs the book's id and the new title and author. It's progressing quite well: roughly half the books that couldn't be found are now found (we're talking thousands).

Regexp: I can't quite see how to use regexp with sqlite. It seems I have to write it myself, which seems a bit tough. This is a shame, as I could write some quite fancy checks that might clear another few hundred items.
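
SQLite defines the `REGEXP` operator but ships no implementation behind it by default; `X REGEXP Y` is evaluated as a call to a user function `regexp(Y, X)`. From a scripting language that function is short to supply. A minimal sketch using Python's standard `sqlite3` and `re` modules, on an invented toy table:

```python
import re
import sqlite3


def regexp(pattern, value):
    """Backing function for SQLite's REGEXP operator (pattern comes first)."""
    if value is None:
        return 0
    return int(re.search(pattern, value) is not None)


conn = sqlite3.connect(":memory:")
conn.create_function("REGEXP", 2, regexp)
conn.execute("CREATE TABLE books (id INTEGER PRIMARY KEY, title TEXT)")
conn.executemany(
    "INSERT INTO books (title) VALUES (?)",
    [("The Selfish Gene (1976)",), ("Emma",), ("Dune (2003)",)],
)
# Titles ending in a four-digit year in brackets
with_year = conn.execute(
    r"SELECT title FROM books WHERE title REGEXP '\(\d{4}\)\s*$' ORDER BY id"
).fetchall()
print(with_year)
```

The function is registered per connection, so it has to be set up each time the script opens the database.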

When I'm finished I'll just pass on the script and hopefully someone else will be able to put it into an add-on etc.

All times are GMT -4.