Old 02-17-2011, 11:24 AM   #1
Doug-W
Merging books with same format

I had a library with some books and a second library with more books, some of which may have been improved versions of books in the first library. I imported the new books into the first library, which left me with a series of duplicate books. In this case, though, I knew that wherever there was a duplicate, the original book had the most up-to-date metadata, so I only wanted to merge books that share the same title and have the formats of the newer book overwrite the formats of the older one.
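To make the merge rule concrete, here is a minimal sketch that uses plain dictionaries instead of calibre's database. The function name merge_formats and the sample ids and formats are made up for illustration; the only assumption is that a lower id means an older book.

```python
def merge_formats(copies):
    """Merge copies of one title; `copies` is a list of (book_id, formats) pairs.

    The lowest id is treated as the original and keeps its metadata; formats
    from later ids overwrite formats of the same type.
    """
    copies = sorted(copies, key=lambda pair: pair[0])
    base_id, base_formats = copies[0]
    merged = dict(base_formats)
    for _book_id, formats in copies[1:]:
        # Later (newer) copies win when the format type is the same
        merged.update(formats)
    return base_id, merged


# Hypothetical data: book 12 is the original, book 57 the newer import
base_id, merged = merge_formats([
    (12, {'EPUB': 'old epub', 'PDF': 'old pdf'}),
    (57, {'EPUB': 'new epub'}),
])
print(base_id, merged)
```

The older book keeps its id and its PDF, while its EPUB is replaced by the newer one.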

So after an hour or so with database2.py, I have this snippet, which I ran under calibre-debug:

Code:
import cStringIO

from calibre.library.database2 import LibraryDatabase2

db = LibraryDatabase2('/path/to/library/folder')

# Titles that occur more than once in the library
dupes = db.conn.get('SELECT title FROM books GROUP BY title HAVING count(*) > 1')

for (title,) in dupes:
    # Oldest copy first; it has the up-to-date metadata, so it is the one we keep
    ids = [row[0] for row in db.conn.get(
        'SELECT id FROM books WHERE title=? ORDER BY id', (title,))]
    base_id = ids.pop(0)
    # Remaining copies go oldest to newest, so the newest format is written last
    for book_id in ids:
        formats = db.conn.get('SELECT format FROM data WHERE book=?', (book_id,))
        for (fmt,) in formats:
            data = db.format(book_id, fmt, index_is_id=True, as_file=False)
            if not data:
                continue
            stream = cStringIO.StringIO(data)
            # Overwrites any format of the same type already on the base book
            db.add_format(base_id, fmt, stream, index_is_id=True, notify=False)
            db.remove_format(book_id, fmt, index_is_id=True, commit=False)
        db.delete_book(book_id, commit=False)
db.conn.commit()
db.clean()
And all duplicate books are merged, with formats overwriting one another. Hope that helps, or that someone can come up with a better way of handling it.
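Since the script deletes books, it may be worth checking first which titles it would touch. Below is a rough dry-run sketch using Python's standard sqlite3 module on a throwaway in-memory table. It only assumes a books table with id and title columns, which is what the queries above use; the function name find_duplicates and the sample rows are made up for illustration, and nothing here goes through calibre's own API.

```python
import sqlite3


def find_duplicates(conn):
    """Return {title: [ids, oldest first]} for titles that occur more than once."""
    titles = conn.execute(
        'SELECT title FROM books GROUP BY title HAVING count(*) > 1')
    result = {}
    for (title,) in titles.fetchall():
        rows = conn.execute(
            'SELECT id FROM books WHERE title=? ORDER BY id', (title,))
        result[title] = [row[0] for row in rows]
    return result


# Throwaway in-memory table standing in for a library (hypothetical data)
conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE books (id INTEGER PRIMARY KEY, title TEXT)')
conn.executemany('INSERT INTO books (id, title) VALUES (?, ?)',
                 [(1, 'Dune'), (2, 'Emma'), (3, 'Dune'), (4, 'Dune')])

# Report what a merge would keep and what it would fold in, without changing anything
for title, ids in sorted(find_duplicates(conn).items()):
    print('%s: keep %d, merge in %s' % (title, ids[0], ids[1:]))
```

The first id in each list is the book that would survive; the rest are the copies whose formats would be moved over before they are deleted.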