Old 01-24-2011, 09:52 AM   #17
Starson17
Wizard
Starson17 can program the VCR without an owner's manual.
Posts: 4,004
Karma: 177841
Join Date: Dec 2009
Device: WinMo: IPAQ; Android: HTC HD2, Archos 7o; Java:Gravity T
Quote:
Originally Posted by kiwidude View Post
If you do a search you will find plenty of other threads here discussing the problems, existing behaviour, workarounds, sql reporting etc. I won't rehash it all here.
Yes, there are lots of workarounds for specific issues and there is general agreement that improvements can be made. One problem is that this issue arises primarily during the initial import stage, when lots of books are being imported. Any developers in that first stage add fixes to solve the worst problems, but then move on to other things as their backlog of books to be added drops.

Quote:
I will say that Calibre does not match just on title - it is title and author, and there is a little bit of "fuzziness" in terms of things like leading "The" etc improving the match logic.
When the autosort/automerge option is on, you're correct (that's mostly my code), but when it's off, the initial post was right - Calibre looks only at title. In autosort/automerge mode (the "similar author/title found" option in Prefs|Import/Export|Adding Books), it adds the first unique format found for any author/title combo, then skips any later file in a format that the fuzzy-matched author/title entry already has.
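For anyone who finds it easier to follow as code, here is a rough sketch of that merge behaviour. It is not the actual Calibre source: the names `fuzzy_key` and `automerge` are made up for illustration, and the normalization rules (lowercasing, dropping a leading article from titles, ignoring punctuation) are only my assumption of what "fuzzy" might cover.

```python
import re

def fuzzy_key(text, strip_article=False):
    """Normalize text for loose comparison (hypothetical rules, not Calibre's)."""
    text = text.lower().strip()
    if strip_article:
        # Drop a leading "The", "A" or "An" (titles only).
        text = re.sub(r"^(the|a|an)\s+", "", text)
    # Treat punctuation and runs of whitespace as a single space.
    return re.sub(r"[^a-z0-9]+", " ", text).strip()

def automerge(files):
    """Merge (author, title, fmt) tuples, given in import order.

    Returns (library, skipped). library maps a fuzzy (author, title) key to a
    dict of {FORMAT: original tuple}; skipped lists the files that were dropped
    because that book already had a file in the same format.
    """
    library = {}
    skipped = []
    for author, title, fmt in files:
        key = (fuzzy_key(author), fuzzy_key(title, strip_article=True))
        formats = library.setdefault(key, {})
        if fmt.upper() in formats:
            skipped.append((author, title, fmt))    # duplicate format: dropped
        else:
            formats[fmt.upper()] = (author, title, fmt)  # first of its format: kept
    return library, skipped
```

Note that the first file seen in each format wins, whatever its quality, which is exactly the limitation being discussed in this thread.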

Quote:
However I 100% agree that if like me you turn the preference on so that all books added will merge automatically (which is what you want for new formats of a book to be the same record in Calibre), then it does NOT handle the situation of the same format being added very well. The existing dialog telling you "after the event" that it "merged something" without even telling you which format it threw away is as you say not very useful.
Smile when you say that pardnuh! That's my code! Actually, the notification is Kovid's code - mine was worse - it just dumped any duplicates with no warning. It was written to solve a very specific problem. I wanted to add my existing library and get one entry for each author/title book, and keep one of each format for that book. I really had no idea which of any duplicate formats was best, and insufficient time to look at every one. The problem was that the existing duplicate detection simply compared titles (and it still does if my autosort/automerge option is off). If titles matched, even for different formats and different authors, it asked if you wanted to add as separate entries. There wasn't even a manual merge function then. So adding the autosort/automerge option got me what I wanted - a structure of all unique author/title (fuzzy title matched) books where one of every format was kept for each book.
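To make the contrast concrete, the older title-only check behaves roughly like the sketch below. Again, this is an illustration and not Calibre's code: `title_only_duplicates` is a hypothetical name, and the case-insensitive comparison is my assumption.

```python
def title_only_duplicates(existing_titles, incoming):
    """Flag incoming (author, title, fmt) tuples whose title is already known.

    Author and format are ignored, so a different book that happens to share
    a title is flagged just like a true duplicate.
    """
    seen = {title.lower() for title in existing_titles}
    flagged = []
    for author, title, fmt in incoming:
        if title.lower() in seen:
            flagged.append((author, title, fmt))  # user would be asked about this one
        else:
            seen.add(title.lower())
    return flagged
```

With "Emma" already in the library, both a new MOBI of the same book and an unrelated "Emma" by another author get flagged, which is why the title-only check was so noisy on a bulk import.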

In my case, I usually had only one really good master format and most of my duplicates were converted from that master. The master format would always be added to Calibre.

My plan was to worry about the "best" format later. If I was unhappy with a format when I went to read it, I could look to see if I had a better one that was skipped during the import. Usually I would find one good master format in the record and could use Calibre's excellent conversion capabilities to get a copy that was even better than whatever had been skipped.

Quote:
It's on the list to be improved further I believe but as I understand it there are other priorities first.
Yes, but there's plenty of room for others to dive in and improve it now. For me, all of my original book library is in, and I now add only a few books a month. I don't need the bulk importing code that I spent so much time writing.