10-09-2011, 04:31 AM | #151 |
Calibre Plugins Developer
Posts: 4,637
Karma: 2162064
Join Date: Oct 2010
Location: Australia
Device: Kindle Oasis
|
Choosing "Similar Author", "Ignore Title" will find that combination. Similar Author will find permutations of the author names transposed like you have above.
None of the "Title" matching options would directly match that title, hence "Ignore Title". However, the Quality Check plugin has a "Check titles with series" option which would flag the first book, "Cotton Malone 06 - The Emperor's Tomb", so you could fix that title up. A "Similar Author", "Similar Title" (or "Identical Title") type search could then be used. |
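A rough illustration of why a "similar author" style comparison can treat "Steve Berry" and "Berry, Steve" as the same person. This is only a sketch of the general idea (drop punctuation, ignore word order); the function name `author_key` is made up for the example and it is not the plugin's actual matching code.

```python
import re

def author_key(name: str) -> str:
    """Reduce an author name to an order-insensitive key (illustrative only)."""
    # Lower-case the name and keep only runs of letters/digits, which drops
    # punctuation such as the comma in "Berry, Steve".
    words = re.findall(r"[a-z0-9]+", name.lower())
    # Sort the words so that transposed names produce the same key.
    return " ".join(sorted(words))

# Both spellings collapse to the same key, so they would be grouped together.
print(author_key("Steve Berry") == author_key("Berry, Steve"))
```

A key this loose will also group every other book by the same author, which is why a title comparison is normally needed as a second step.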
10-09-2011, 09:32 AM | #152 |
Addict
Posts: 320
Karma: 56788
Join Date: Jun 2011
Device: Kindle
|
Ok, but if the version I want to keep is the "Cotton Malone 06" version, I'm SOL. Well, not SOL... I just have to manually check for duplicates myself, the way I did before I came across the Find Duplicates plug-in. Yay or nay?
|
10-09-2011, 10:27 AM | #153 |
Calibre Plugins Developer
Posts: 4,637
Karma: 2162064
Join Date: Oct 2010
Location: Australia
Device: Kindle Oasis
|
Nay. Read the first line of my post again.
And keeping your titles that way is *not* the recommended practice with calibre - use the series column for its intended purpose. If you want to control how titles are sorted on your device/when exported, look into metadata plugboards and save to disk templates. |
10-09-2011, 09:04 PM | #154 | ||
Addict
Posts: 320
Karma: 56788
Join Date: Jun 2011
Device: Kindle
|
Quote:
Wait, that's not right. Check me if I'm wrong, but if I ignore the title, then Find Duplicates will flag everything by "Steve Berry" and by "Berry, Steve", which is essentially the same as my doing a search for "authors:Berry". And in the event that the author fields were both "Berry, Steve", ignoring the title and searching for similar authors via FD would omit the "Berry, Steve" results entirely.
Don't get me wrong, I've gotten loads of use out of Find Duplicates (and no doubt will continue to do so). My question was fundamentally whether there was a way for the plug-in to recognize the two example titles as duplicates. The answer to that appears to be "No." Such is life.
As I say, Find Duplicates is meeting 95% of my library management needs, so I'm really and truly not bitching. I just needed confirmation that my inability to do that particular search was because it was impossible, not because I was merely doing it wrong.
Quote:
|
||
10-10-2011, 03:28 AM | #155 |
Calibre Plugins Developer
Posts: 4,637
Karma: 2162064
Join Date: Oct 2010
Location: Australia
Device: Kindle Oasis
|
Your original question was what search would bring those two books back, to which the answer is Similar Author, Ignore Title. Yes, that will bring other books back for the author combinations as well, depending on how many other books they have. However, the recommendation I put with this plugin, as per the first post, is that you use a search like this first to find your author duplicates, and then find your title duplicates.
In this particular case, having a title that is prefixed with series information means that a more granular match on title only cannot be made. If the difference between the titles were some kind of suffix, then it could likely be matched with a fuzzy type search or something.
You asked if you would be no better off than without using the plugin - I would argue that using the plugin to identify the author duplicates, and hence present a subset of your library to you in that group to visually scan, is way better than scanning without it. However, as I suggested, if you follow the crowd in stripping series information out, such as by using Quality Check to identify the affected titles, that issue goes away.
As to why not to put series information in your titles - there are certainly multiple areas of Calibre which will "break" or work less than ideally if you do. One such example is the Download Metadata function, as the download metadata plugins compare the title of the book in Calibre with the search result from the website to verify they are the same book (unless matched by ISBN). Duplicate detection at the time you add books won't work. Numerous other plugins will also not work very well - things like Search the Internet and, as you have found, Find Duplicates, for a start, without thinking too hard. There could well be other parts of Calibre affected by having extra long title names etc. Renumbering a series? Well, that's going to involve renaming the title for you, which means a different file between the Kindle and calibre, which means a mess of duplicates on your device.
It is just generally a bad idea; you are working against Calibre, not with it. The series column is specifically there for this purpose. The only reason I know of for people putting series information in the titles is that they thought they had to, as the way to get books sorted how they like either within Calibre or when exported out. In both scenarios there are better ways to handle this. To sort within Calibre, you can do a sort on multiple columns (the View Manager plugin can help with this). To assist sorting when stored on a device, if the device uses metadata rather than filenames for sorting (like a Kindle), then use a metadata plugboard; there is a sticky in the forums for how to set it up. Do it once, then forget about it.
There are no valid reasons I have seen for not stripping series out - beyond users just not being aware of how to otherwise manage their sorting on the device. And that is to be expected, by the way - I wouldn't see Calibre as the most intuitive or easy to use in that area; users will have to come hunting in the forums to figure that part out.
Last edited by kiwidude; 10-10-2011 at 03:33 AM. |
10-10-2011, 09:14 PM | #156 |
Addict
Posts: 320
Karma: 56788
Join Date: Jun 2011
Device: Kindle
|
Maybe if your answers weren't so knowledgeable, I wouldn't ask so many questions...
As you rightly guessed, my reason for defining the titles as I do is a mix of both sorting within and without Calibre. The download metadata function doesn't give me too much trouble because I usually wait until after I've downloaded it to start messing around with the title display. Nonetheless, it's a method that I concede is less than fully efficient, particularly to the degree it interferes with some of the most useful plugins. I will look into both metadata management tools that you mention. As always, thank you for your thoughtful insight. |
10-11-2011, 04:07 AM | #157 |
Calibre Plugins Developer
Posts: 4,637
Karma: 2162064
Join Date: Oct 2010
Location: Australia
Device: Kindle Oasis
|
Glad you found the info useful. Good luck with your renaming. You will find the bulk search/replace feature useful for this with regular expressions - ask in the library mgmt forum (or search - it has been asked before) if you need help with that.
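For anyone wondering what such a regular expression might look like, here is a small standalone sketch in Python. It assumes the titles follow the pattern "series name, number, space-hyphen-space, title" as in the "Cotton Malone 06" example; the pattern and the helper name `split_series_prefix` are made up for illustration, so test against your own titles before using anything like it in the bulk search/replace dialog.

```python
import re

# Assumed layout: "<series name> <number> - <real title>"
SERIES_PREFIX = re.compile(
    r"^(?P<series>.+?)\s+(?P<index>\d+)\s+-\s+(?P<title>.+)$"
)

def split_series_prefix(raw_title):
    """Return (series, index, title); series and index are None if no prefix."""
    match = SERIES_PREFIX.match(raw_title)
    if match is None:
        # Title does not look like it carries series information.
        return None, None, raw_title
    return match.group("series"), int(match.group("index")), match.group("title")

print(split_series_prefix("Cotton Malone 06 - The Emperor's Tomb"))
print(split_series_prefix("The Emperor's Tomb"))
```

The three captured pieces correspond to what you would move into the series column, the series index and the title respectively.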
|
11-06-2011, 01:28 PM | #158 |
Member
Posts: 12
Karma: 10
Join Date: Dec 2010
Device: Nook Color; Axim X51v; Cruz Reader R101
|
Compare Two Libraries For Duplicates
I will accumulate a modest number of books and then I add them to a temporary library. I work on the metadata in this small setting. I find it much easier than doing it in my main library. After I am satisfied, I copy them (with delete) to the main library. To this end, I would find it handy to find duplicates before adding to the main library. IOW, I would like to compare two libraries for duplicates. Would there be any chance of the plugin being modified to do this?
Is this a boneheaded way to work? Am I missing something about how Calibre should be used? It's worked well for me so far. It allows bulk metadata edits in a manageable setting. w |
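To make the request concrete, here is a hypothetical sketch of what "compare two libraries for duplicates" could mean. It does not touch calibre or the plugin at all: each library is just a plain list of (title, author) pairs with made-up sample data, and the function names are invented for the example.

```python
import re

def _words(text):
    # Lower-case and keep only runs of letters/digits (drops punctuation).
    return re.findall(r"[a-z0-9]+", text.lower())

def book_key(title, author):
    # Title keeps its word order; author words are sorted so that
    # "Berry, Steve" and "Steve Berry" produce the same key.
    return (" ".join(_words(title)), " ".join(sorted(_words(author))))

def cross_library_duplicates(temp_books, main_books):
    """Return the books in temp_books whose key also occurs in main_books."""
    main_keys = {book_key(title, author) for title, author in main_books}
    return [
        (title, author)
        for title, author in temp_books
        if book_key(title, author) in main_keys
    ]

# Made-up sample data standing in for a temporary and a main library.
temp_library = [("The Emperor's Tomb", "Berry, Steve"), ("Some New Book", "A. Writer")]
main_library = [("The Emperor's Tomb", "Steve Berry"), ("Another Title", "Steve Berry")]
print(cross_library_duplicates(temp_library, main_library))
```

Building a set of keys for the main library first keeps the comparison to one pass over each list, rather than comparing every pair of books.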
11-06-2011, 02:12 PM | #159 | |
Enthusiast
Posts: 26
Karma: 22
Join Date: May 2011
Device: Kindle 3
|
Quote:
When adding books to a library, Calibre already does a rudimentary duplicate check on the title. I would love to see this check extended with functions like merge with existing etc. I posted on the bugtracker here: https://bugs.launchpad.net/calibre/+bug/869506 In addition, I would like to have it activated in the copy function as well. As far as I have understood from various other posts, Calibre can only work on one library at a time, but having the "Find Duplicates" plug-in extended to work across libraries would be a fantastic alternative. Kind regards |
|
11-06-2011, 09:44 PM | #160 |
Groupie
Posts: 156
Karma: 10001
Join Date: Feb 2011
Device: sony
|
@windom
That's pretty much my workflow too, and I find it easy to work with the one library at a time limitation: When I don't think a batch has many books that will duplicate/replace existing ones, I fix up the metadata, etc and move them into the main library as you described. When I think I have many duplicates, and don't want to waste my time unnecessarily (re)processing duplicates, here's what I do: Clean up the authors & titles, and run Extract ISBN & Count Pages (words). Then I move them to the main library and run Find Duplicates. Then I can make pretty good decisions about what to do with the new duplicates -- junk them, replace the old copy & preserve the metadata (merge them), or send them back to the templib for more processing, along with the non-duplicate new books. It's pretty easy to sort the library by Date [added] to identify the books I just added, and now want to move back to the templib for further processing. Having said that, I would certainly use a multi-library duplicate finder if it existed -- but I'm pretty comfortable with the current setup and adding multi-library capability seems like a lot of work. |
11-07-2011, 06:29 PM | #161 |
Calibre Plugins Developer
Posts: 4,637
Karma: 2162064
Join Date: Oct 2010
Location: Australia
Device: Kindle Oasis
|
@windom - as the others have said, nothing "bone-headed" about it. People use calibre in lots of different ways, whatever works for you. My own workflow involves multiple databases, but as I do my cleanup in the "clean" database and it is one author at a time I'm never in a situation of having duplicates between libraries.
My initial response to this request a few months ago would have been "it isn't technically possible", under the understanding that calibre is a "one library at a time" system. However, I know Kovid responded a few weeks ago to another thread in these forums discussing cross library database access stuff (perhaps in the Development forum?) where he indicated it actually is "possible", at least for some functionality. However, given that Kovid is still working on replacing the back-end layers of calibre and deprecating some of the API along the way, I have no intention of creating even more work for myself down the line by looking into this now. Perhaps when that is released I (or someone else) can revisit it to see what would be involved/possible. |
11-07-2011, 09:51 PM | #162 |
creator of calibre
Posts: 43,860
Karma: 22666666
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
To clarify: The calibre user interface is one db at a time. This means the GUI, the content server, the Tag Browser, the search etc are all designed to operate on a single db at a time. The actual database code makes a couple of minor assumptions (global variables) about there being only one db per process, but these are easily worked around, as is done in the case of the copy to library function.
|
11-11-2011, 10:00 AM | #163 |
Vox calibre
Posts: 412
Karma: 1175230
Join Date: Jan 2009
Device: Sony reader prs700, kobo
|
Hi, just a nitpick, but it should probably be corrected. In the duplicate search type, if one chooses ISBN or Binary, the columns below still have the headings title matching and author matching. These should be changed to a single column of ISBN matching for the ISBN duplicate search type and Binary matching for the Binary duplicate search type.
|
11-11-2011, 02:43 PM | #164 |
Calibre Plugins Developer
Posts: 4,637
Karma: 2162064
Join Date: Oct 2010
Location: Australia
Device: Kindle Oasis
|
Hi Krittika,
I'm not quite sure what you mean - all the radio button contents are disabled? It is a limitation of Qt that it has no ability to "disable" a GroupBox, which I think is the heading you refer to? One thing I could do is make those Title/Author group box sections invisible when you select ISBN or Binary. It makes the content jump around a bit, but if people see that as less confusing then it is an easy change. |
11-11-2011, 09:40 PM | #165 |
creator of calibre
Posts: 43,860
Karma: 22666666
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
@kiwidude: I'd suggest hiding. You can set it up so it doesn't cause a jump in the content, by also inserting/removing a spacer when hiding.
|
Tags |
cross library duplicates, in library duplicates |
|