01-01-2013, 04:10 PM | #361 |
Junior Member
Posts: 6
Karma: 10
Join Date: Jun 2005
|
Similar Stories won't do what I'm looking for at all. It's completely different.
It looks at a single item, creates an index, and looks for matching items, ranking every single item via a new column. That's completely unneeded, and counterproductive. Duplicate Finder creates a hash table, which is the right way of doing this. The difference is between creating a hash of the entire file and creating a hash of only the items that matter while ignoring those that don't (the metadata.opf file, for example, is guaranteed to be different in small ways if altered at all).
If you aren't interested, fine. Kovid suggested your plugin, and I gave you (and him) the courtesy of asking here, even though you clearly aren't interested. My original post (linked above) points to tools outside of Calibre that are useful for this sort of comparison, and frankly, I see the value of doing it from within Calibre (especially for importing a large quantity of fresh books into an already large library, perhaps from completely public domain sources...) even if you still don't.
I hope someone else out there finds the pointer useful, and I hope someone creates a plugin that will do this sort of dupe checking (content based hashes, not file based hashes) one of these days.
Last edited by sethcohn; 01-01-2013 at 04:11 PM. Reason: grammar correction |
01-01-2013, 07:45 PM | #362 |
Calibre Plugins Developer
Posts: 4,637
Karma: 2162064
Join Date: Oct 2010
Location: Australia
Device: Kindle Oasis
|
I only mentioned Similar Stories because I knew it was content based. I've never used it, nor did I write it, so sorry to hear it does not suit what you need.
Find Duplicates has always had two very intentional limits placed on its scope: first, that it does not compare content, and second, that it does not limit itself to a particular format. Comparing content across your library (as it sounds like Similar Stories does) is, as you know, an exceptionally slow operation and not at all appropriate to this plugin; neither is doing an epub-specific comparison, which is what you are after.
For the majority of users, books will be identified by some aspect of their metadata as being duplicates. I have repeated ad infinitum in this thread that how a user chooses to "resolve" those duplicates is outside of what this plugin covers. The decision to merge, delete, or indeed do a more complex binary comparison of the contents of identified "duplicates" is all in the realm of either existing features of calibre or potentially new plugins for other users to write. I would suggest the latter if such a feature is important enough for you to justify the effort.
Personally I have absolutely no use for it, which is another reason why I have no interest in spending hours trying to craft it, and equally, as explained above, I do not see it as being an appropriate feature of this plugin. There is never any harm in asking or suggesting a new feature, but as always I reserve the right to say no to ones I don't think fit well. Good luck with whatever solution you pursue. |
01-01-2013, 10:16 PM | #363 | |||
Junior Member
Posts: 6
Karma: 10
Join Date: Jun 2005
|
unzip -qvl epubname.epub -x '*.opf' [& other metadata-y files] | cut -c 49-56 | sort | md5
[This takes the crc32 values of each file in the zip except those listed, sorts them so the crc32s are in a known order, and generates an md5. It works perfectly for identifying things without ever unpacking the file, and should be lightning fast to generate.]
As for a 'particular format', there are plenty of 'epub' only plugins, or items that only work on epubs, such as your addition here: https://www.mobileread.com/forums/sho...&postcount=482
In this case, don't generate hashes if it's not appropriate (I wonder if a similar method for mobis/etc would work though... ideas?)
Last edited by sethcohn; 01-01-2013 at 10:24 PM. |
|||
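The unzip pipeline in the post above can be approximated in Python roughly as follows. This is only a sketch of the same idea (hash the sorted CRC-32 values of the archive members, skipping metadata files), not code from Find Duplicates or calibre. The names `content_fingerprint`, `skip_suffixes`, and `make_zip` are made up for illustration, and the resulting digest will not equal the one printed by the shell pipeline, since the input to md5 is formatted differently.

```python
import hashlib
import io
import zipfile


def content_fingerprint(epub, skip_suffixes=(".opf",)):
    """Hash the sorted CRC-32 values of an archive's members.

    `epub` may be a path or a binary file-like object. Members whose
    names end with one of `skip_suffixes` (an assumed list of
    "metadata-y" files) are ignored, so metadata-only edits do not
    change the fingerprint.
    """
    with zipfile.ZipFile(epub) as zf:
        crcs = sorted(
            f"{info.CRC:08x}"
            for info in zf.infolist()
            if not info.is_dir()
            and not info.filename.lower().endswith(skip_suffixes)
        )
    # The CRCs come from the zip directory, so nothing is decompressed.
    return hashlib.md5("\n".join(crcs).encode("ascii")).hexdigest()


def make_zip(files):
    """Build an in-memory zip from {name: bytes}; only for trying things out."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        for name, data in files.items():
            zf.writestr(name, data)
    buf.seek(0)
    return buf
```

Two archives with the same content files but a different metadata.opf should then give the same fingerprint, while any change to a content file changes it. Because only the CRC values are compared, two different files that happen to share a CRC-32 would be treated as identical, which is a limitation of the original pipeline as well.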
01-02-2013, 11:09 PM | #364 |
creator of calibre
Posts: 43,860
Karma: 22666666
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
@kiwidude: See https://bugs.launchpad.net/bugs/1095316
You need to pass is_second_db=True when constructing the second LibraryDatabase2 object to avoid clobbering the saved searches. |
01-03-2013, 03:07 AM | #365 |
Calibre Plugins Developer
Posts: 4,637
Karma: 2162064
Join Date: Oct 2010
Location: Australia
Device: Kindle Oasis
|
Ok, thanks Kovid, I will push a new version.
|
01-03-2013, 03:19 AM | #366 |
Calibre Plugins Developer
Posts: 4,637
Karma: 2162064
Join Date: Oct 2010
Location: Australia
Device: Kindle Oasis
|
v1.6.1 Released
Changes in this release:
|
01-03-2013, 06:21 AM | #367 |
Well trained by Cats
Posts: 29,809
Karma: 54830978
Join Date: Aug 2009
Location: The Central Coast of California
Device: Kobo Libra2,Kobo Aura2v1, K4NT(Fixed: New Bat.), Galaxy Tab A
|
Kiwidude
I noticed something after running this version (it may have been there before). After clearing the (search) results (after fixing issues), the column sort indicator still shows what I started with (last_Modified), but the actual sort is NOT that (it might be by Title). Do you need to clear/move the sort marker, since you have released control to display the results and don't want to use the wrong sort? |
01-03-2013, 07:13 AM | #368 |
Calibre Plugins Developer
Posts: 4,637
Karma: 2162064
Join Date: Oct 2010
Location: Australia
Device: Kindle Oasis
|
No behaviour has been changed with regards to sorting in this version.
|
01-03-2013, 10:49 AM | #369 |
Well trained by Cats
Posts: 29,809
Karma: 54830978
Join Date: Aug 2009
Location: The Central Coast of California
Device: Kobo Libra2,Kobo Aura2v1, K4NT(Fixed: New Bat.), Galaxy Tab A
|
Then it has probably been this way forever, and it took me all this time to notice.
Working around it is easy; I just need to remember to re-sort the columns after clearing the 'marked' list. |
01-04-2013, 05:31 AM | #370 |
Addict
Posts: 302
Karma: 1250000
Join Date: Jan 2011
Location: UK
Device: Kobo Libra, iPadAir2, PRS600, iPhone 6, iPod, Palm TX
|
|
01-08-2013, 11:38 AM | #371 |
Zealot
Posts: 114
Karma: 6288
Join Date: Dec 2012
Device: iphone
|
I love this plugin. Especially that I can compare two libraries by Goodreads ID.
I have a request for further development of this plugin. I would love to see it compare two libraries by series: not by index number, just by series. So if library A had books Alex Delaware [1] and Alex Delaware [2] and library B had Alex Delaware [10] and Alex Delaware [11], they would all come back as a match. This would be so helpful for me. Just a thought. Again, thank you for developing Calibre and all of the amazing plugins that go with it. |
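To make the request above concrete, here is a small sketch of what "match by series, ignoring the index" could mean. It does not use the calibre API or anything from Find Duplicates: books are plain dicts, and the function name `match_by_series` and the `title`/`series` keys are assumptions chosen for the example.

```python
from collections import defaultdict


def match_by_series(library_a, library_b):
    """Return {series: (titles_in_a, titles_in_b)} for series found in both.

    Series names are compared case-insensitively and the series index
    is deliberately ignored.
    """
    def by_series(books):
        groups = defaultdict(list)
        for book in books:
            series = (book.get("series") or "").strip().casefold()
            if series:  # books without a series never match
                groups[series].append(book["title"])
        return groups

    a, b = by_series(library_a), by_series(library_b)
    return {name: (a[name], b[name]) for name in sorted(a.keys() & b.keys())}
```

With the Alex Delaware example from the post, books [1] and [2] in library A and [10] and [11] in library B would all land in the same group, because only the series name is compared.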
01-08-2013, 07:49 PM | #372 |
Resident Curmudgeon
Posts: 74,015
Karma: 129333114
Join Date: Nov 2006
Location: Roslindale, Massachusetts
Device: Kobo Libra 2, Kobo Aura H2O, PRS-650, PRS-T1, nook STR, PW3
|
There is one thing I'd like with this plugin. I'd like the binary compare to be able to compare just the books I have selected.
|
01-13-2013, 02:09 PM | #373 |
Addict
Posts: 224
Karma: 10
Join Date: Jul 2012
Device: Kindle
|
Kiwidude,
I use both the Find Duplicates and Import List plugins; both have helped me streamline managing large lists of ebooks. One thing I need to do is separate out Sci-Fi authors from all the other authors, independent of any tags. I presently do this the long way around, as I can't see a way to do it in Calibre with these plugins.
1. I have a spreadsheet with a column containing over 1,900 sci-fi author names, and variants of their names.
2. I export a given batch of ebooks to a directory, with a folder for each author's name.
3. I then use a Windows utility to copy the names of the folders, paste that list into the spreadsheet, and do a compare there using the vlookup function.
4. I take the resultant 'hits' and make a new list that is the sci-fi authors only.
5. Using that list I manually cut & paste each of those authors' folders into a new folder for sci-fi ebooks only.
As you can see it is a bit tedious! My attempt to make Calibre/Find Duplicates work is to create a "Library" in Calibre that contains a list of my 1,900 sci-fi authors, using two columns only: the Author column (the one of interest) and a dummy Title column containing "Various Titles" for every author. Then, when I get a new batch that I want to sort the sci-fi authors out from, I do a "Library Compare" and select my Sci-Fi Authors List. I can't use the title to compare of course, so I select "Ignore" for the Title, and "Similar" or "Exact" for the Author. However, when I use "Ignore" for the Title, it does not generate a list of books where the authors match! That is, I cannot get a list of ebooks I can "Select All" and then save to their own directory, as I can if I do a normal compare on Author AND Title. If there is a way to make it work like this, I sure can't see how! So, yes, I would like to see a feature that would allow this, unless it is already there and I just can't see it. Thanks.
And apologies for the lengthy post; just wanted to make sure I explained the issue/need for this capability! Monty |
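The spreadsheet workflow described above amounts to checking each book's authors against a known list. A rough sketch of that check, outside of calibre and Find Duplicates, might look like this; `normalize_author`, `books_by_known_authors`, and the dict layout of a book are all hypothetical names invented for the example, and the normalization is much simpler than whatever the plugin's "Similar" matching does.

```python
def normalize_author(name):
    """Fold case and whitespace, and turn "Last, First" into "first last"."""
    if "," in name:
        last, first = name.split(",", 1)
        name = f"{first} {last}"
    return " ".join(name.casefold().split())


def books_by_known_authors(books, known_authors):
    """Return the books that have at least one author in `known_authors`."""
    known = {normalize_author(author) for author in known_authors}
    return [
        book
        for book in books
        if any(normalize_author(author) in known for author in book["authors"])
    ]
```

The set lookup replaces the vlookup step: the 1,900 names are normalized once, and each new batch is filtered against them without needing a dummy title.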
01-13-2013, 11:10 PM | #374 |
Dedicated
Posts: 441
Karma: 11279376
Join Date: Jun 2012
Location: Amarillo, TX
Device: iPad Mini 1 & 4, Nook ST, Dell 11-3000, iPhone 5s
|
Deleted libraries are in Find Duplicates "Cross Library Search"
Hello kiwidude,
I noticed that in Find Duplicates "Cross Library Search Options" the dropdown list of libraries contains a few calibre libraries that I have deleted. I had created a few test libraries as a sandbox for learning calibre; they have been deleted using the "Remove Library" function and the folders have also been deleted. Is there a relatively easy way to "prune deadwood" from the dropdown library list? Hope you're having a good New Year ... Thanks, Marty |
01-14-2013, 09:31 AM | #375 |
Enthusiast
Posts: 37
Karma: 41
Join Date: Nov 2011
Location: North Kingstown, RI, USA
Device: Kindle DX,Nexus 10,Fire HD
|
Change Request:: Add hook to select View Manager view for Duplicates list
First off, I have been able to save countless hours using your plug-ins. They are well written, well documented and easy to use/configure. And as an extra bonus, you fully support your plug-ins as well by providing valuable feedback and guidance on the threads. For this, I thank you.
I use most of your plug-ins, but this request concerns Find Duplicates and View Manager. Request:: Would there be a way to add a hook into Find Duplicates which would allow a View Manager view to be used when displaying the found duplicates? Reason:: I have added numerous custom columns and use View Manager to allow me to view my books with different columns/sorts (even with my resolution set to 1920 x 1080, I can normally only fit about 7 - 10 columns per view). I have added a view with format and quality columns which is helpful when determining which duplicate to retain. If I forget to set the view before I run Find Duplicates, I then need to re-run Find Duplicates after I set the view (because the view's sort is applied, which messes up the dup groupings). If the Find Duplicates plug-in were to switch to a specified View Manager view before processing/displaying dups, that would save some time. Thank you for your consideration. |