Old 01-01-2013, 04:10 PM   #361
sethcohn
Junior Member
sethcohn began at the beginning.
 
Posts: 6
Karma: 10
Join Date: Jun 2005
Similar Stories won't do what I'm looking for at all. It's completely different.
It looks at a single item, creates an index, and looks for matching items, ranking every single item via a new column. That's completely unneeded and counterproductive.

Duplicate Finder creates a hash table, which is the right way of doing this. The difference is between hashing the entire file and hashing only the items that matter while ignoring those that don't (the metadata.opf file, for example, is guaranteed to differ in small ways if it was altered at all).

If you aren't interested, fine. Kovid suggested your plugin, and I gave you (and him) the courtesy of asking here, even though you clearly aren't interested. My original post (linked above) points to tools outside of Calibre that are useful for this sort of comparison, and frankly, I see the value of doing it from within Calibre (especially for importing large quantities of fresh books into an already large library, perhaps from completely public domain sources...) even if you still don't. I hope someone else out there finds the pointer useful, and I hope someone creates a plugin that will do this sort of dupe checking (content-based hashes, not file-based hashes) one of these days.

Last edited by sethcohn; 01-01-2013 at 04:11 PM. Reason: grammar correction
Old 01-01-2013, 07:45 PM   #362
kiwidude
calibre/Sigil Developer
kiwidude ought to be getting tired of karma fortunes by now.
 
Posts: 4,601
Karma: 2092290
Join Date: Oct 2010
Location: Australia
Device: Kindle Oasis
I only mentioned Similar Stories because I knew it was content based. I have never used it, nor did I write it, so I am sorry to hear it does not suit what you need.

Find Duplicates has always had two very intentional limits placed on its scope - first that it does not compare content, and second that it does not limit itself to a particular format. Comparing content across your library (as it sounds like Similar Stories does) is as you know an exceptionally slow operation and not at all appropriate to this plugin, neither is doing an epub specific comparison which is what you are after.

For the majority of users, books will be identified by some aspect of their metadata as being duplicates. I have repeated ad infinitum in this thread that how a user chooses to "resolve" those duplicates is outside of what this plugin covers. The decision to merge, delete, or indeed do a more complex binary comparison of the contents of identified "duplicates" is all in the realm of either existing features of calibre or potentially new plugins for other users to write. I would suggest the latter if such a feature is important enough for you to justify the effort. Personally I have absolutely no use for it, which is another reason why I have no interest in spending hours trying to craft it, and equally, as explained above, I do not see it as being an appropriate feature of this plugin. There is never any harm in asking or suggesting a new feature, but as always I reserve the right to say no to ones I don't think fit well.

Good luck with whatever solution you pursue.
Old 01-01-2013, 10:16 PM   #363
sethcohn
Junior Member
sethcohn began at the beginning.
 
Posts: 6
Karma: 10
Join Date: Jun 2005
Quote:
Originally Posted by kiwidude View Post
Find Duplicates has always had two very intentional limits placed on its scope - first that it does not compare content, and second that it does not limit itself to a particular format.
In a matter of minutes I figured out a reasonable method of generating hashes without ever looking at 'content' (merely stripping out the metadata portion):
unzip -qvl epubname.epub -x '*.opf' [& other metadata-y files] | cut -c 49-56 | sort | md5
[This takes the CRC32 value of each file in the zip except those listed, sorts them so the CRC32s are in a known order, and generates an MD5. It works perfectly for identifying things without ever unpacking the file, and should be lightning fast to generate.]
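For anyone who would rather not depend on the column layout of unzip's listing, here is a minimal Python sketch of the same idea using only the standard library. The function name epub_content_hash and the METADATA_SUFFIXES tuple are made up for illustration, and which entries count as "metadata" is an assumption you would need to tune. It will not produce the same digest as the shell pipeline above, since the CRCs are formatted differently; it only follows the same approach.

```python
import hashlib
import zipfile

# Assumption: treating only *.opf entries as metadata is a guess for illustration.
METADATA_SUFFIXES = (".opf",)


def epub_content_hash(epub, skip_suffixes=METADATA_SUFFIXES):
    """Return an MD5 over the sorted CRC32s of all non-metadata zip entries.

    `epub` may be a path or a file-like object. The CRC32s are read from the
    zip directory, so no entry is ever decompressed.
    """
    with zipfile.ZipFile(epub) as zf:
        crcs = sorted(
            f"{info.CRC:08x}"
            for info in zf.infolist()
            if not info.filename.lower().endswith(skip_suffixes)
        )
    # Sorting puts the CRCs in a known order regardless of entry order in the zip.
    return hashlib.md5("\n".join(crcs).encode("ascii")).hexdigest()
```

Two EPUBs that differ only in their .opf file should then get the same digest, while any change to a content file changes it.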

As for a 'particular format', there are plenty of 'epub' only plugins, or items that only work on epubs, such as your addition here: https://www.mobileread.com/forums/sho...&postcount=482

In this case, don't generate hashes if it's not appropriate (I wonder if a similar method for mobis/etc would work though... ideas?)

Quote:
For the majority of users, books will be identified by some aspect of their metadata as being duplicates.
And I'm telling you very clearly: I've seen files where the metadata was the same, yet the book was different (generated differently, or with different image sizes, for example), and vice versa, where metadata differences hide the fact that the book contents are identical and the files differ merely in who created them, when, and how.

Quote:
I reserve the right to say no to ones I don't think fit well
Good luck with whatever solution you pursue.
Of course you do... I understand that... I hope someone else who is interested in this steps up. Seems like a pretty simple plugin (perhaps in conjunction with Find Duplicates: populate a metadata column with this hash for all items, then use your plugin to remove one of the items with the matching/duplicate hash.)
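The second half of that idea, finding the items whose hash matches, is a plain grouping step. A rough sketch, assuming the hashes have already been computed and stored somewhere as a mapping of book id to hash (the name duplicate_groups and the shape of the mapping are hypothetical, not anything calibre or Find Duplicates provides):

```python
from collections import defaultdict


def duplicate_groups(hash_by_book):
    """Group book ids that share a content hash; keep only groups of 2 or more."""
    groups = defaultdict(list)
    for book_id, digest in hash_by_book.items():
        groups[digest].append(book_id)
    return [sorted(ids) for ids in groups.values() if len(ids) > 1]
```

Each returned group is a set of candidates for review; deciding which copy to keep is still left to the user.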

Last edited by sethcohn; 01-01-2013 at 10:24 PM.
Old 01-02-2013, 11:09 PM   #364
kovidgoyal
creator of calibre
kovidgoyal ought to be getting tired of karma fortunes by now.
 
Posts: 43,775
Karma: 22666666
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
@kiwidude: See https://bugs.launchpad.net/bugs/1095316

You need to pass is_second_db=True when constructing the second LibraryDatabase2 object to avoid clobbering the saved searches.
Old 01-03-2013, 03:07 AM   #365
kiwidude
calibre/Sigil Developer
kiwidude ought to be getting tired of karma fortunes by now.
 
Posts: 4,601
Karma: 2092290
Join Date: Oct 2010
Location: Australia
Device: Kindle Oasis
Ok, thanks Kovid, I will push a new version.
Old 01-03-2013, 03:19 AM   #366
kiwidude
calibre/Sigil Developer
kiwidude ought to be getting tired of karma fortunes by now.
 
Posts: 4,601
Karma: 2092290
Join Date: Oct 2010
Location: Australia
Device: Kindle Oasis
v1.6.1 Released

Changes in this release:
  • Fix for when comparing library duplicates to ensure saved searches are not corrupted.
Old 01-03-2013, 06:21 AM   #367
theducks
Well trained by Cats
theducks ought to be getting tired of karma fortunes by now.
 
Posts: 29,689
Karma: 54369090
Join Date: Aug 2009
Location: The Central Coast of California
Device: Kobo Libra2,Kobo Aura2v1, K4NT(Fixed: New Bat.), Galaxy Tab A
Kiwidude
I noticed something after running this version (it may have been there before)

After clearing the (search) results (once I had fixed the issues), the column sort indicator still shows what I started with (last_modified), but the actual sort is NOT that (it might be by Title).
Do you need to clear/move the sort marker, since you have released control to display the results and don't want to use the wrong sort?
Old 01-03-2013, 07:13 AM   #368
kiwidude
calibre/Sigil Developer
kiwidude ought to be getting tired of karma fortunes by now.
 
Posts: 4,601
Karma: 2092290
Join Date: Oct 2010
Location: Australia
Device: Kindle Oasis
No behaviour has been changed with regards to sorting in this version.
Old 01-03-2013, 10:49 AM   #369
theducks
Well trained by Cats
theducks ought to be getting tired of karma fortunes by now.
 
Posts: 29,689
Karma: 54369090
Join Date: Aug 2009
Location: The Central Coast of California
Device: Kobo Libra2,Kobo Aura2v1, K4NT(Fixed: New Bat.), Galaxy Tab A
Then it has probably been this way forever, and it took me all this time to notice.

Working around it is easy; I just need to remember to re-sort the columns after clearing the 'marked' list.
Old 01-04-2013, 05:31 AM   #370
Valkrider
Addict
Valkrider ought to be getting tired of karma fortunes by now.
 
Posts: 293
Karma: 1250000
Join Date: Jan 2011
Location: UK
Device: Kobo Libra, iPadAir2, PRS600, iPhone 6, iPod, Palm TX
Quote:
Originally Posted by theducks View Post
Then it has probably been this way forever, and it took me all this time to notice.

Working around it is easy; I just need to remember to re-sort the columns after clearing the 'marked' list.
It has always been present, a bit annoying but not a real issue.
Old 01-08-2013, 11:38 AM   #371
tarisea
Zealot
tarisea got an A in P-Chem.
 
Posts: 114
Karma: 6288
Join Date: Dec 2012
Device: iphone
I love this plugin. Especially that I can compare two libraries by Goodreads ID.

I have a request for further development of this plugin. I would love to see it compare two libraries by series: not index number, just series. So if library A had books Alex Delaware [1] and Alex Delaware [2] and library B had Alex Delaware [10] and Alex Delaware [11], they would all come back as a match. This would be so helpful for me. Just a thought.
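To make the requested matching rule concrete, here is a small Python sketch. It is not how the plugin works; the function name shared_series and the representation of a library as (series, index) pairs are assumptions made purely for illustration.

```python
def shared_series(library_a, library_b):
    """Return series names present in both libraries, ignoring the series index.

    Each library is an iterable of (series_name, series_index) pairs.
    Names are compared case-insensitively with surrounding whitespace removed.
    """
    def names(books):
        return {series.strip().casefold() for series, _index in books if series}

    return names(library_a) & names(library_b)
```

With the example above, Alex Delaware [1] and [2] in library A and Alex Delaware [10] and [11] in library B share the series name, so all four books would count as a match.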

Again thank you for developing Calibre and all of the amazing plugins that go with it.
Old 01-08-2013, 07:49 PM   #372
JSWolf
Resident Curmudgeon
JSWolf ought to be getting tired of karma fortunes by now.
 
Posts: 73,650
Karma: 127838196
Join Date: Nov 2006
Location: Roslindale, Massachusetts
Device: Kobo Libra 2, Kobo Aura H2O, PRS-650, PRS-T1, nook STR, PW3
There is one thing I'd like with this plugin. I'd like the binary compare to be able to compare just the books I have selected.
Old 01-13-2013, 02:09 PM   #373
MontyJ
Addict
MontyJ began at the beginning.
 
Posts: 224
Karma: 10
Join Date: Jul 2012
Device: Kindle
Kiwidude,

I use both the Find Duplicates and Import List plugins; both have helped me streamline managing large lists of ebooks.

One thing I need to do is separate out Sci-Fi authors from all the other authors, independent of any tags.

I presently do this the long way around, as I can't see a way to do it in Calibre with these plugins.

1. I have a spreadsheet with a column containing over 1,900 sci-fi author names, and variants of their names.

2. I export a given batch of ebooks to a directory, with a folder named for each author.

3. I then use a Windows utility to copy the names of the folders, paste that list into the spreadsheet, and do a compare there using the vlookup function.

4. I take the resultant 'hits' and make a new list that contains the sci-fi authors only.

5. Using that list I manually cut & paste each of those authors' folders into a new folder for sci-fi ebooks only.

As you can see it is a bit tedious!
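Outside of Calibre, steps 3 and 4 above could be sketched in a few lines of Python instead of a spreadsheet lookup. This is only an illustration: the function names are made up, and the normalization (case-folding and collapsing whitespace) is an assumption that will not catch every spelling variant.

```python
def normalize(name):
    """Case-fold and collapse whitespace so trivially different spellings compare equal."""
    return " ".join(name.casefold().split())


def split_by_known_authors(folder_names, known_authors):
    """Split folder names into (hits, misses) against a list of known author names."""
    known = {normalize(author) for author in known_authors}
    hits = [f for f in folder_names if normalize(f) in known]
    misses = [f for f in folder_names if normalize(f) not in known]
    return hits, misses
```

The hits list plays the role of the vlookup matches; the folders it names are the ones to move into the sci-fi directory.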

My attempt to make Calibre/Find Duplicates work is to create a "Library" in Calibre that contains a list of my 1,900 sci-fi authors, using two columns only: the Author column (the one of interest) and a dummy Title column containing "Various Titles" for every author.

Ok, then I get a new batch that I want to sort out the sci-fi authors from, so I do a "Library Compare" and select my Sci-Fi Authors List. I can't use title to compare of course, so I select "Ignore" for the Title, and "Similar" or "Exact" for the Author.

However, when I use "Ignore" for the Title, it does not generate a list of books where the authors match! That is, I cannot get a list of ebooks that I can "Select All" and then save to their own directory, as I can when I do a normal compare on Author AND Title.

If there is a way to make it work like this, I sure can't see how! So, yes, I would like to see a feature that would allow this unless it is already there and I just can't see it.

Thanks. And apologies for the lengthy post; just wanted to make sure I explained the issue/need for this capability!

Monty
Old 01-13-2013, 11:10 PM   #374
MartyTX
Dedicated
MartyTX ought to be getting tired of karma fortunes by now.
 
Posts: 441
Karma: 11279376
Join Date: Jun 2012
Location: Amarillo, TX
Device: iPad Mini 1 & 4, Nook ST, Dell 11-3000, iPhone 5s
Deleted libraries are in Find Duplicates "Cross Library Search"

Hello kiwidude,

I noticed that in Find Duplicates "Cross Library Search Options" the dropdown list of libraries contains a few calibre libraries that I have deleted. I had created a few test libraries as a sandbox for learning calibre; they have been deleted using the "Remove Library" function and the folders have also been deleted.

Is there a relatively easy way to "prune deadwood" from the dropdown library list?

Hope you're having a good New Year ...

Thanks,
Marty
Old 01-14-2013, 09:31 AM   #375
edwdecarlo
Enthusiast
edwdecarlo began at the beginning.
 
Posts: 37
Karma: 41
Join Date: Nov 2011
Location: North Kingstown, RI, USA
Device: Kindle DX,Nexus 10,Fire HD
Change Request: Add hook to select View Manager view for Duplicates list

First off, I have been able to save countless hours using your plug-ins. They are well written, well documented and easy to use/configure. And as an extra bonus, you fully support your plug-ins as well by providing valuable feedback and guidance on the threads. For this, I thank you.

I use most of your plug-ins, but this request concerns Find Duplicates and View Manager.

Request:
Would there be a way to add a hook into Find Duplicates which would allow a View Manager view to be used when displaying the found duplicates?

Reason:
I have added numerous custom columns and use View Manager to allow me to view my books with different columns/sorts (even with my resolution set to 1920 x 1080, I can normally only fit about 7 - 10 columns per view). I have added a view with format and quality columns, which is helpful when determining which duplicate to retain. If I forget to set the view before I run Find Duplicates, I then need to re-run Find Duplicates after I set the view (because the sort of the view being applied messes up the dup groupings). If the Find Duplicates plug-in were to switch to a specified View Manager view before processing/displaying dups, that would save some time.

Thank you for your consideration.
Tags
cross library duplicates, in library duplicates
