Old 10-31-2022, 09:23 AM   #1051
chaley
Grumpy old git
chaley ought to be getting tired of karma fortunes by now.
Posts: 10,901
Karma: 4839799
Join Date: Jan 2010
Location: Notts, England
Device: Kobo Libra 2
Quote:
Originally Posted by kiwidude View Post
I don't think there is a way in the current API of removing just one specific named marker once you have applied it - I am sure chaley can correct me on that if I am wrong. That would have been the quick change to the plugin I might have been willing to make.
There isn't a specific API but it is easy to do yourself. Here is an example that runs in the Template debugger so it is easy to see what it does. The work is in the function remove_val_from_marks().

Code:
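python:
def evaluate(book, context):
	# context.db is the database object calibre passes to the template
	db = context.db
	# Build the marks without the 'bbb' entries, then apply them
	new_marks = remove_val_from_marks(db, 'bbb')
	db.data.set_marked_ids(new_marks)
	return 'a string'

def remove_val_from_marks(db, val):
	# marked_ids maps book id -> mark value; keep every entry except val
	return {k:v for k,v in db.data.marked_ids.items() if v != val}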
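
For anyone who wants to see what the filter does without opening the Template debugger, here is a minimal sketch on a plain dict. The ids and labels are made up, and the function takes the dict directly rather than a db object; inside calibre the dict would come from db.data.marked_ids as in the code above.

```python
def remove_val_from_marks(marked_ids, val):
    """Return a copy of the marks dict without the entries labelled val."""
    return {book_id: label for book_id, label in marked_ids.items() if label != val}

# Hypothetical marks: book id -> mark label
marks = {1: 'aaa', 2: 'bbb', 3: 'true', 4: 'bbb'}

# Only the 'bbb' marks are dropped; every other mark is preserved
remaining = remove_val_from_marks(marks, 'bbb')
```

The original dict is left untouched, which is why the result still has to be handed back through set_marked_ids().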

Last edited by chaley; 10-31-2022 at 09:26 AM.
Old 10-31-2022, 06:37 PM   #1052
kiwidude
calibre/Sigil Developer
kiwidude ought to be getting tired of karma fortunes by now.
Posts: 4,441
Karma: 1823494
Join Date: Oct 2010
Location: Australia
Device: Kindle Oasis, iPad Pro
Find Duplicates v1.10.7 Released

Release Notes:
https://github.com/kiwidude68/calibr...icates-v1.10.7

Thanks to @chaley for the code suggestion!

@Eddie87 you should now be able to do that workflow discussed of applying a custom marker to the results, clearing the virtual library and then searching for your custom marker.
Old 10-31-2022, 08:38 PM   #1053
dunhill
Fanatic
dunhill can program the VCR without an owner's manual.
Posts: 581
Karma: 197652
Join Date: Sep 2017
Location: Argentina
Device: moon+ reader, kindle paperwhite
Quote:
Originally Posted by kiwidude View Post
Release Notes:
https://github.com/kiwidude68/calibr...icates-v1.10.7

Thanks to @chaley for the code suggestion!

@Eddie87 you should now be able to do that workflow discussed of applying a custom marker to the results, clearing the virtual library and then searching for your custom marker.
Thanks for these changes!
Old 11-01-2022, 06:20 PM   #1054
Eddie87
Junior Member
Eddie87 began at the beginning.
 
Posts: 5
Karma: 10
Join Date: Oct 2022
Device: Kindle
Thanks a lot!!

Quote:
Originally Posted by kiwidude View Post
Release Notes:
https://github.com/kiwidude68/calibr...icates-v1.10.7

Thanks to @chaley for the code suggestion!

@Eddie87 you should now be able to do that workflow discussed of applying a custom marker to the results, clearing the virtual library and then searching for your custom marker.
Thank you so much for the update.

My main library, which I update every day and which is the "master" for certain books, has 51310 books today. On that one I add books, run your plugin to do a binary compare, remove the newly added binary duplicates, then compare title/author again using your plugin, and finally manually check old and new versions and decide which version to keep among the duplicates.

I also have another library that includes more books (90271 today). I add books to that one every now and then as well, and I also use your plugin to maintain it.

A couple of times a month, I make sure to copy the books that are in the "master" library and not in the second one; for that I use the binary compare and copy the ones NOT in common.

Today I used the new plugin and found 50640 duplicates and 670 not marked after I removed the virtual library; all went fine. Comparison takes a while, which is the reason I only binary compare the libraries once or twice a month.
Old 11-07-2022, 08:02 AM   #1055
ownedbycats
Grand Sorcerer
ownedbycats ought to be getting tired of karma fortunes by now.
Posts: 5,560
Karma: 28562994
Join Date: Oct 2018
Location: Canada
Device: Kobo Libra H2O, formerly Aura HD
Checking duplicates by identifier requires a separate search for each id type. Would it make sense to have "any" as an option in the dropdown?
Old 11-07-2022, 09:18 AM   #1056
kiwidude
calibre/Sigil Developer
kiwidude ought to be getting tired of karma fortunes by now.
Posts: 4,441
Karma: 1823494
Join Date: Oct 2010
Location: Australia
Device: Kindle Oasis, iPad Pro
Quote:
Originally Posted by ownedbycats View Post
Checking duplicates by identifier requires a separate search for each id type. Would it make sense to have "any" as an option in the dropdown?
From a user perspective I can understand the request. From an implementation perspective I'm not exactly super excited about doing so, as it would be a non-trivial change.

There are also two other complications to consider. The first is the same problem that a binary book duplicate search has - the user would not know "which" identifier is the duplicate in each pair.

Secondly, it is entirely feasible for users to have hundreds or even thousands of different identifier types with all sorts of urn numbers etc. (see the recent discussion in this thread about the dropdown of identifiers exploding in size), which might make such a search extremely slow.
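
To illustrate the two complications, here is a rough sketch of what an "any identifier" grouping could look like. It is not the plugin's implementation; the function name and the data layout (book id -> dict of identifier type -> value) are assumptions made for the example, and the identifier values are made up.

```python
from collections import defaultdict

def group_by_any_identifier(books):
    """Find identifiers shared by two or more books.

    books: dict of book id -> dict of identifier type -> value (hypothetical layout).
    Returns a sorted list of (id_type, value, [book ids]) tuples.
    """
    # One index entry per (type, value) pair, so the index grows with
    # every identifier type present in the library
    index = defaultdict(set)
    for book_id, identifiers in books.items():
        for id_type, value in identifiers.items():
            index[(id_type, value)].add(book_id)
    return sorted(
        (id_type, value, sorted(ids))
        for (id_type, value), ids in index.items()
        if len(ids) > 1
    )

# Made-up sample data
books = {
    1: {'isbn': '9780000000001', 'goodreads': '111'},
    2: {'isbn': '9780000000001'},
    3: {'goodreads': '111', 'urn': 'x'},
    4: {'urn': 'y'},
}
dupes = group_by_any_identifier(books)
```

Each result has to carry the identifier type and value along with the book ids, otherwise the user cannot tell which identifier caused the match. Book 1 also ends up in two different groups here, which is the kind of result a per-type search never produces.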
kiwidude is offline   Reply With Quote
Old 11-16-2022, 07:28 AM   #1057
ownedbycats
Grand Sorcerer
ownedbycats ought to be getting tired of karma fortunes by now.
Posts: 5,560
Karma: 28562994
Join Date: Oct 2018
Location: Canada
Device: Kobo Libra H2O, formerly Aura HD
I found a glitch that may be partially the result of Find Duplicates. As I'm not entirely sure and didn't want to crosspost it, I posted here:

https://www.mobileread.com/forums/sh....php?p=4274079

Thanks
Old 12-04-2022, 06:16 PM   #1058
Fiammifero
Junior Member
Fiammifero began at the beginning.
 
Posts: 4
Karma: 10
Join Date: Dec 2022
Device: Several
Hi Kiwidude and all,

I have been using the Find Duplicates a lot. It is incredibly powerful and useful.
However, after it shows a long list of resulting lines, I would like to be able to select (or mark) at once all the lines but the first line of each group. Is there a way to do that? I just cannot find it (after a lot of trial and googling).

I would gain a lot of time. After selecting, I could check and adjust the selection (exclude false duplicates), then delete all the selected lines at once.

If it is not currently possible, then maybe you could consider including that in a future version of the plugin? In that case some more options would be nice to have, for instance excluding the smallest or the biggest book of each group when building the selection (instead of excluding the first one).

I am aware that the plugin can remove duplicate files after a binary compare. But I have hundreds of real duplicates that appear with identical or soundex matching, while not being exactly the same file.
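
The selection logic being asked for could be sketched like this. It only shows the idea; the function name, the list-of-groups layout and the file sizes are assumptions for illustration, and nothing here touches the plugin or the calibre GUI.

```python
def rows_to_select(groups, keep=None):
    """Return every book id except one keeper per duplicate group.

    groups: list of lists of book ids, one list per group (hypothetical layout).
    keep: optional function that picks the book to keep from a group;
          by default the first row of the group is kept.
    """
    selected = []
    for group in groups:
        keeper = keep(group) if keep else group[0]
        selected.extend(book_id for book_id in group if book_id != keeper)
    return selected

# Made-up duplicate groups and file sizes (in KB)
groups = [[10, 11, 12], [20, 21]]
sizes = {10: 300, 11: 900, 12: 500, 20: 700, 21: 200}

all_but_first = rows_to_select(groups)
all_but_biggest = rows_to_select(groups, keep=lambda g: max(g, key=sizes.get))
```

Passing the keeper rule as a function covers the "exclude the smallest or the biggest book" variants without changing the loop.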

Thanks
Old 12-11-2022, 02:12 PM   #1059
Fiammifero
Junior Member
Fiammifero began at the beginning.
 
Posts: 4
Karma: 10
Join Date: Dec 2022
Device: Several
Hello again... Well, anybody, no clue, really? After finding duplicates, I would like to be able to select (or mark) at once all the epub rows but the first line of each group. Is there a way to do that?
That would provide a much quicker way of deleting a huge number of duplicates (even when the files are not exact duplicates). Thanks!
Old 12-11-2022, 03:55 PM   #1060
theducks
Well trained by Cats
theducks ought to be getting tired of karma fortunes by now.
Posts: 28,225
Karma: 47704480
Join Date: Aug 2009
Location: The Central Coast of California
Device: Kobo Libra2,Kobo Aura2, K4NT(Fixed: New Bat.), Galaxy Tab A
IMHO
Don't
Nothing says the first is the better version.
Add the Count Pages plugin (and configure at least the Pages column);
now you have an additional detail to consider.
Next, consider the metadata shown for each: some, all, or mostly cr*p.

Now choose, and repeat these considerations for each book.

Use an intake library, and use the Find Library Duplicates option before the merge (Copy to Library: <main one>: Delete).
Old 12-11-2022, 06:15 PM   #1061
capink
Guru
capink ought to be getting tired of karma fortunes by now.
Posts: 867
Karma: 724338
Join Date: Aug 2015
Device: Kindle
Quote:
Originally Posted by Fiammifero View Post
Hello again... Well, anybody, no clue, really? After finding duplicates, I would like to be able to select (or mark) at once all the epub rows but the first line of each group. Is there a way to do that?
That would provide a much quicker way of deleting a huge number of duplicates (even when the files are not exact duplicates). Thanks!
Forget about marking books while in duplicate view. It cannot be done for technical reasons. Your best bet is to mark the books you want using the Reading List plugin instead, as follows:
  • Create a reading list called duplicates.
  • You can add books to this list using a keyboard shortcut. You can customize this shortcut by clicking Reading List > Customize > Others > Keyboard shortcuts.
  • Now, instead of marking books, add them to the duplicates list using the keyboard shortcut.
  • After leaving the find duplicates plugin, you can view all the books in the duplicates list you created (Reading list > view list > duplicates). From there you can delete them or perform any other action you wish on them.
Old 12-11-2022, 07:07 PM   #1062
Fiammifero
Junior Member
Fiammifero began at the beginning.
 
Posts: 4
Karma: 10
Join Date: Dec 2022
Device: Several
Many thanks, theducks and capink, very clever ideas! However, they only cover the case of identical titles, authors and languages, if I understood you.

First I followed theducks' process:
I found these duplicates (same titles, authors) and corrected their language field.
I adjusted the merging options for copying to another library.
I adjusted this selected list of books (using a special column) to manage the case of different formats (PDF vs EPUB).
I created a temporary library, and moved these selected books there, then back to my main library.
This allowed me to delete 70 rows from a list of 140 duplicates in a few minutes: thank you for that!

And I also used your idea to look at the page count.

But now 1000 duplicates with similar titles and similar authors still remain... And most are real duplicates.
In many cases, the author is spelled slightly differently between two duplicates. I know that the plugin can address that, but it seems a very long process.

Then I tried capink's method. Through Options/Advanced I associated the Ctrl+K shortcut with Add to "DUP" list in the Reading List plugin.
This lets me keep track of the 1000 Find Duplicates results after quitting the Find Duplicates plugin (another way would be to write to some custom column).
But then I am more or less stuck. I can manually delete them, but I have 1000 rows.

So I still think it would be very useful to be able to select all rows but the first one in each duplicate group.
Or, if this is not possible for technical reasons, maybe it is possible to export the list to Excel and reimport it? Then I would edit it within Excel.

I will investigate further, but not tonight. Thanks again
Old 12-13-2022, 12:07 PM   #1063
Fiammifero
Junior Member
Fiammifero began at the beginning.
 
Posts: 4
Karma: 10
Join Date: Dec 2022
Device: Several
Hello again
For the information of whoever is interested:
Eventually, I found a way to quickly delete a list of 1000 probable duplicates.

I ran the Find Duplicates plugin with same titles but similar authors: 2000 rows.
I selected only the epub books and moved them to a temporary library (slow but automatic).

Then I exported that list to Excel (through Create a catalog / CSV).
With Excel formulas, I found the rows that have the same title and a similar number of pages (threshold of 80 pages).
I exported them to a CSV file, keeping the columns UUID, title, authors.

Then I used the Import List plugin to import this CSV with matching method = UUID, into a DUP list created with the Reading List plugin. Then I deleted the 1000 books at once from the DUP list.

Thus I manage by hand only a few cases needing special attention.
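
The Excel step could also be scripted. Here is a minimal sketch, assuming the catalog CSV has columns named uuid, title, authors and pages (the real column names depend on the catalog settings and on how the page count column is defined) and assuming the first row of each title is the one to keep. The sample rows are made up.

```python
import csv
import io
from collections import defaultdict

PAGE_THRESHOLD = 80  # the threshold mentioned in the post

def probable_duplicates(rows, threshold=PAGE_THRESHOLD):
    """Keep the first row of each title and return the later rows of the
    same title whose page count is within threshold of the kept row."""
    by_title = defaultdict(list)
    for row in rows:
        by_title[row['title'].strip().lower()].append(row)
    dupes = []
    for same_title in by_title.values():
        keeper = same_title[0]
        for row in same_title[1:]:
            if abs(int(row['pages']) - int(keeper['pages'])) <= threshold:
                dupes.append(row)
    return dupes

# Made-up stand-in for the exported catalog CSV
sample = io.StringIO(
    "uuid,title,authors,pages\n"
    "u1,Example Book,Jane Doe,190\n"
    "u2,Example Book,J. Doe,200\n"
    "u3,Example Book,Someone Else,450\n"
    "u4,Another Title,John Roe,310\n"
)
dupes = probable_duplicates(list(csv.DictReader(sample)))

# Write only the columns needed for the UUID-matched import
out = io.StringIO()
writer = csv.DictWriter(out, fieldnames=['uuid', 'title', 'authors'],
                        extrasaction='ignore')
writer.writeheader()
writer.writerows(dupes)
csv_text = out.getvalue()
```

The page-count check is what protects against unrelated books that share a title: the 450-page row is left out of the list even though its title matches.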
Many thanks again
Old 12-13-2022, 01:03 PM   #1064
Quoth
the rook, bossing Never.
Quoth ought to be getting tired of karma fortunes by now.
Posts: 6,723
Karma: 55555555
Join Date: Jun 2017
Location: Ireland
Device: Both Kinds: epub based makes and Kindle
Beware identical titles that are unrelated books:
Bambi (one is woodland, other a romance, a nickname)
Dancer's Luck (One is SF&F fantasy and other Ballet)

My problem is different editions with the same or slightly differing titles (and/or author name), and stupid Gutenberg often puts their release date rather than the print edition date in the Published metadata date.

I've not figured out yet how to use the plug-in. Also I don't want to delete an older duplicate, as different versions might have been sent to different ereaders.

Should I add a preference rank column for cases where it is the same book but the content doesn't exactly match?
Old 12-13-2022, 02:11 PM   #1065
theducks
Well trained by Cats
theducks ought to be getting tired of karma fortunes by now.
Posts: 28,225
Karma: 47704480
Join Date: Aug 2009
Location: The Central Coast of California
Device: Kobo Libra2,Kobo Aura2, K4NT(Fixed: New Bat.), Galaxy Tab A
Find Duplicates removes NOTHING, so feel free to experiment and peruse the results. Since they are marked in the regular GUI, you can View, Edit metadata, and so on. (Further: FD uses calibre metadata, not the book.)
The types of searches are configurable. The Search method is configurable.
AND you can exempt books from future results
Tags
cross library duplicates, in library duplicates
