Old 01-25-2022, 01:19 PM   #946
mseiden
Junior Member
mseiden began at the beginning.
 
Posts: 8
Karma: 10
Join Date: Jun 2013
Device: kindle
soundex uses unlimited cpu and virtual memory.

Hi there. Thanks for Find Duplicates; it's very useful. Recently, however, the Soundex setting has been misbehaving. No matter what the Author setting is (even "ignore"), it runs for a very long time. The run in progress has now used an hour (!) of CPU time on an M1 MacBook Pro (admittedly on a library with about 130k items). Worse, it also uses up all of Application Memory (currently about 142GB according to Activity Monitor).

Eventually it runs out of memory, calibre has to be killed, and macOS has to be restarted. (macOS Monterey is not good at resuming already-frozen processes after that.)

Other Find Duplicate options besides Soundex apparently work as before. Any ideas or suggestions?
Old 01-25-2022, 05:13 PM   #947
mseiden
Junior Member
mseiden began at the beginning.
 
Posts: 8
Karma: 10
Join Date: Jun 2013
Device: kindle
My current run of Find Duplicates with Soundex and ignore Author ran out of application memory after using 225GB, and using 2 hours of cpu time on an M1 Macbook Pro. Looks like something n**2 or otherwise pathological is going on here.
Old 01-25-2022, 06:02 PM   #948
capink
Wizard
capink ought to be getting tired of karma fortunes by now.
 
Posts: 1,196
Karma: 1995558
Join Date: Aug 2015
Device: Kindle
Quote:
Originally Posted by mseiden View Post
My current run of Find Duplicates with Soundex and ignore Author ran out of application memory after using 225GB, and using 2 hours of cpu time on an M1 Macbook Pro. Looks like something n**2 or otherwise pathological is going on here.
See post below

Quote:
Originally Posted by capink View Post
The issue you are facing is probably related to how the plugin re-partitions the duplicates based on the exemptions list. I don't have the time right now to dig deeper into it. Maybe I will return to this in the future.

My guess is that if you remove all the exemptions (after backing them up), the problem will disappear. You can back up the settings this way: Find Duplicates > Customize Plugin > View library preferences > copy the preferences to a text file.
Old 01-27-2022, 05:25 PM   #949
mseiden
Junior Member
mseiden began at the beginning.
 
Posts: 8
Karma: 10
Join Date: Jun 2013
Device: kindle
thanks, very helpful.

By the way, when I plug my library disk into an Intel Mac, Find Duplicates with Soundex works perfectly, using both back-level and current versions of calibre and the plugin.

You were right: there was something pathological about my library preferences on the M1 Mac. I couldn't even display them (to save them) using the instructions above; I got a spinning pinwheel and intensive CPU use for half an hour before giving up. Nonetheless, an Intel Mac running current Catalina with current calibre and the current plugin had no problem with the Find Duplicates operation on the very same exclusions. Isn't that strange?

Plugin settings aren't in Library/Application Support/. I see them in metadata_db_prefs*.json, which in my case was "only" a 57MB JSON file with a crapload of settings; it shrank to 21k after clearing the exclusions.
(It's easier to copy this file using cp than to use the GUI.)
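
For anyone who wants to check their own preferences file before clearing anything, here is a minimal Python sketch of the same idea: copy the file first, then see which settings take up the space. It assumes the file is a single JSON object mapping setting names to values; the function name and the ".bak" suffix are made up for illustration and are not part of calibre or the plugin.

```python
import json
import shutil
from pathlib import Path


def backup_and_summarize(prefs_path, backup_suffix=".bak"):
    """Copy the prefs file next to itself, then rank its top-level keys by size."""
    prefs_path = Path(prefs_path)
    backup_path = prefs_path.with_name(prefs_path.name + backup_suffix)
    shutil.copy2(prefs_path, backup_path)  # same idea as `cp`, keeps timestamps

    with prefs_path.open(encoding="utf-8") as handle:
        prefs = json.load(handle)

    # Assumption: the file is one JSON object mapping setting names to values.
    # The size of each value is measured as the length of its JSON text.
    sizes = {key: len(json.dumps(value)) for key, value in prefs.items()}
    largest_first = sorted(sizes.items(), key=lambda item: item[1], reverse=True)
    return backup_path, largest_first
```

Call it with the path to your own metadata_db_prefs*.json file and print the first few entries of the returned list. If the exclusions are the culprit, the key holding them should dominate the list.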

thanks for your help. (i mention all these details only in case someone else runs into a similar problem.)
Old 01-28-2022, 07:14 AM   #950
mbovenka
Wizard
mbovenka ought to be getting tired of karma fortunes by now.
 
Posts: 2,079
Karma: 14079267
Join Date: Oct 2007
Location: Almere, The Netherlands
Device: Kobo Sage
Quote:
Originally Posted by mseiden View Post
thanks for your help. (i mention all these details only in case someone else runs into a similar problem.)
I had the same problem; using 'Soundex' for Title ate all my RAM. Cleaning out the exclusions and redoing them as @capink suggested fixed it.
Old 02-09-2022, 06:15 AM   #951
anoukaimee
Enthusiast
anoukaimee began at the beginning.
 
Posts: 30
Karma: 10
Join Date: Aug 2017
Location: USA
Device: Kobo Clara, Pixel 6 (Moon Reader/Calibre Companion), Amazon Fire 10
Question Is it possible to exempt variants in METADATA for future searches?

First time poster, long time fan. Your plugin has saved me what I can only imagine would otherwise be countless hours of ADD mania, hyperfocusing to find dupes manually. Thank you!

But one feature that I'm either not finding or does not exist: is there a way to exempt found variants when using the "find metadata variations . . ." tool, in the same way that one can mark duplicates as exempt in future searches? I find I'm often going over the same old Jonathan Smiths and John Smiths, and it'd be nice if there was a way to avoid that. I don't see anything obvious (and I did search in this thread, alas no hits). Am I missing something, is this possible to implement, or have you ruled it out?

Also, as an aside: are you taking recs on possible "similar" names that wouldn't be caught using your algorithms? There's always your Roberts and Bobs, your Katherines and Cathryns, and they don't get caught. I'd be happy to keep a running list as I'm going thru mine [which I do embarrassingly often because cleaning data is weirdly soothing lol]; if you'd like the data, let me know where I might send it. Or, if there is an algorithm that I should be using but just don't understand, I would love a push in the right direction.

Thanks much for this and your other plugins!
Old 02-10-2022, 05:01 AM   #952
capink
Wizard
capink ought to be getting tired of karma fortunes by now.
 
Posts: 1,196
Karma: 1995558
Join Date: Aug 2015
Device: Kindle
Quote:
Originally Posted by anoukaimee View Post
But one feature that I'm either not finding or does not exist: is there a way to exempt found variants when using the "find metadata variations . . ." tool, in the same way that one can mark duplicates as exempt in future searches? I find I'm often going over the same old Jonathan Smiths and John Smiths, and it'd be nice if there was a way to avoid that. I don't see anything obvious (and I did search in this thread, alas no hits). Am I missing something, is this possible to implement, or have you ruled it out?
No such feature exists. Also, the book exemption functionality has some problems, as evidenced by the posts above yours. It needs to be fixed before looking at re-implementing it for authors. I have neither the time nor the inclination to do either of these.

Quote:
Originally Posted by anoukaimee View Post
Also, as an aside: are you taking recs on possible "similar" names that wouldn't be caught using your algorithms? There's always your Roberts and Bobs, your Katherines and Cathryns, and they don't get caught. I'd be happy to keep a running list as I'm going thru mine [which I do embarrassingly often because cleaning data is weirdly soothing lol]; if you'd like the data, let me know where I might send it. Or, if there is an algorithm that I should be using but just don't understand, I would love a push in the right direction.
The advanced mode supports custom user-defined algorithms, but you need to install another plugin to enable this feature. It would not be complicated to implement an algorithm that supports what you want, but I guess it would incur some performance penalty.
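
As a rough illustration of what such an algorithm could look like, here is a standalone Python sketch that maps known name variants to one canonical form before comparing. It is not written against the plugin's user-defined algorithm interface (which this thread does not describe); the NICKNAMES table and both function names are hypothetical, and the table would have to be maintained by hand.

```python
# Hypothetical, hand-maintained table: each variant maps to one canonical form.
NICKNAMES = {
    "bob": "robert",
    "rob": "robert",
    "cathryn": "katherine",
    "catherine": "katherine",
    "kathryn": "katherine",
}


def canonical_author(name, nicknames=NICKNAMES):
    """Lower-case a name and swap each known variant for its canonical form."""
    words = name.lower().replace(".", " ").replace(",", " ").split()
    return " ".join(nicknames.get(word, word) for word in words)


def same_author(left, right):
    """Two names match when their canonical forms are identical."""
    return canonical_author(left) == canonical_author(right)
```

With this table, "Bob Smith" and "Robert Smith" compare as equal, while "John Smith" and "Jonathan Smith" do not, because neither of those is in the table. The extra lookup per word is one plausible source of the performance cost mentioned above.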
Old 02-16-2022, 02:16 AM   #953
Rellwood
Library Breeder (She/Her)
Rellwood ought to be getting tired of karma fortunes by now.
 
Posts: 1,268
Karma: 1937891
Join Date: Apr 2015
Location: Fullerton, California
Device: Paperwhite 2015 (2), PW 2024 (12 GEN), PW 2023 (11 GEN), Scribe (1st)
Ok, I feel dumb. I saw on the introduction page that this has the capability of searching more than one column - i.e. checking tags and genres columns against each other for duplicates and then maybe choosing which one to keep? Is that right? I can't find that ability to search multiple columns for metadata. I see the advanced options, and when I select that I just get one column to check and I have to set the search parameters.

Also, for the life of me, what exactly is "length"? I have never understood what that is for.

I am so looking forward to fixing that metadata!
Old 02-16-2022, 06:06 AM   #954
theducks
Well trained by Cats
theducks ought to be getting tired of karma fortunes by now.
 
Posts: 31,076
Karma: 60358908
Join Date: Aug 2009
Location: The Central Coast of California
Device: Kobo Libra2,Kobo Aura2v1, K4NT(Fixed: New Bat.), Galaxy Tab A
I think it means number of consonants to match
rell wood = 2
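
To make the guess above concrete, here is a sketch of the classic Soundex code in Python with a length parameter that caps how many characters of the code are kept. This is the textbook algorithm, not the plugin's own code; whether the plugin's "length" setting behaves exactly like the `length` argument here is an assumption.

```python
# Classic Soundex digit groups; vowels, Y, H and W have no digit.
CODES = {}
for letters, digit in (
    ("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
    ("L", "4"), ("MN", "5"), ("R", "6"),
):
    for letter in letters:
        CODES[letter] = digit


def soundex(text, length=4):
    """Return the Soundex code of text, truncated or zero-padded to length."""
    letters = [c for c in text.upper() if "A" <= c <= "Z"]
    if not letters:
        return ""
    result = [letters[0]]              # the first letter is kept as-is
    previous = CODES.get(letters[0], "")
    for letter in letters[1:]:
        if len(result) == length:
            break
        if letter in "HW":
            continue                   # H and W do not separate equal codes
        code = CODES.get(letter, "")
        if code and code != previous:
            result.append(code)        # new consonant sound
        previous = code                # a vowel resets the run of equal codes
    return "".join(result).ljust(length, "0")
```

With length 2, "Rellwood" and "Relish" both give "R4" and would be treated as a match. With length 4 they give "R430" and "R420" and no longer match. So a shorter length means a looser comparison and more candidate duplicates.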
Old 02-16-2022, 06:27 AM   #955
capink
Wizard
capink ought to be getting tired of karma fortunes by now.
 
Posts: 1,196
Karma: 1995558
Join Date: Aug 2015
Device: Kindle
Quote:
Originally Posted by Rellwood View Post
Ok, I feel dumb. I saw on the introduction page that this has the capability of searching more than one column - i.e. checking tags and genres columns against each other for duplicates and then maybe choosing which one to keep? Is that right? I can't find that ability to search multiple columns for metadata. I see the advanced options, and when I select that I just get one column to check and I have to set the search parameters.
No, it cannot do this. Not for books in the same library. What is meant by multiple columns (in the advanced mode) is the ability to compare books using more than the standard two columns (in the regular mode). e.g. finding duplicates using title, author and series.

Edit: If you want to compare different columns for the same book, you can use calibre's template search. For more details, consult the calibre manual. Also note that you need to be familiar with the template language.

Last edited by capink; 02-16-2022 at 06:54 AM.
Old 02-28-2022, 08:12 PM   #956
Rellwood
Library Breeder (She/Her)
Rellwood ought to be getting tired of karma fortunes by now.
 
Posts: 1,268
Karma: 1937891
Join Date: Apr 2015
Location: Fullerton, California
Device: Paperwhite 2015 (2), PW 2024 (12 GEN), PW 2023 (11 GEN), Scribe (1st)
Awww...

I made the mistake of downloading my current goodreads bookshelves into my old bookshelves column and wanted to weed them out...

I have no problem finding the matches; it's just isolating them and removing them.

What I'd like is to be able to compare two different columns against each other to remove duplicates, with a setting chosen in advance for which column's copy is kept and which is deleted. One step: find duplicates, then delete identical duplicates in column X.
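
The per-book logic being asked for can be sketched in a few lines of plain Python. This does not touch calibre at all: the two columns are just lists of strings for a single book, and the function name is made up for illustration.

```python
def drop_shared_values(keep_column, prune_column):
    """Return prune_column without the values that also occur in keep_column."""
    # Compare case-insensitively and ignore stray surrounding spaces.
    seen = {value.strip().casefold() for value in keep_column}
    return [value for value in prune_column if value.strip().casefold() not in seen]
```

For example, with "Fantasy" and "to-read" in the column to keep, and "fantasy", "Owned" and "To-Read " in the column to prune, only "Owned" survives in the pruned column.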
Old 03-14-2022, 10:58 AM   #957
mike the leg
Junior Member
mike the leg began at the beginning.
 
Posts: 5
Karma: 10
Join Date: May 2021
Device: kindle
The plugin works wonders and will find all your duplicates if you work through it stage by stage, but when you have several hundred duplicates it can be a bore working your way through them.
This procedure is hairy, but it works if all your books are in one format.
Find all your duplicates by Title and Author. Once they are found, select them all (Ctrl+A), save them to a single folder, then delete all the selected books using calibre's delete selected.
Close calibre.
Now go to the folder you saved all the duplicates in and you will see (example):
Mansfield Park - Jane Austen.epub
Mansfield Park - Jane Austen(1).epub
(or .mobi or .azw3 etc.)

Now search for all the duplicates:
name:~"*(1).epub" (or .mobi, etc.)
Select them all (Ctrl+A) and delete them.

If you have several copies of the same book, you may have to repeat the search:
name:~"*(2).epub" and delete them
name:~"*(3).epub" and delete them
until the search does not find any more books.
Now re-open calibre and point the auto-add function at the folder with all the remaining books in it, after the duplicates have been removed (Preferences, Import/Export, Adding Books, Automatic adding, specify a folder).
Calibre will add your books back into the library.
Job done.
Unfortunately, similar titles, similar authors, etc. still have to be done by hand.
Old 03-15-2022, 05:30 AM   #958
JSWolf
Resident Curmudgeon
JSWolf ought to be getting tired of karma fortunes by now.
 
Posts: 79,792
Karma: 146391129
Join Date: Nov 2006
Location: Roslindale, Massachusetts
Device: Kobo Libra 2, Kobo Aura H2O, PRS-650, PRS-T1, nook STR, PW3
You can't delete arbitrarily based on (1) in the filename. How do you know it's the same edition or that it's not a newer updated version of the same eBook? You really should compare them unless you know for sure they are the same eBook and in which case, a binary compare will match.
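
A binary compare of this kind can be sketched in Python with the standard library. The sketch below only reports numbered copies such as "Title - Author(1).epub" whose bytes are identical to the un-numbered file next to them; it deletes nothing. The filename pattern is an assumption based on the example names earlier in the thread, and the function name is made up for illustration.

```python
import filecmp
import re
from pathlib import Path

# Matches "Title - Author(1).epub": base name, counter, extension.
NUMBERED_COPY = re.compile(r"^(?P<base>.+)\((?P<n>\d+)\)(?P<ext>\.[^.]+)$")


def identical_numbered_copies(folder):
    """Return numbered copies whose bytes match the un-numbered original."""
    safe_to_delete = []
    for path in sorted(Path(folder).iterdir()):
        match = NUMBERED_COPY.match(path.name)
        if not match or not path.is_file():
            continue
        original = path.with_name(match["base"] + match["ext"])
        # shallow=False forces a byte-by-byte comparison instead of
        # trusting file size and modification time.
        if original.is_file() and filecmp.cmp(original, path, shallow=False):
            safe_to_delete.append(path)
    return safe_to_delete
```

Anything not in the returned list differs from its original in some way (a different edition, updated metadata, another conversion) and should be compared by hand before deleting.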
Old 03-26-2022, 03:53 AM   #959
PPP-Magic
Junior Member
PPP-Magic began at the beginning.
 
Posts: 2
Karma: 10
Join Date: Mar 2022
Device: Various
Plugin was working until I tried to update it. This is what I got:

calibre, version 3.21.0
ERROR: Install plugin failed: A problem occurred while installing this plugin. This plugin will now be uninstalled. Please post the error message in details below into the forum thread for this plugin and restart calibre.

Traceback (most recent call last):
  File "/usr/lib/calibre/calibre/gui2/dialogs/plugin_updater.py", line 726, in _install_clicked
    plugin = add_plugin(zip_path)
  File "/usr/lib/calibre/calibre/customize/ui.py", line 461, in add_plugin
    plugin = load_plugin(path_to_zip_file)
  File "/usr/lib/calibre/calibre/customize/ui.py", line 60, in load_plugin
    return loader.load(path_to_zip_file)
  File "/usr/lib/calibre/calibre/customize/zipplugin.py", line 219, in load
    ans.minimum_calibre_version))))
InvalidPlugin: The plugin at /tmp/calibre_3.21.0_tmp_LAalC1/EWW7JT.zip needs a version of calibre >= 5.13.0
Old 03-26-2022, 06:27 AM   #960
theducks
Well trained by Cats
theducks ought to be getting tired of karma fortunes by now.
 
Posts: 31,076
Karma: 60358908
Join Date: Aug 2009
Location: The Central Coast of California
Device: Kobo Libra2,Kobo Aura2v1, K4NT(Fixed: New Bat.), Galaxy Tab A
Read the last line.
3.21 does not meet the PI requirements.

I suspect your distro is typical and WAY BEHIND.

Use the command found on the Calibre Linux Download page and join us with a modern 5.x version.

Your other option is to uninstall the PI and find someone with an old version of it.
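
The check behind that last traceback line amounts to a numeric version comparison. Here is a minimal Python sketch of the idea; it is not calibre's own code, and both function names are made up for illustration.

```python
def parse_version(text):
    """Turn '5.13.0' into (5, 13, 0) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def meets_minimum(installed, minimum):
    """True when the installed version is at least the required minimum."""
    return parse_version(installed) >= parse_version(minimum)
```

Here meets_minimum("3.21.0", "5.13.0") is False, which is why the install was refused. Comparing tuples of integers rather than raw strings matters: as strings, "5.2.0" would sort after "5.13.0", which is wrong.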
Tags
cross library duplicates, in library duplicates
All times are GMT -4.


MobileRead.com is a privately owned, operated and funded community.