#136 |
Grand Sorcerer
Posts: 12,447
Karma: 8012886
Join Date: Jan 2010
Location: Notts, England
Device: Kobo Libra 2
|
|
#137 | |
Calibre Plugins Developer
Posts: 4,729
Karma: 2197770
Join Date: Oct 2010
Location: Australia
Device: Kindle Oasis
|
Quote:
I've just sat down to take a look into this. You might be interested in the problem code actually - as it is based on your example from your post here a while ago. If you take your example and change your "initial_dups = [2, 3, 66, 7, 10, 11, 12]" to "initial_dups = [i for i in xrange(1,100)]" you will see it runs forever... I'm sure there are optimisations that can be done to it, but it doesn't seem to scale well in its current guise. |
|
#138 | |
Addict
Posts: 293
Karma: 21022
Join Date: Mar 2011
Location: NL
Device: Sony PRS-650
|
Quote:
Or radio buttons, but sliders look nicer of course:
identical - similar - sound-ex - ignore

EDIT: And:
mark - highlight - group-column - number-group-column

This could add 2 columns to the view: one with group numbers, OR one with number/group number. This would be a nice sorting option.

For example, you could have 2 groups: one with 2 dups and one with 3 dups. You could then sort on the number of books in a group. A number-group-column would look like this in that case:

number-group-column
2.1
2.1
3.2
3.2
3.2

Rather than sorting on group, you could sort on number of dups. Because the group number is in the column too, you can sort on number of dups while keeping the groups together.

Last edited by drMerry; 04-24-2011 at 02:21 PM. Reason: added sorting option suggestion
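The number-group-column idea above could be sketched roughly like this. It is only an illustration: `number_group_labels` is a hypothetical helper, not part of the plugin, and it assumes each duplicate group is simply a list of book ids.

```python
def number_group_labels(groups):
    """Map each book id to a '<number of dups>.<group number>' label.

    `groups` is assumed to be a list of duplicate groups, each a list of
    book ids. Group numbers start at 1, in the order the groups are given.
    """
    labels = {}
    for group_number, group in enumerate(groups, start=1):
        for book_id in group:
            labels[book_id] = "{}.{}".format(len(group), group_number)
    return labels


# One group with 2 dups and one with 3 dups, as in the example above.
labels = number_group_labels([[10, 11], [20, 21, 22]])
print(labels[10], labels[11], labels[20], labels[21], labels[22])
# prints: 2.1 2.1 3.2 3.2 3.2
```

One caveat with this format: compared as plain text, "10.3" sorts before "2.1", so a real column would need to sort on the two numbers rather than on the label string.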
|
#139 |
Calibre Plugins Developer
Posts: 4,729
Karma: 2197770
Join Date: Oct 2010
Location: Australia
Device: Kindle Oasis
|
Haha, yeah. Just when I had the gui looking all pretty too. I had at least already partially refactored the code in anticipation of something like this. Back to the drawing board for the gui again...
|
#140 |
Addict
Posts: 293
Karma: 21022
Join Date: Mar 2011
Location: NL
Device: Sony PRS-650
|
And some other ideas.
In group mode (non-highlight), 2 options:
* Merge current group
* Merge all groups

In highlight mode, 3 options:
* Merge all groups
* Automatically change the selected group to the currently selected book (this could be difficult because you have to deal with multiple selected books)
* Automatically remove a merged group from the view
#141 |
Calibre Plugins Developer
Posts: 4,729
Karma: 2197770
Join Date: Oct 2010
Location: Australia
Device: Kindle Oasis
|
@chaley - actually I think I found another issue with that algorithm. I don't think it actually "works".

For instance if I set
dups = [(3,4),(3,5)]
initial_dups = [1,2,3,4,5,6]

The results it gives me are:
[1,2,3,4,5],[1,2,4,5,6]

Look at the first group - it has 3 and 4 together. Yet they are specifically exempted from appearing together in a group, and instead 6 has been removed?
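For comparison, here is a minimal sketch of what a partitioning step could be expected to return for that input. It is neither chaley's algorithm nor the plugin's code: `split_group` and its inner names are made up for illustration. It assumes an exemption means "these two books must never share a group", and it enumerates the largest groups that respect every exemption (a Bron-Kerbosch search with pivoting over the "may share a group" relation).

```python
def split_group(book_ids, exempt_pairs):
    """Split one candidate group into maximal subgroups with no exempt pair.

    Hypothetical helper for illustration only. Subgroups of a single book
    are dropped, since one book on its own is not a duplicate group.
    """
    ids = set(book_ids)
    # compatible[b] = the books that are still allowed to share a group with b
    compatible = {b: ids - {b} for b in ids}
    for a, b in exempt_pairs:
        if a in ids and b in ids:
            compatible[a].discard(b)
            compatible[b].discard(a)

    groups = []

    def expand(current, candidates, excluded):
        if not candidates and not excluded:
            # Nothing more can be added, so `current` is maximal.
            if len(current) > 1:
                groups.append(sorted(current))
            return
        # Pivot on the book compatible with the most remaining candidates,
        # which keeps the number of branches small.
        pivot = max(candidates | excluded,
                    key=lambda b: len(compatible[b] & candidates))
        for b in list(candidates - compatible[pivot]):
            expand(current | {b},
                   candidates & compatible[b],
                   excluded & compatible[b])
            candidates = candidates - {b}
            excluded = excluded | {b}

    expand(set(), ids, set())
    return sorted(groups)


print(split_group([1, 2, 3, 4, 5, 6], [(3, 4), (3, 5)]))
# prints: [[1, 2, 3, 6], [1, 2, 4, 5, 6]]
```

Under these assumptions 3 never appears with 4 or 5, and 6 stays in both groups. With no exemptions at all, a group of 99 books comes back as a single group after about a hundred recursive calls.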
#142 |
Calibre Plugins Developer
Posts: 4,729
Karma: 2197770
Join Date: Oct 2010
Location: Australia
Device: Kindle Oasis
|
@drMerry - thx for adding suggestions.
I've worked very hard to avoid adding custom columns as they add a whole layer of complexity and issues that is best avoided if possible. At the moment the groups are sorted in an "alphabetical" way. I would rather just offer a suboption on the find duplicates options gui to let you sort them by # duplicates (and alphabetically within that). The question is - do others find that of interest or is it niche?

In terms of the merge menu options, it should already automatically remove a merged group out of the view when you go "next group" - is that not happening?

So the only thing from your list that I see missing is "Change selected group to selected book". The problem with that is a book can be in multiple groups, so how would it know which group you meant? For instance if I have books 1,2,3 and exemption (2,3) then I will have group A of (1,2) and group B of (1,3). So if I selected book 1, how would I know you wanted to see Group A or Group B?

Last edited by kiwidude; 04-24-2011 at 02:50 PM.
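The sort suboption mentioned in this post (by # duplicates, then alphabetically within that) could look something like the sketch below. It is an assumption-laden illustration, not the plugin's code: `sort_groups` is a hypothetical name, each group is taken to be a plain list of titles, and "alphabetical" is approximated by the first title in the group, ignoring case.

```python
def sort_groups(groups, by_duplicate_count=False):
    """Order duplicate groups for display.

    `groups` is assumed to be a list of groups, each a list of titles.
    """
    def alpha_key(group):
        # Stand-in for the existing "alphabetical" ordering.
        return min(title.lower() for title in group)

    if by_duplicate_count:
        # Largest groups first, alphabetical within groups of equal size.
        return sorted(groups, key=lambda g: (-len(g), alpha_key(g)))
    return sorted(groups, key=alpha_key)


groups = [["Zed", "zed"], ["Beta", "beta", "BETA"], ["Alpha", "alpha"]]
print(sort_groups(groups))
# prints: [['Alpha', 'alpha'], ['Beta', 'beta', 'BETA'], ['Zed', 'zed']]
print(sort_groups(groups, by_duplicate_count=True))
# prints: [['Beta', 'beta', 'BETA'], ['Alpha', 'alpha'], ['Zed', 'zed']]
```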
#143 |
Addict
Posts: 293
Karma: 21022
Join Date: Mar 2011
Location: NL
Device: Sony PRS-650
|
The sorting option is the only thing I need. If it can be done simply without a column, that's fine (great) with me.
Next group is not an option I have used so far. I use the mark option, so I had no need to hit next. I will try it.

Your last problem: if books 1 and 2 are duplicates, and 1 and 3 are duplicates, then 2 and 3 will be duplicates too (just like 3 and 1, 3 and 2, and 2 and 1), won't they? That's why I think it is no problem. Also, if you would like to sort on number of duplicates, where would you put 1 in your case?

EDIT: It does remove with ctrl+\
Last edited by drMerry; 04-24-2011 at 04:26 PM.
#144 | |
Addict
Posts: 293
Karma: 21022
Join Date: Mar 2011
Location: NL
Device: Sony PRS-650
|
Quote:
In my case it is less work to mark books not to merge than to select each group and then merge them. It would be nice to have an option that could do (one of):
* merge all groups (separately, of course)
* merge current group
* merge all but current group
* merge all groups - only if type is different (this would merge pdf and epub for example, but not an epub, pdf and pdf)
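The last option in the list above (merge only if the types differ) boils down to a small check, roughly sketched here. `formats_are_distinct` is a hypothetical helper, and it assumes the formats of each book in a group are available as a list of strings.

```python
from collections import Counter


def formats_are_distinct(group_formats):
    """Return True if no format occurs in more than one book of the group.

    `group_formats` is assumed to hold one list of format names per book,
    for example [["EPUB"], ["PDF"]].
    """
    counts = Counter()
    for formats in group_formats:
        # Count each format once per book, ignoring case.
        counts.update({fmt.upper() for fmt in formats})
    return all(n == 1 for n in counts.values())


print(formats_are_distinct([["epub"], ["pdf"]]))           # prints: True
print(formats_are_distinct([["epub"], ["pdf"], ["pdf"]]))  # prints: False
```

A group that passes this check could be merged without having to choose between two files of the same format; any other group would still need a manual decision.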
|
#145 | |
Calibre Plugins Developer
Posts: 4,729
Karma: 2197770
Join Date: Oct 2010
Location: Australia
Device: Kindle Oasis
|
Quote:
Say in your library you have these two books:
#2 Foo (Omnibus)
#3 Foo

Now you run a duplicates check using similar title. It reports these books, as it will (in future) strip off stuff like (Omnibus). However you decide they are not duplicates of each other, so you mark the group as exempt.

Then at some point you add another book to your library (or maybe you changed the name for an existing book).
#1 Foo: Returns

When you run the duplicate check again, in the first pass it sees 1,2,3 as all being in the same duplicate group. However you have already said that 2 and 3 are not duplicates of each other. Nothing can be said from that about whether 1 matches 2 or 1 matches 3. So the user must be presented with two groups of the individual pairs.
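To make the first pass in this example concrete, here is a rough sketch of how a similar-title match might put all three books in one group. The normalisation rules are guesses made purely for illustration (drop anything in brackets, drop a subtitle after a colon, ignore case); the plugin's real rules are not given in this thread, and `similar_title_key` and `first_pass_groups` are hypothetical names.

```python
import re
from collections import defaultdict


def similar_title_key(title):
    """Reduce a title to a rough comparison key (assumed rules only)."""
    title = re.sub(r"\([^)]*\)", "", title)  # drop "(Omnibus)" style text
    title = title.split(":", 1)[0]           # drop a subtitle after a colon
    return " ".join(title.lower().split())   # ignore case and extra spaces


def first_pass_groups(books):
    """Group book ids whose titles share a key; `books` maps id -> title."""
    by_key = defaultdict(list)
    for book_id, title in books.items():
        by_key[similar_title_key(title)].append(book_id)
    return [sorted(ids) for ids in by_key.values() if len(ids) > 1]


books = {1: "Foo: Returns", 2: "Foo (Omnibus)", 3: "Foo"}
print(first_pass_groups(books))
# prints: [[1, 2, 3]]
```

The existing exemption of (2,3) then has to be applied to that single group, which is what leaves the two pairs (1,2) and (1,3) for the user to review.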
|
#146 | |
Calibre Plugins Developer
Posts: 4,729
Karma: 2197770
Join Date: Oct 2010
Location: Australia
Device: Kindle Oasis
|
Quote:
Instead I would see a separate menu option on the merge menu which, for your selected rows, comes up with an "automerge" type suggestion as your starting point in the gui. You would be able to launch the viewer when you have conflicting formats to decide which you want to keep, decide what to do with the metadata, etc.

As Starson has said, the existing merge options would still exist; this one would just be additional, to give you a gui allowing you to preview what the effect would be and tune it.

I have way too many things on my todo list already, but it could be done as a standalone plugin by someone as a starting point.
|
#147 | |
Addict
Posts: 293
Karma: 21022
Join Date: Mar 2011
Location: NL
Device: Sony PRS-650
|
Quote:
So then you have to decide to do one of the following, I think:
1. Do not show any books marked as not duplicate (treat it as 'never ask again')
2. Add a popup saying: found new duplicates in hidden books, show them?
3. Add a checkbox at the start of the process: "Search Including Hidden Items (only if new books found)"
4. Add all books to one group if a new book is found and remove the not-duplicate status.

But if this is sorted out, no need to restart the discussion
|
#148 |
Calibre Plugins Developer
Posts: 4,729
Karma: 2197770
Join Date: Oct 2010
Location: Australia
Device: Kindle Oasis
|
@drMerry - it is sorted out, provided you work through resolving the groups in order, as a book can only appear highlighted in one group at a time. If you display one group at a time there is no issue.

EDIT: I should add - this is where mucking around with the sort order, such as by # of duplicates, may cause a user more problems than it is worth. In the example above, groups A and B of (1,2) and (1,3) will appear next to each other in the gui. So at least as a user you can see all three books displayed next to each other, even though only the first group (the first two rows) is highlighted. So you have a chance of anticipating the three way merge and effectively resolving two groups at the same time. If instead the groups are sorted by the more arbitrary # of duplicates, this may not be the case.

I'll be honest and say I would never use # of duplicates as a criterion to sort by - to me what is important is to get genuine duplicates resolved; whether there are 2 or 10 in the group makes no difference to me. So my inclination is to not offer it as an option (given the subtle issues like I mention above) unless someone provides a compelling reason why it would be valuable.

Last edited by kiwidude; 04-24-2011 at 06:13 PM.
#149 |
Addict
Posts: 293
Karma: 21022
Join Date: Mar 2011
Location: NL
Device: Sony PRS-650
|
Thank you for your clear explanation.
The reason I would like to use it is this: I get a lot of new books at a time and add them all at once. I often do not have the time to look for all duplicates. When I do have some time, it is nice to get as much done as possible, so in that case I could make a duplicate list and sort it on number of duplicates. Then I could remove the duplicates that occur the most. That way my library is cleaned up the most, I regain the most free space, and my database is cleaned up too, so I can make it smaller and speed up calibre.

Those are my reasons to use it this way.
#150 |
Calibre Plugins Developer
Posts: 4,729
Karma: 2197770
Join Date: Oct 2010
Location: Australia
Device: Kindle Oasis
|
Here is a preview of the new search options dialog. I thought radio buttons might be a little easier to select than sliders which are a mess to line up with lack of native tick label support in Qt.
Another option I am considering on this dialog is to remove the "Title match:" and "Author match:" labels on the left, as per the second screenshot. Any preferences?

The text changes based on your selection obviously. If you choose Ignore Title and Ignore Author then the text tells you that it is an ISBN search. And just for drMerry there is a checkbox for sorting...

I've resolved the partitioning algorithm issues btw. Went back to basics and wrote the "kiwidude" version, undoubtedly not as pretty but it works pretty darn quick, sub-second searches of large libraries again regardless of the number of duplicates.