Old 04-24-2011, 01:59 PM   #136
chaley
Grand Sorcerer
Posts: 12,447
Karma: 8012886
Join Date: Jan 2010
Location: Notts, England
Device: Kobo Libra 2
Quote:
Originally Posted by drMerry View Post
I got two other options I would like to have.
Identical Title, Similar Author
Identical Title, Identical Author

This would be a nice one for some quick searches
@kiwidude, looks more and more like you were right when you suggested the sliders.
Old 04-24-2011, 02:12 PM   #137
kiwidude
Calibre Plugins Developer
Posts: 4,729
Karma: 2197770
Join Date: Oct 2010
Location: Australia
Device: Kindle Oasis
Quote:
Originally Posted by chaley View Post
One thing that has bitten me is using the 'in' operator on lists. The operator does a linear search! One piece of code I wrote improved in performance by two orders of magnitude when I changed the list to a set, which does hashed lookups. Sometimes I use a dict with a fixed value (e.g. True) for the same thing, because they are hashed as well.
Ahh, that is interesting. I had wondered why I had seen set being used so much in Calibre code, in particular in places where I knew the items being added were distinct so the "obvious" advantage of a set didn't stand out.
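To make the point about `in` concrete, here is a minimal timing sketch. It is written for current Python 3 (the Calibre code of the time was Python 2), and the names `book_ids`, `as_list`, `as_set`, `as_dict` and `missing` are made up for the illustration, not taken from Calibre.

```python
import timeit

# Hypothetical data: 20,000 distinct book ids.
book_ids = list(range(20000))
as_list = book_ids
as_set = set(book_ids)
as_dict = dict.fromkeys(book_ids, True)  # the "dict with a fixed value" variant

missing = -1  # not present, so the list has to be scanned to the end

# Time 200 membership tests against each container.
list_time = timeit.timeit(lambda: missing in as_list, number=200)
set_time = timeit.timeit(lambda: missing in as_set, number=200)
dict_time = timeit.timeit(lambda: missing in as_dict, number=200)

print("list: %.6fs  set: %.6fs  dict: %.6fs" % (list_time, set_time, dict_time))
```

The exact numbers depend on the machine, but the list lookup grows with the length of the list while the set and dict lookups stay roughly constant, which is why a set can pay off even when the items are already known to be distinct.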

I've just sat down to take a look into this. You might be interested in the problem code actually - as it is based on your example from your post here a while ago.

If you take your example and change your "initial_dups = [2, 3, 66, 7, 10, 11, 12]" to "initial_dups = [i for i in xrange(1,100)]" you will see it runs forever...

I'm sure there are optimisations that can be done to it, but it doesn't seem to scale well in its current guise.
Old 04-24-2011, 02:14 PM   #138
drMerry
Addict
Posts: 293
Karma: 21022
Join Date: Mar 2011
Location: NL
Device: Sony PRS-650
Quote:
Originally Posted by chaley View Post
@kiwidude, looks more and more like you were right when you suggested the sliders.
True

Or radio buttons, but sliders look nicer of course

identical - similar - sound-ex - ignore

EDIT:

And:
mark - highlight - group-column - number-group-column

This could add two columns to the view, one with group numbers, OR a single column with number/group number.
This would be a nice sorting option.

For example, you could have 2 groups: one with 2 dups and one with 3 dups. You could then sort on the number of dups in a group. A number-group-column would look like this in the previous case:

number-group-column
2.1
2.1
3.2
3.2
3.2

Rather than sorting on group, you could sort on number of dups. Because the group number is in the column too, you can sort on number of dups while keeping the groups together.
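A small sketch of how such a number-group value could be computed and sorted on. Everything here is hypothetical (the function `number_group_column`, the book ids, the idea of storing a tuple); it also assumes each book sits in exactly one group, which is not always true once exemptions are involved.

```python
def number_group_column(groups):
    """Map each book id to (number of dups in its group, group number)."""
    column = {}
    for group_no, group in enumerate(groups, start=1):
        for book_id in group:
            column[book_id] = (len(group), group_no)
    return column


# Two groups: one with 2 dups, one with 3 dups (made-up ids).
groups = [["A1", "A2"], ["B1", "B2", "B3"]]
column = number_group_column(groups)

# Sort biggest groups first; the group number keeps each group together.
rows = sorted(column, key=lambda book_id: column[book_id], reverse=True)
for book_id in rows:
    print(book_id, "%d.%d" % column[book_id])
```

Sorting on the tuple rather than on the "2.1" text avoids the problem of "10.1" sorting before "2.1" as a string.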

Last edited by drMerry; 04-24-2011 at 02:21 PM. Reason: added sorting option suggestion
Old 04-24-2011, 02:15 PM   #139
kiwidude
Calibre Plugins Developer
Quote:
Originally Posted by chaley View Post
@kiwidude, looks more and more like you were right when you suggested the sliders.
Haha, yeah. Just when I had the gui looking all pretty too. I had at least already partially refactored the code in anticipation of something like this. Back to the drawing board for the gui again...
Old 04-24-2011, 02:36 PM   #140
drMerry
Addict
And some other ideas.

When in group mode (non-highlight), 2 options:
Merge current group
Merge all groups

When on highlight mode 3 options:
* Merge all groups
* Automatically change selected group to current selected book (this could be difficult because you have to deal with multiple selected books)
* Automatically remove merged group out of view
Old 04-24-2011, 02:37 PM   #141
kiwidude
Calibre Plugins Developer
@chaley - actually I think I found another issue with that algorithm. I don't think it actually "works".

For instance if I set
dups = [(3,4),(3,5)]
initial_dups=[1,2,3,4,5,6]

The results it gives me are:
[1,2,3,4,5],[1,2,4,5,6]

Look at the first group - it has 3 and 4 together. Yet they are specifically exempted from appearing together in a group, and instead 6 has been removed?
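A quick way to check a result like this is to test every reported group against the exemption pairs. The helper below is only a sketch for that check; `exemption_violations` is a made-up name and is not part of the algorithm being discussed.

```python
def exemption_violations(groups, exemptions):
    """List (group, pair) for every exempt pair that appears together in a group."""
    violations = []
    for group in groups:
        members = set(group)
        for a, b in exemptions:
            if a in members and b in members:
                violations.append((tuple(group), (a, b)))
    return violations


exemptions = [(3, 4), (3, 5)]
reported = [[1, 2, 3, 4, 5], [1, 2, 4, 5, 6]]

for group, pair in exemption_violations(reported, exemptions):
    print("group", group, "contains exempt pair", pair)
```

For the values above it flags the first group twice, once for (3, 4) and once for (3, 5), and finds nothing wrong with the second group.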
Old 04-24-2011, 02:48 PM   #142
kiwidude
Calibre Plugins Developer
@drMerry - thx for adding suggestions.

I've worked very hard to avoid adding custom columns as they add a whole layer of complexity and issues that is best avoided if possible. At the moment the groups are sorted in an "alphabetical" way. I would rather just offer a suboption on the find duplicates options gui to let you sort them by # duplicates (and alphabetically within that). The question is - do others find that of interest or is it niche?

In terms of the merge menu options, it should already automatically remove a merged group out of the view when you go "next group" - is that not happening? So the only thing from your list that I see missing is "Change selected group to selected book". The problem with that is a book can be in multiple groups, so how would it know which group you meant?

For instance if I have books 1,2,3 and exemption (2,3) then I will have group A of (1,2) and group B of (1,3). So if I selected book 1, how would I know you wanted to see Group A or Group B?
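To show how one candidate group turns into overlapping groups, here is a rough sketch. It is not chaley's algorithm and not the plugin's code, just a simple recursive approach under my own assumptions (the name `split_group` is hypothetical, and the recursion can blow up on groups with many exemptions).

```python
def split_group(group, exemptions):
    """Return the largest subsets of `group` that contain no exempt pair."""
    # Ignore degenerate "pairs" such as (3, 3).
    pairs = [frozenset(p) for p in exemptions if len(set(p)) == 2]

    def recurse(candidate):
        for pair in pairs:
            if pair <= candidate:
                # Both books are present: try dropping each one in turn.
                a, b = tuple(pair)
                return recurse(candidate - {a}) | recurse(candidate - {b})
        return {candidate}

    found = recurse(frozenset(group))
    # Keep only subsets that are not strictly inside another result.
    maximal = [g for g in found if not any(g < other for other in found)]
    return sorted(sorted(g) for g in maximal if len(g) > 1)


print(split_group([1, 2, 3], [(2, 3)]))                       # [[1, 2], [1, 3]]
print(split_group([1, 2, 3, 4, 5, 6], [(3, 4), (3, 5)]))      # [[1, 2, 3, 6], [1, 2, 4, 5, 6]]
```

Book 1 ends up in both groups in the first call, which is exactly why selecting book 1 alone cannot say which group was meant.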

Last edited by kiwidude; 04-24-2011 at 02:50 PM.
Old 04-24-2011, 04:22 PM   #143
drMerry
Addict
The sorting option is the only thing I need. If it can be done simply without a column, that's fine (great) by me.

Next group is not an option I used at this moment. I use the mark option so had no need to hit next.
I will try it.

Your last problem.
If books 1 and 2 are duplicates and 1 and 3 are duplicates, then 2 and 3 will be duplicates too (just like 3 and 1, 3 and 2, and 2 and 1), won't they?
That's why I think it is no problem.
Also, if you would like to sort on number of duplicates, where would you put 1 in your case?

EDIT:
It does remove with ctrl+\

Last edited by drMerry; 04-24-2011 at 04:26 PM.
Old 04-24-2011, 04:46 PM   #144
drMerry
Addict
Quote:
Originally Posted by kiwidude View Post
So the only thing from your list that I see missing is "Change selected group to selected book".
And the auto-merge function.
In my case it is less work to mark the books not to merge than to select each group and then merge them.

It would be nice to have an option that could do (one of):
* merge all groups (each separately, of course)
* merge current group
* merge all but current group
* merge all groups - only if type is different (this would merge a pdf and an epub, for example, but not an epub, a pdf and another pdf)
Old 04-24-2011, 04:55 PM   #145
kiwidude
Calibre Plugins Developer
Quote:
Originally Posted by drMerry View Post
Your last problem.
If books 1 and 2 are duplicates and 1 and 3 are duplicates, then 2 and 3 will be duplicates too (just like 3 and 1, 3 and 2, and 2 and 1), won't they?
That's why I think it is no problem.
This was the subject of much discussion on the original thread referenced in the first post of this one, to do with the transitivity of groups.

Say in your library you have these two books:
#2 Foo (Omnibus)
#3 Foo

Now you run a duplicates check using similar title. It reports these books, as it will (in future) strip off stuff like (Omnibus). However you decide they are not duplicates of each other, so you mark the group as exempt.

Then at some point you add another book to your library (or maybe you changed the name for an existing book).
#1 Foo: Returns

When you run the duplicate check again, in the first pass it sees 1,2,3 as all being in the same duplicate group. However you have already said that 2 and 3 are not duplicates of each other. Nothing can be said from that about whether 1 matches 2 or 1 matches 3. So the user must be presented with two groups of the individual pairs.
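As a rough picture of that first pass, the sketch below groups books by a normalised title. The normalisation is my own guess for illustration: stripping "(...)" qualifiers is what is described above, but dropping a subtitle after a colon is purely an assumption so that "Foo: Returns" lands in the same group. The names `similar_title_key` and `candidate_groups` are hypothetical.

```python
import re
from collections import defaultdict


def similar_title_key(title):
    """Hypothetical 'similar title' key, not the plugin's real matching."""
    title = re.sub(r"\([^)]*\)", "", title)   # drop qualifiers like (Omnibus)
    title = title.split(":", 1)[0]            # assumption: ignore any subtitle
    return " ".join(title.lower().split())    # lower-case, tidy whitespace


def candidate_groups(books):
    """Group book ids whose titles share a key; single books are dropped."""
    buckets = defaultdict(list)
    for book_id, title in books.items():
        buckets[similar_title_key(title)].append(book_id)
    return [sorted(ids) for ids in buckets.values() if len(ids) > 1]


books = {1: "Foo: Returns", 2: "Foo (Omnibus)", 3: "Foo"}
print(candidate_groups(books))  # [[1, 2, 3]]
```

The single group [1, 2, 3] is only the starting point; the stored exemption for (2, 3) then has to break it into the pairs (1, 2) and (1, 3) before anything is shown to the user.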
Old 04-24-2011, 05:04 PM   #146
kiwidude
Calibre Plugins Developer
Quote:
Originally Posted by drMerry View Post
And the auto-merge function.
In my case it is less work to mark the books not to merge than to select each group and then merge them.

It would be nice to have an option that could do (one of):
* merge all groups (each separately, of course)
* merge current group
* merge all but current group
* merge all groups - only if type is different (this would merge a pdf and an epub, for example, but not an epub, a pdf and another pdf)
I'm intentionally leaving merging out of scope of this plugin for now. A few pages ago in this thread Starson and I briefly touched on the subject for a couple of posts. I think everyone is agreed that it would be nice to have a more intelligent/flexible gui merge option in Calibre. I don't want to attempt to offer any automerge in this plugin as it is too fraught with complications about who merges into who when you have mixtures of formats and metadata being populated in the books in the group.

Instead I would see a separate menu option on the merge menu which for your selected rows comes up with an "automerge" type suggestion as your starting point in the gui. You would be able to launch the viewer when you have conflicting formats to decide which you want to keep, decide what to do with the metadata etc. As Starson has said the existing merge options would still exist, this one would just be additional to give you a gui allowing you to preview what the effect would be and tune it. I have way too many things on my todo list already but it could be done as a standalone plugin by someone as a starting point.
Old 04-24-2011, 06:06 PM   #147
drMerry
Addict
Quote:
Originally Posted by kiwidude View Post
Say in your library you have these two books:
#2 Foo (Omnibus)
#3 Foo
...
#1 Foo: Returns
In this case you do have a problem sorting on numbers. You cannot show the same book twice in a list; this could cause a problem when manually deleting / merging books (without using next group).
So then you have to decide to do one of the following, I think:

1. Do not show any books marked as not duplicate (treat it as 'never ask again')
2. Add a popup telling: found new duplicates in hidden books, show them?
3. Add a checkbox during start of process: "Search Including Hidden Items (only if new books found)"
4. Add all books in 1 group if a new book is found and remove the not-duplicate status.

But if this is sorted out, no need to restart the discussion
Old 04-24-2011, 06:08 PM   #148
kiwidude
Calibre Plugins Developer
@drMerry - it is sorted out, provided you work through resolving the groups in order, as a book can only appear highlighted in one group at a time. If you display one group at a time there is no issue.

EDIT: I should add - this is where mucking around with the sort order such as by # of duplicates may cause a user more problems than it is worth.

In the example above, groups A and B of (1,2) and (1,3) will appear next to each other in the gui. So at least as a user you can see all three books displayed next to each other, even though only the first group is highlighted of the first two rows. So you have a chance of anticipating the three way merge and effectively resolving two groups at the same time.

If instead the groups are sorted by the more arbitrary # of duplicates, this may not be the case. I'll be honest and say I would never use # of duplicates as a criterion to sort by - to me what is important is to get genuine duplicates resolved; whether a group has 2 or 10 books in it makes no difference to me. So my inclination is to not offer it as an option (given the subtle issues like I mention above) unless someone provides a compelling reason why it would be valuable.

Last edited by kiwidude; 04-24-2011 at 06:13 PM.
Old 04-24-2011, 07:26 PM   #149
drMerry
Addict
Thank you for your clear explanation.
The reason I would like to use it is because of this:

I get a lot of new books at a time.
I add them all at once.
I often do not have the time to look for all duplicates.
If I have some time, it is nice to do as much as possible, so in that case I could make a duplicate list and sort the list on number of duplicates. Then I could remove the duplicates that occur the most. That way my library is cleaned the most, I regain the most free space, and my database is cleaned too, so I can make it smaller and speed up calibre.

Those are my reasons to use it this way.
Old 04-24-2011, 09:24 PM   #150
kiwidude
Calibre Plugins Developer
Here is a preview of the new search options dialog. I thought radio buttons might be a little easier to select than sliders which are a mess to line up with lack of native tick label support in Qt.

Another option I am considering on this dialog is to remove the "Title match:" and "Author match:" labels on the left, as per the second screenshot. Any preferences?

The text changes based on your selection obviously. If you choose Ignore Title and Ignore Author then the text tells you that it is an ISBN search.

And just for drMerry there is a checkbox for sorting...

I've resolved the partitioning algorithm issues btw. Went back to basics and wrote the "kiwidude" version, undoubtedly not as pretty but it works pretty darn quick, sub-second searches of large libraries again regardless of the number of duplicates.
Attached Thumbnails: Screenshot_2_Options.png, Screenshot_3_Options.png