Old 04-15-2011, 09:52 AM   #61
Starson17
Wizard
Posts: 4,004
Karma: 177841
Join Date: Dec 2009
Device: WinMo: IPAQ; Android: HTC HD2, Archos 7o; Java:Gravity T
Just a short note to say I finally had time to run against my full library (16K books). It's great, and definitely should go into the trunk when it's done.

It makes me wonder if automerge should be changed or coordinated with Find Duplicates when it's available to all:

1) Perhaps remove it from "Copy to Library" or make it optional. I was always a bit leery of including it there. It's one thing to automerge new entries for the Calibre Library, but it's different to automerge entries that are already in one library and are merely being copied into another library. I keep worrying that I'll get a post asking why the new library only has 99 new entries when 100 were selected and copied. I was partly swayed into adding it because it provided a way to do automatic duplicate detection on existing entries.

2) Perhaps mirror the multiple "fuzzy match" options in Find Duplicates into automerge.

3) Remove it entirely? Keep Kovid's original warning about duplicate titles, and offer to run Find Duplicates if duplicates are added. The user can always Merge those he wants, or mark them as not duplicates.

Any thoughts?

Personally, I think I need to do more playing with Find Duplicates, but what I've seen on my real data is great!
Old 04-15-2011, 10:35 AM   #62
kiwidude
calibre/Sigil Developer
Posts: 4,601
Karma: 2092290
Join Date: Oct 2010
Location: Australia
Device: Kindle Oasis
Hey Starson17, sounds like you had a pretty positive experience which is great to hear. I am sure you will like the next version even more with making that search restriction/highlighting/sort stuff all happen automatically.

It is an interesting comment about automerge. For myself I would be sad to see it removed completely. I've never been a fan of the "match on title only" default implementation so to have to go back to that would be a step backwards imho.

Re Copy to Library: what would worry me about automerge here is doing a "copy and delete", then finding that the book never actually made it to the destination because of an automerge setting. I don't have an immensely strong opinion on it because I can't see myself being in that scenario, I'm afraid. I use Copy to Library a lot, but I am migrating unique sets of books author by author, so unless I screwed up by independently adding a book to my newer target library I won't hit that situation.

As for the algorithms, that's another interesting question. I can see why you are bringing this up. For myself, I like the fairly conservative approach that automerge takes, and know that "worst case" I will end up with some "duplicates" from a slight variation of author name or whatever. So I don't think you would want to offer "Similar Name, Similar Author" as an option in case it was a bit too aggressive? At least once this functionality is put into Calibre, as a user you will know that at any time you can periodically check to see what duplicates you have (or do so after adding a bunch of new books).

For new formats of a book, having automerge automatically sort that out for me is brilliant and I don't want to lose that. For duplicate formats of a book, that is where (personally) I will want to create new book records and manually review them by comparing the EPUBs side by side or whatever before making my merge decision. I think even if you made no changes to automerge it does everything I would like from it, but that is just my opinion/needs of course.
Old 04-15-2011, 11:21 AM   #63
Starson17
Wizard
Posts: 4,004
Karma: 177841
Join Date: Dec 2009
Device: WinMo: IPAQ; Android: HTC HD2, Archos 7o; Java:Gravity T
Quote:
Originally Posted by kiwidude View Post
Hey Starson17, sounds like you had a pretty positive experience which is great to hear.
I did. I've wanted it for a while, and I don't know what took you so long to get your butt in gear.

Quote:
It is an interesting comment about automerge. For myself I would be sad to see it removed completely. I've never been a fan of the "match on title only" default implementation so to have to go back to that would be a step backwards imho.
Yes, I didn't like the default much either, and that was what made me write automerge.

Quote:
Re Copy to Library: what would worry me about automerge here is doing a "copy and delete", then finding that the book never actually made it to the destination because of an automerge setting.
You do realize that automerge settings already apply to CTL, don't you? Your scenario is exactly the type of thing that worries me, even though I did agree to make automerge apply to CTL.

Quote:
I use Copy to Library a lot but I am migrating unique sets of books author by author so unless I screwed up by independently adding a book to my newer target library I won't hit that situation.
If you CTL two books that have the same author and fuzzy matched titles, with automerge option on, you'll get the action that the automerge settings command. It will try to merge formats into a single record. CTL isn't just: "Copy these records unchanged into the new library." If you want that function for CTL, Automerge needs to be off.
Quote:
As for the algorithms, that's another interesting question. I can see why you are bringing this up. For myself, I like the fairly conservative approach that automerge takes, and know that "worst case" I will end up with some "duplicates" from a slight variation of author name or whatever. So I don't think you would want to offer "Similar Name, Similar Author" as an option in case it was a bit too aggressive?
That's why it's not in there now. I suppose it's probably a bad idea for automated merging.
Quote:
At least once this functionality is put into Calibre as a user you will know that at any time you can periodically check to see what duplicates you have (or do so after adding a bunch of new books).
Exactly. And it's also why I'm rethinking the application of Automerge to CTL. Its current behavior has always seemed a bit subtle and slightly dangerous. It was useful as the only automatic way to find duplicates, but that aspect of its usefulness is about to be superseded by Find Duplicates.
Old 04-15-2011, 11:41 AM   #64
kiwidude
calibre/Sigil Developer
Posts: 4,601
Karma: 2092290
Join Date: Oct 2010
Location: Australia
Device: Kindle Oasis
Quote:
Originally Posted by Starson17 View Post
You do realize that automerge settings already apply to CTL, don't you? Your scenario is exactly the type of thing that worries me, even though I did agree to make automerge apply to CTL.
Yeah, I did know it is in there now; I'm just agreeing that it is the most dangerous aspect of it being in there. At least when you add books from a folder you get that dialog telling you a book got merged, so if that surprises you, you have a chance to try to rectify it (well, unless you had "overwrite" turned on).
Quote:
And it's also why I'm rethinking the application of Automerge to CTL. Its current behavior has always seemed a bit subtle and slightly dangerous. It was useful as the only automatic way to find duplicates, but that aspect of its usefulness is about to be superseded by Find Duplicates.
I guess the question is what behaviour will there be as an alternative? If I have a MOBI in library A and an EPUB in library B, then CTL with automerge is lovely jubbly. It's the overwrite/discard duplicate formats that are the "dangerous" ones imho. I guess we have to balance the complexity too if we try to get too funky with it.
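
To make the outcomes in this exchange concrete, here is a minimal sketch of what can happen when a copied book matches an existing record. The policy names ("ignore", "overwrite", "new_record") and the plain format-to-file dicts are assumptions for illustration only; this is not calibre's actual automerge code.

```python
def automerge(existing, incoming, policy="ignore"):
    """Merge `incoming` formats into `existing` (dicts: format -> file name).

    Returns (merged_record, extra_record). `extra_record` holds duplicate
    formats as a separate record when policy is "new_record", else None.
    All names here are hypothetical, not calibre's API.
    """
    merged = dict(existing)
    duplicates = {}
    for fmt, path in incoming.items():
        if fmt not in merged:
            merged[fmt] = path        # new format: nothing is lost
        elif policy == "overwrite":
            merged[fmt] = path        # destructive: the old file is replaced
        elif policy == "new_record":
            duplicates[fmt] = path    # keep both copies for later review
        # policy == "ignore": the incoming duplicate is silently dropped
    return merged, (duplicates or None)


# MOBI copied onto a record that only has an EPUB: the safe case.
safe = automerge({"EPUB": "b.epub"}, {"MOBI": "a.mobi"})
# EPUB copied onto a record that already has one: the "dangerous" case.
risky = automerge({"EPUB": "b.epub"}, {"EPUB": "a.epub"}, policy="overwrite")
```

Only the first branch is lossless; "ignore" and "overwrite" each discard one copy without the user seeing it, which is the concern raised about "copy and delete".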

I guess there are other ideas I could throw into the mix as random thoughts. Like prompting the user during CTL when it detects there are duplicates and asking the user what to do (defaulting to your automerge settings but at least reminding the user what they currently are before doing damage).

When CTL/automerge creates a new book record from a duplicate, what date does it give it? The date of creating that duplicate new book, or the date it was imported into the original library? Just curious how "difficult" it would be for a user using "Find Duplicates" to identify which record is their original and which was from the CTL action.
Old 04-15-2011, 12:10 PM   #65
chaley
Grand Sorcerer
Posts: 11,703
Karma: 6658935
Join Date: Jan 2010
Location: Notts, England
Device: Kobo Libra 2
@kiwidude: both the multisort and positionAtCenter changes are in trunk, and will be in today's release.
Old 04-15-2011, 12:25 PM   #66
Starson17
Wizard
Posts: 4,004
Karma: 177841
Join Date: Dec 2009
Device: WinMo: IPAQ; Android: HTC HD2, Archos 7o; Java:Gravity T
Quote:
Originally Posted by kiwidude View Post
When CTL/automerge creates a new book record from a duplicate, what date does it give it?
I just checked. It appears that the newer Automerge options (which include creation of a new record for dupe formats) do not apply. You either get the old "ignore dupe formats" if Automerge is on, or "copy record unchanged with no dupe warning" if Automerge is off. Perhaps I never integrated the new features to CTL. I know they are in different parts of the code and I was waiting for Kovid's input on the new Automerge code before adding it into CTL. This should be fixed, one way or the other.

If Automerge is off, the old date arrives unchanged.

(Sorry for hijacking your Find Duplicates thread a bit.)
Old 04-15-2011, 12:42 PM   #67
kiwidude
calibre/Sigil Developer
Posts: 4,601
Karma: 2092290
Join Date: Oct 2010
Location: Australia
Device: Kindle Oasis
Quote:
Originally Posted by chaley View Post
@kiwidude: both the multisort and positionAtCenter changes are in trunk, and will be in today's release.
Yay. And working wonderfully they are too.

Thanks for all your Calibre changes supporting this plugin.

Here's the new look find options dialog.
[Attached screenshot: Screenshot_2_Options.png]
Old 04-15-2011, 12:57 PM   #68
chaley
Grand Sorcerer
Posts: 11,703
Karma: 6658935
Join Date: Jan 2010
Location: Notts, England
Device: Kobo Libra 2
Quote:
Originally Posted by kiwidude View Post
Here's the new look find options dialog.
I clicked on the OK button of this dialog box and got an error. What's wrong?

On a more serious note: you are welcome, I like the dialog, and thank you (!) for taking this on.

As for the discussion of automerge vs this plugin (built-in eventually), I fall into the 'leave automerge off and check later' camp. Being naturally paranoid, I almost never enable options that automatically combine things. Merging from the results of this plugin is fine with me. Even better would be a merge that tells me what the result will look like before it does it, showing me which formats and metadata end up where, and what (if anything) will be deleted.
Old 04-15-2011, 01:15 PM   #69
kacir
Wizard
Posts: 3,447
Karma: 10484861
Join Date: May 2006
Device: PocketBook 360, before it was Sony Reader, cassiopeia A-20
I have installed the very first version of plugin.
Then I installed v. 0.2.0
kiwidude says that you have to remove "Find Duplicates.json" file.
https://www.mobileread.com/forums/sho...5&postcount=33

I couldn't find the file (and was too lazy to use the "search" tool on my filesystem, because I have a few large testing Calibre libraries).
So I didn't remove Find Duplicates.json
Bad Things (TM) started to happen. Like crash of Calibre.

So, if you run on Linux, like I do, go to ~/.config/calibre/plugins and remove the "Find Duplicates.json" file before installing v. 0.2.0. You can also remove the file *after* Calibre starts to crash on startup. Like I did ;-)
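
For anyone who would rather script that clean-up, here is a small sketch. It assumes the Linux location given above (~/.config/calibre/plugins) and the file name "Find Duplicates.json"; the function name is made up for illustration, and other platforms keep their calibre config elsewhere.

```python
from pathlib import Path


def remove_plugin_settings(config_dir=Path.home() / ".config" / "calibre",
                           name="Find Duplicates.json"):
    """Delete the plugin's settings file if it exists.

    Returns True if a file was removed, False if there was nothing to remove.
    `config_dir` defaults to the Linux location mentioned in the post.
    """
    target = Path(config_dir) / "plugins" / name
    if target.is_file():
        target.unlink()
        return True
    return False
```

Calling remove_plugin_settings() with no arguments targets the path described in the post; it is probably safest to do that while Calibre is closed.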
Old 04-15-2011, 01:20 PM   #70
kiwidude
calibre/Sigil Developer
Posts: 4,601
Karma: 2092290
Join Date: Oct 2010
Location: Australia
Device: Kindle Oasis
Quote:
Originally Posted by chaley View Post
Even better would be a merge that tells me what the result will look like before it does it, showing me which formats and metadata end up where, and what (if anything) will be deleted.
That's a good point that I hadn't brought up on this thread, and since we have diverged a little into discussing automerge we may as well touch on merge too. Finding the duplicates is just the first step in the user's process. Resolving the merges, as Charles has said, is another area that could have some TLC wrapped around it.

For instance, I would like to make it a little easier to handle the whole issue of things like:
Book 1 has EPUB, MOBI
Book 2 has EPUB, PDF
Now Book 1 might be your "master" record that you prefer the metadata on, but Book 2 has a better EPUB. You need to open the EPUB for both Book 1 and Book 2 to even find that out. Then you delete the EPUB format from Book 1 and merge Book 2 into Book 1. And that is all after you have run the cursor up and down between the books to even see which formats overlap.

I'm sure there must be a nicer way to help reduce the steps involved in that with a nice merge gui...
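
As a rough illustration of what such a merge preview could compute for the Book 1 / Book 2 case, here is a sketch under assumed data shapes (each book is a dict of format to file name). The function and field names are hypothetical and nothing here touches a real library.

```python
def preview_merge(master, other, prefer_other=()):
    """Describe merging `other` into `master` without modifying either.

    master / other: dicts mapping format -> file name.
    prefer_other: formats where the copy from `other` should win.
    """
    result = dict(master)
    replaced, discarded = [], []
    for fmt, path in other.items():
        if fmt not in master:
            result[fmt] = path              # format only exists in `other`
        elif fmt in prefer_other:
            replaced.append(master[fmt])    # master's copy would be deleted
            result[fmt] = path
        else:
            discarded.append(path)          # other's copy would be dropped
    return {
        "overlap": sorted(set(master) & set(other)),
        "result": result,
        "replaced": replaced,
        "discarded": discarded,
    }


book1 = {"EPUB": "book1.epub", "MOBI": "book1.mobi"}
book2 = {"EPUB": "book2.epub", "PDF": "book2.pdf"}
preview = preview_merge(book1, book2, prefer_other={"EPUB"})
```

Showing "overlap" up front would remove the need to cursor between the two rows, and "replaced" / "discarded" is the "what (if anything) will be deleted" part Charles asked for.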
Old 04-15-2011, 04:36 PM   #71
Starson17
Wizard
Posts: 4,004
Karma: 177841
Join Date: Dec 2009
Device: WinMo: IPAQ; Android: HTC HD2, Archos 7o; Java:Gravity T
Quote:
Originally Posted by kiwidude View Post
I'm sure there must be a nicer way to help reduce the steps involved in that with a nice merge gui...
The Find Duplicates is certainly getting a nice gui! Feel free to tackle Merge. You won't be stepping on my toes.

Automerge started as a simple: OMG, I'm never going to finish getting all my ebooks into Calibre. I had too many duplicates from global conversion to a new format. It met my personal needs, but it's showing its age as Calibre forges ahead. I traded control for quick and dirty. I really couldn't face opening multiple files, trying to decide which had better formatting, etc. I decided I was just going to keep originals, fix any formatting I didn't like, and go back to the originals if the formatting was so bad I couldn't fix it.

I agree with kiwidude that a better Merge interface is needed (but I want it optional so it doesn't interfere or slow down the keyboard based "M" Merge (delete others) and Alt-M Safe Merge (keep others), which I use constantly). I also think we need more control in the Automerge for CTL. (I kind of like Automerge now for direct Add Books). Unfortunately, it's not going to happen soon for me. I've got a lot on my plate for the next few months.
Old 04-15-2011, 07:48 PM   #72
kiwidude
calibre/Sigil Developer
Posts: 4,601
Karma: 2092290
Join Date: Oct 2010
Location: Australia
Device: Kindle Oasis
v0.3 Beta

Ok folks, here's the latest. This has one notable omission - the new "Manage exemptions for book" dialog that Charles and I were posting about earlier. I know what needs to be done, just haven't had time and wanted some feedback on a huge number of other changes/additions.

This version will need Calibre 0.7.55 (released today).

Main changes:
  • The new UI dialog I posted a screenshot of above, giving you the option to view the results one at a time or all at once
  • Viewing all results at once turns on highlighting, sorting and search restriction
  • Result groups are sorted by their "fuzzy key"
  • Display on screen is sorted by title within marked group - wording just for Charles
  • Show all exemptions applies a search restriction to just those records
  • Marking a group or groups as exemptions has a confirmation dialog where the details pane shows you all the exemptions it will add
  • Likewise the remove exemptions dialog shows you the exemptions that would be removed based on your selection
  • Added a 'Clear duplicate results' menu item for exiting either duplicate groups or duplicate exemptions
  • Placeholder menu for the 'Manage exemptions for this book' view
  • You can show exemptions then hit next result and get returned to the appropriate duplicate groups view
  • All menu items enabled/disabled based on state/selections
  • Remembers any search restriction/highlighting mode you had before you started your search for duplicates and restores it when finished
  • Various other bug-fixes, keyboard shortcut placeholders, some icons etc
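
The grouping and ordering described in the list above can be sketched roughly as follows. The fuzzy_key normalisation here (lower-case, keep only letters and digits) is an assumption for the example and not the plugin's real algorithm; books are assumed to be (id, title, author) tuples.

```python
from collections import defaultdict


def fuzzy_key(title, author):
    """Hypothetical key: lower-case and keep only letters and digits."""
    def norm(text):
        return "".join(ch for ch in text.lower() if ch.isalnum())
    return norm(title) + "|" + norm(author)


def duplicate_groups(books):
    """Group (book_id, title, author) tuples that share a fuzzy key.

    Groups come back ordered by key; books inside a group are ordered by
    title. Keys matched by only one book are not duplicates and are skipped.
    """
    buckets = defaultdict(list)
    for book in books:
        buckets[fuzzy_key(book[1], book[2])].append(book)
    return [
        sorted(buckets[key], key=lambda b: b[1])
        for key in sorted(buckets)
        if len(buckets[key]) > 1
    ]


library = [
    (1, "The Stand", "Stephen King"),
    (2, "the stand!", "Stephen King"),
    (3, "Dune", "Frank Herbert"),
    (4, "DUNE", "Frank Herbert"),
    (5, "Emma", "Jane Austen"),
]
groups = duplicate_groups(library)
```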

Definitely been an interesting plugin to work on - albeit utterly consuming my week. Once I get the manage screen sorted then other than any further suggestions that come up I think it is close to done. I would like to release it as a standalone plugin for a week or two before looking at merging it into Calibre - both to let it get thrashed a bit and to give me a break.

Other outstanding possible ideas for it I have had:
  • Adding an additional "fuzzier" algorithm. Particularly for the author side, maybe something that checks only surname. So you could catch "S Meyer" versus "S.L. Meyer" versus "Stephanie Meyer" etc. which "Similar author" does not. I don't know what to call it though - "similar title, author surname" perhaps.
  • In a similar vein there could be a fuzzier title option. One that would strip off stuff inside brackets to get rid of things like (Omnibus), (2010) etc. Maybe also anything after hyphens to catch where the filename regex brought series info into the title.
  • There is a thought now in my mind that maybe the user should be allowed to choose the title and author algorithms independently. So they could have any permutation they like (e.g. exact title/fuzziest author, fuzziest title/exact author etc). Sounds fine until you wonder how the heck you fit ISBN into that.
  • Enhancing the descriptive text for each algorithm to include some examples on the RHS of books that would be matched versus ones that would not.
  • Perhaps a star rating system for the algorithm to give a visual ranking of how strong the matching logic is. Though really a ranking system to be truly useful has to be on the match itself and we aren't going there.
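
A rough sketch of the two "fuzzier" ideas above (surname-only author matching, and titles with bracketed text and trailing series info stripped) might look like this. The function names and the exact rules are assumptions for illustration, not what the plugin does.

```python
import re


def surname_key(author):
    """Reduce an author to a lower-cased surname (the last word of the name).

    Assumes "First Last" order; a "Last, First" name would need extra handling.
    """
    parts = author.replace(".", " ").split()
    return parts[-1].lower() if parts else ""


def fuzzier_title_key(title):
    """Strip bracketed text and anything after ' - ', then normalise."""
    title = re.sub(r"\([^)]*\)|\[[^\]]*\]", " ", title)   # (Omnibus), [2010]
    title = title.split(" - ")[0]                          # trailing series info
    return re.sub(r"[^a-z0-9]+", " ", title.lower()).strip()


authors = ["S Meyer", "S.L. Meyer", "Stephanie Meyer"]
same_author = len({surname_key(a) for a in authors}) == 1
```

Splitting only on a hyphen surrounded by spaces is a deliberate choice in this sketch, so that a hyphenated title such as "Catch-22" is not cut in half.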

Enjoy, look forward to your feedback as always...

Last edited by kiwidude; 04-19-2011 at 03:57 AM. Reason: Removed attachment as later version on thread
Old 04-16-2011, 04:29 AM   #73
chaley
Grand Sorcerer
Posts: 11,703
Karma: 6658935
Join Date: Jan 2010
Location: Notts, England
Device: Kobo Libra 2
Really getting there. This version works very well.

Some comments, none of which is very important:

- Did "find dups". It found the expected 4 groups. I switched to "show non_dups", selected them all, and said to remove exemptions. It did. I then pressed "find dups", expecting to see the now 6 groups. I saw the original 4. I suggest that pushing the find dups button after removing exemptions should redo the alg.

- Same as above, but pressed manually selected find dups. Saw all six, but the sort order is wrong. The two new groups are sorted at the bottom. This is probably caused by the multisort thinking it has already sorted the screen. I suggest that you force a multisort (don't set the ignore flag) after every find_dups.

- The 'are you sure' messages should offer the checkbox to not show the message again. In particular, the mark exempt dialog should have this checkbox. The confirm method in gui2.dialogs.confirm_delete.py should be able to do the job.

- A hygiene-factor thing. When the user selects 'mark group exempt', it might be a good idea to check if any books not in the group are selected. If they are, then the user is probably confused and thinks that the selection will be used in lieu of the group. (Guess who almost did that ... )

- Idea: if you connect to gui.search.cleared, you will be notified if the user clicks the clear button on the search bar. That would permit clearing the search to do a "clear duplicate results". Should it?
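
The hygiene check suggested above (are any selected books outside the group being marked exempt?) comes down to a set difference. A minimal sketch, with hypothetical function names and plain book ids standing in for whatever the plugin really tracks:

```python
def stray_selection(selected_ids, group_ids):
    """Return the selected book ids that are not in the current duplicate group."""
    return sorted(set(selected_ids) - set(group_ids))


def should_warn_before_exempting(selected_ids, group_ids):
    """True when the selection strays outside the group, so the user may be confused."""
    return bool(stray_selection(selected_ids, group_ids))


current_group = [10, 11, 12]
warn = should_warn_before_exempting([11, 12, 40], current_group)
```

Whether to block the action or just ask for confirmation when this returns True is the open design question; the check itself is the same either way.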
Old 04-16-2011, 05:19 AM   #74
kiwidude
calibre/Sigil Developer
Posts: 4,601
Karma: 2092290
Join Date: Oct 2010
Location: Australia
Device: Kindle Oasis
Quote:
Originally Posted by chaley View Post
- Did "find dups". It found the expected 4 groups. I switched to "show non_dups", selected them all, and said to remove exemptions. It did. I then pressed "find dups", expecting to see the now 6 groups. I saw the original 4. I suggest that pushing the find dups button after removing exemptions should redo the alg.
That's a good point. Should it present the whole UI search dialog again, or just re-run the duplicate search in the background (I'm thinking the latter).
Quote:
- Same as above, but pressed manually selected find dups. Saw all six, but the sort order is wrong. The two new groups are sorted at the bottom. This is probably caused by the multisort thinking it has already sorted the screen. I suggest that you force a multisort (don't set the ignore flag) after every find_dups.
Will fix this.
Quote:
- The 'are you sure' messages should offer the checkbox to not show the message again. In particular, the mark exempt dialog should have this checkbox. The confirm method in gui2.dialogs.confirm_delete.py should be able to do the job.
Thanks, will take a look into that.
Quote:
- A hygiene-factor thing. When the user selects 'mark group exempt', it might be a good idea to check if any books not in the group are selected. If they are, then the user is probably confused and thinks that the selection will be used in lieu of the group. (Guess who almost did that ... )
Funnily enough I had started coding exactly that and then I ripped it out to see what people thought. That you have come up with it tells me it should be back in. The reason I removed it was I couldn't decide what to do if the selections don't match. For instance, should I require that only selected rows be on the current matching group? What happens if they intersect with another group? And do I just simply tell the user they are not on the right group and wait for them to select the right one, or do I warn them with a question dialog and then let them go ahead if they say yes?
Quote:
- Idea: if you connect to gui.search.cleared, you will be notified if the user clicks the clear button on the search bar. That would permit clearing the search to do a "clear duplicate results". Should it?
Sounds a good idea to me, I'll add that too, thanks.

I've been thinking more about the title/author independent algorithm thing, and the more I think about it the more I am tempted. I would like to see two sliders with labels on the tickmarks (which sadly Qt cannot do out of the box). Each slider would have a range of values something like:
"Identical", "Similar", "Vaguely Similar", "Ignore"

One slider for each of title and author. If you set both title and author to "Ignore", then it does an ISBN match. A descriptive text box would summarise the combination you had selected a little bit like it does now.

A first time user would get it set to "Identical Title", "Identical Author". The "Vaguely Similar" (or "Fuzzy" or some better name!) author and title selections would do the fuzzier algorithms I suggested above.

Any thoughts? I'm just concerned about the permutations - break them apart and the problem goes away.
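To make the combination idea concrete, here is a rough sketch of how two independent settings could be turned into the summary text. The `MatchLevel` enum and `describe_search` function are hypothetical names for illustration only, not code from the plugin.

```python
from enum import Enum


class MatchLevel(Enum):
    IDENTICAL = "Identical"
    SIMILAR = "Similar"
    VAGUE = "Vaguely Similar"
    IGNORE = "Ignore"


# Suggested first-time setting: identical title, identical author.
DEFAULT = (MatchLevel.IDENTICAL, MatchLevel.IDENTICAL)


def describe_search(title, author):
    """Return a one-line summary of the selected combination."""
    if title is MatchLevel.IGNORE and author is MatchLevel.IGNORE:
        # Both ignored: fall back to an ISBN match.
        return "Match books by ISBN"
    parts = []
    if title is not MatchLevel.IGNORE:
        parts.append(f"{title.value.lower()} title")
    if author is not MatchLevel.IGNORE:
        parts.append(f"{author.value.lower()} author")
    return "Match books with " + " and ".join(parts)
```

Because title and author are handled separately, the code only needs one branch per field rather than one per permutation.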
Old 04-16-2011, 05:49 AM   #75
chaley
Grand Sorcerer
 
Posts: 11,703
Karma: 6658935
Join Date: Jan 2010
Location: Notts, England
Device: Kobo Libra 2
Quote:
Originally Posted by kiwidude View Post
Should it present the whole UI search dialog again, or just re-run the duplicate search in the background (I'm thinking the latter).
I agree -- run the search again.
Quote:
Funnily enough I had started coding exactly that and then I ripped it out to see what people thought. That you have come up with it tells me it should be back in. The reason I removed it was I couldn't decide what to do if the selections don't match. For instance, should I require that only selected rows be on the current matching group? What happens if they intersect with another group? And do I just simply tell the user they are not on the right group and wait for them to select the right one, or do I warn them with a question dialog and then let them go ahead if they say yes?
Given that the operation adds exemptions for all the books in a group, selections really don't have any meaning. So, unless you are intending to allow subsets of the group (are you?), then I think it is sufficient to pop up a question box to tell the user that exemptions will be added for all the books in the group and the selections will be ignored -- OK? If the user is confused, then I hope s/he pushes cancel, re-locates the group, selects nothing, and does it again. Of course, you must tolerate the first book of a group being selected, or (probably better) any one book in the group.
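A minimal sketch of that check, assuming book ids are plain integers. The function name and the "proceed"/"confirm" return values are made up for the example, and treating several selected books inside the group as a reason to ask is my assumption, not something settled above.

```python
def check_exempt_selection(group_ids, selected_ids):
    """Decide how 'mark group exempt' should react to the current selection.

    Returns "proceed" when nothing is selected or exactly one book of the
    group is selected, and "confirm" when the question box should be shown
    because the selection will be ignored.
    """
    group = set(group_ids)
    selected = set(selected_ids)
    if not selected:
        return "proceed"
    if len(selected) == 1 and selected <= group:
        # Tolerate any single book of the group being selected.
        return "proceed"
    return "confirm"
```

On "confirm" the caller would show the question box, and only add the exemptions for the whole group if the user says OK.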
Quote:
I've been thinking more about the title/author independent algorithm thing, and the more I think about it, the more I am tempted. I would like to see two sliders with labels on the tick marks (which sadly Qt cannot do out of the box). Each slider would have a range of values, something like:
"Identical", "Similar", "Vaguely Similar", "Ignore"

One slider for each of title and author. If you set both title and author to "Ignore", then it does an ISBN match. A descriptive text box would summarise the combination you had selected a little bit like it does now.

A first time user would get it set to "Identical Title", "Identical Author". The "Vaguely Similar" (or "Fuzzy" or some better name!) author and title selections would do the fuzzier algorithms I suggested above.

Any thoughts? I'm just concerned about the permutations - break them apart and the problem goes away.
Although this sounds very cool, I suggest that you put it aside for the moment. Getting more feedback should be a priority at this point. I *know* I am not a normal user, and I suspect that neither Kacir nor Starson17 is either.

Putting aside the above concern, I am not convinced that sliders are the right interface. They imply a level of 'analog' behavior that isn't there, and also don't support tool tips and the like well. I would lean toward radio buttons, with two groups. Group 1 would have ISBN, then the title choices, with the first choice being ignore. Group 2 would have the author choices with the first choice being 'ignore', which would line up horizontally with the title group's ignore (nothing beside the ISBN choice). Choosing ISBN would force group 2 to ignore and disable it. Choosing any title option would enable group 2. Choosing ignore for both options can be an error, or can make one big group.
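The enable/disable rules for the two radio groups could be sketched roughly as below. The choice names and the `resolve_groups` helper are hypothetical, and "ignore" for both groups is treated as an error here (the other option mentioned is to make one big group).

```python
TITLE_CHOICES = ("isbn", "ignore", "identical", "similar", "fuzzy")
AUTHOR_CHOICES = ("ignore", "identical", "similar", "fuzzy")


def resolve_groups(title_choice, author_choice):
    """Apply the group 1 / group 2 rules and return the resulting state."""
    if title_choice not in TITLE_CHOICES or author_choice not in AUTHOR_CHOICES:
        raise ValueError("unknown choice")
    if title_choice == "isbn":
        # ISBN forces the author group to 'ignore' and disables it.
        return {"author": "ignore", "author_enabled": False, "valid": True}
    # Any title option (including 'ignore') keeps the author group enabled.
    valid = not (title_choice == "ignore" and author_choice == "ignore")
    return {"author": author_choice, "author_enabled": True, "valid": valid}
```

For example, `resolve_groups("isbn", "similar")` reports the author group as disabled and set to "ignore", whatever was chosen there before.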
All times are GMT -4.