Old 04-25-2011, 05:16 AM   #151
chaley
Grand Sorcerer
Posts: 12,447
Karma: 8012886
Join Date: Jan 2010
Location: Notts, England
Device: Kobo Libra 2
Quote:
Originally Posted by kiwidude View Post
@chaley - actually I think I found another issue with that algorithm. I don't think it actually "works".

For instance if I set
dups = [(3,4),(3,5)]
initial_dups=[1,2,3,4,5,6]

The results it gives me are:
[1,2,3,4,5],[1,2,4,5,6]

Look at the first group - it has 3 and 4 together. Yet they are specifically exempted from appearing together in a group, and instead 6 has been removed?
The problem you found has to do with how the non-dup sets were built. The code assumed that the sets were complete, but in your example they are not, and so needed to be 'unioned'. Changing the dup_map building code to do that (1 character), the partitioning algorithm produces the right answer.
Code:
books known not to be duplicates [(3, 4), (3, 5)]
candidate duplicates [1, 2, 3, 4, 5, 6]
After partitioning [[1, 2, 3, 6], [1, 2, 4, 5, 6]]
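To make the one-character fix concrete, here is a minimal sketch in Python 3. It assumes the original map-building code used plain assignment where the corrected code uses `|=`. The function name `build_not_duplicate_map` and its `union` flag are made up for illustration only. With assignment, the entry for book 3 only remembers the last pair it appeared in, so the (3, 4) exemption is lost.

```python
from collections import defaultdict


def build_not_duplicate_map(pairs, union=True):
    """Map each book id to the set of ids it must not share a group with.

    Each set also contains the book id itself.
    Hypothetical helper: union=False mimics the assumed original behaviour.
    """
    result = defaultdict(set)
    for pair in pairs:
        members = set(pair)
        for book_id in pair:
            if union:
                result[book_id] |= members   # accumulate across all pairs
            else:
                result[book_id] = members    # overwrites what earlier pairs stored
    return dict(result)


pairs = [(3, 4), (3, 5)]
print(sorted(build_not_duplicate_map(pairs, union=False)[3]))  # [3, 5]
print(sorted(build_not_duplicate_map(pairs, union=True)[3]))   # [3, 4, 5]
```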
That said, the partitioning algorithm I used was stupid. Although you have already done your own and therefore don't need this, for completeness here is a much better one that avoids the highly exponential behavior of the last one. It is still not linear, but I don't think it can be because of the nature of partitioning. This code also contains the one-character change to correctly build the duplicate sets.
Spoiler:
Code:
from collections import defaultdict

# Construct map of books that are not duplicates
dups = [(3,4), (3,5), (3, 6), (7, 8), (8, 9)]
print 'books known not to be duplicates', dups
not_duplicate_of_map = defaultdict(set)
for t in dups:
  s = set(t)
  for b in t:
    not_duplicate_of_map[b] |= s

# Simulate a test
initial_dups = [i for i in xrange(1,10)]
initial_dups.sort()
print 'candidate duplicates', initial_dups

# Initial condition -- the group contains 1 set of all elements
results = [set(initial_dups)]
partitioning_ids = [None]
# Loop through the set of duplicates, checking to see if the entry is in a non-dup set
for one_dup in initial_dups:
    if one_dup in not_duplicate_of_map:
        # The entry is indeed in a non-dup set. We may need to partition
        for i,res in enumerate(results):
            if one_dup in res:
                # This result group contains the item with a non-dup set. If the item
                # was the one that caused this result group to partition in the first place,
                # then we must not partition again or we will make subsets of the group 
                # that split this partition off. Consider a group of (1,2,3,4) and
                # non-dups of [(1,2), (2,3)]. The first partition will give us (1,3,4)
                # and (2,3,4). Later when we discover (2,3), if we partition (2,3,4)
                # again, we will end up with (2,4) and (3,4), but (3,4) is a subset 
                # of (1,3,4). All we need to do is remove 3 from the (2,3,4) partition.
                if one_dup == partitioning_ids[i]:
                    results[i] = (res - not_duplicate_of_map[one_dup]) | set([one_dup])
                    continue
                # Must partition. We already have one partition, the one in our hand.
                # Remove the dups from it, then create new partitions for each of the dups.
                results[i] = (res - not_duplicate_of_map[one_dup]) | set([one_dup])
                for nd in not_duplicate_of_map[one_dup]:
                    # Only partition if the duplicate is larger than the one we are looking
                    # at. This is necessary because the non-dup set map is complete,
                    # map[2] == (2,3), and map[3] == (2,3). We know that when processing
                    # the set for 3, we have already done the work for the element 2.
                    if nd > one_dup and nd in res:
                        results.append((res - not_duplicate_of_map[one_dup]) | set([nd]))
                        partitioning_ids.append(nd)

print 'After partitioning'
sr = []
for r in results:
    if len(r) > 1:
        sr.append(sorted(list(r)))
sr.sort()
for r in sr:
    print r
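A quick way to sanity-check any partitioning result is to confirm that no group contains a pair that was marked as not-duplicates. The sketch below is in Python 3 and is not part of the plugin; the helper name `exemption_violations` is invented for this example.

```python
from itertools import combinations


def exemption_violations(groups, not_duplicate_pairs):
    """Return every exempt pair that still appears together in some group."""
    exempt = {frozenset(pair) for pair in not_duplicate_pairs}
    violations = []
    for group in groups:
        for a, b in combinations(sorted(group), 2):
            if frozenset((a, b)) in exempt:
                violations.append((a, b))
    return violations


pairs = [(3, 4), (3, 5)]
# The result kiwidude reported: 3 shares a group with both 4 and 5.
print(exemption_violations([[1, 2, 3, 4, 5], [1, 2, 4, 5, 6]], pairs))  # [(3, 4), (3, 5)]
# The corrected result quoted earlier in this post.
print(exemption_violations([[1, 2, 3, 6], [1, 2, 4, 5, 6]], pairs))     # []
```

This only checks that exemptions are respected. It does not check that the groups are as large as they could be.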
Old 04-25-2011, 05:32 AM   #152
kiwidude
Calibre Plugins Developer
Posts: 4,730
Karma: 2197770
Join Date: Oct 2010
Location: Australia
Device: Kindle Oasis
Hey Charles,

Yeah, I eventually figured out that the map loading in the test code was suspect after I put my own algorithm in and got unexpected results, haha. Your new version takes the same approach I took, but with more concise code in that set magic in the middle, so I shall steal it verbatim, thanks!
Old 04-25-2011, 05:34 AM   #153
kiwidude
Calibre Plugins Developer
Posts: 4,730
Karma: 2197770
Join Date: Oct 2010
Location: Australia
Device: Kindle Oasis
You have a preference on the gui?

Last edited by kiwidude; 04-25-2011 at 05:41 AM.
Old 04-25-2011, 07:23 AM   #154
chaley
Grand Sorcerer
Posts: 12,447
Karma: 8012886
Join Date: Jan 2010
Location: Notts, England
Device: Kobo Libra 2
Quote:
Originally Posted by kiwidude View Post
You have a preference on the gui?
I assume this is aimed at me?

Yes. C) None of the above.

I prefer screenshot 2, but changed to something like the following:
[Attached image: Clipboard01.png]
I didn't bother to write code to center the radio buttons in the grid.

This example shows 2 things. The first is the use of common column labels, making explicit the coupling implicit in columns. The second is introduction of soundex for author names, which seems reasonable, given that soundex was invented for names.

I prefer the column layout even if you don't want to add soundex for authors. In this case I would have two empty holes in the grid. What I am trying to achieve is common headers.

An alternate that might work better for me is shown below. This one avoids the notion of columns and rows, instead using groups. I think this layout reduces the semantic coupling between the two sets of options caused by the row/column layout.
[Attached image: Clipboard02.png]
However, both of your proposals work. I won't be unhappy if you pick either one.

Last edited by chaley; 04-25-2011 at 07:46 AM. Reason: Make last sentence clearer.
Old 04-25-2011, 07:36 AM   #155
DoctorOhh
US Navy, Retired
Posts: 9,897
Karma: 13806776
Join Date: Feb 2009
Location: North Carolina
Device: Icarus Illumina XL HD, Kindle PaperWhite SE 11th Gen
Quote:
Originally Posted by kiwidude View Post
Quote:
Originally Posted by chaley View Post
An alternate that might work better for me is shown below. This one avoids the notion of columns and rows, instead using groups. I think this layout reduces the semantic coupling between the two sets of options caused by the row/column layout.
I find the two above to be functionally the same.

Kiwidude, of the two you presented I do like the one above the best.
Old 04-25-2011, 07:51 AM   #156
kiwidude
Calibre Plugins Developer
Posts: 4,730
Karma: 2197770
Join Date: Oct 2010
Location: Australia
Device: Kindle Oasis
Hee, hee.

Ok, attached is your first variation. I do agree with you that the second variation (which is what you suggested ages ago) is probably the better bet though. The only thing I had against it was ending up with a more "vertical" dialog. However thinking about it now it really wouldn't be that much taller.
[Attached image: Screenshot_2_Options.png]

We shall see what I end up with when I push 0.6... haha. The plumbing is all done supporting all the various permutations etc, I just need to start tuning the implementations of the algorithms.

@dwanthny - thx and I agree that of the two I originally presented the one you picked was my preference too. However I could be swayed into the second of chaley's suggestions. I just think with the approach I have taken the brain has to do a little more cross-referencing (particularly when stripping out the second row of titles). It is more concise in layout, but perhaps at the expense of ease of use.

Last edited by kiwidude; 04-25-2011 at 07:57 AM.
Old 04-25-2011, 02:15 PM   #157
kiwidude
Calibre Plugins Developer
Posts: 4,730
Karma: 2197770
Join Date: Oct 2010
Location: Australia
Device: Kindle Oasis
v0.6 Beta

Changes in this release:
  • New gui allowing any combination of title and author algorithms
  • Changed similar author algorithm to be more conservative (all initials must match)
  • Added a fuzzy author algorithm which compares using last name and first initial of first name. Ignores common suffixes like Jr, Sr etc
  • Added a fuzzy title algorithm which strips any subtitles, and anything after keywords of "and", "or" and "aka" provided they are not the first word in the title. Pretty aggressive but catches a lot of cases mentioned on this thread
  • Added a soundex algorithm for both titles and authors. This may need some tuning for the length of the soundex but is pretty handy for catching common misspellings in titles
  • Tweaked the "identical" title/author comparisons to ensure casing differences are treated as a match.
  • Added option to sort the results by groups with the most candidates first
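
As a rough illustration of the fuzzy author idea in the list above (last name plus first initial, suffixes ignored), here is a minimal Python 3 sketch. It is not the code in "algorithms.py". The function name `fuzzy_author_key`, the suffix list, and the assumption that names are written "First Last" are all mine.

```python
import re

# Assumed suffix list; the plugin's actual list is not shown in this thread.
SUFFIXES = {"jr", "sr", "ii", "iii", "iv"}


def fuzzy_author_key(author):
    """Reduce an author name to 'lastname initial' for loose comparison."""
    # Keep runs of letters only, so punctuation such as "N." becomes "n".
    words = re.findall(r"[^\W\d_]+", author.lower())
    words = [w for w in words if w not in SUFFIXES]
    if not words:
        return ""
    if len(words) == 1:
        return words[0]
    # Assumes "First [Middle] Last" ordering.
    return words[-1] + " " + words[0][0]


print(fuzzy_author_key("Nora Roberts"))       # roberts n
print(fuzzy_author_key("N. Roberts"))         # roberts n
print(fuzzy_author_key("Robert Ludlum Jr."))  # ludlum r
```

Two books would then be candidate duplicates when their keys are equal.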

For the more technically minded (or interested) you can now find all the algorithms and test code/cases for them in "algorithms.py" in the zip file. You can run this yourself with "calibre-debug -e algorithms.py". So you can see the range of permutations I currently test for and those that I still expect not to be caught.

In terms of the examples posted earlier on this thread, I think all of them can now be found by one algorithm or another, with the exception of this one:
Foundation 5 - Foundation and Earth
Foundation and Earth

It will however find this:
Foundation and Earth - Foundation 5
Foundation and Earth
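
To show why the first pair is missed and the second is caught, here is a hedged Python 3 sketch of a fuzzy title key along the lines described in the release notes. It is an approximation, not the plugin's code. The function name `fuzzy_title_key` is invented, and I am assuming a subtitle starts at the first ":" or " - ".

```python
import re

KEYWORDS = {"and", "or", "aka"}


def fuzzy_title_key(title):
    """Strip the subtitle, then cut the title at 'and', 'or' or 'aka'."""
    # Assumption: everything after the first ':' or ' - ' is a subtitle.
    main = re.split(r":|\s-\s", title.lower(), maxsplit=1)[0]
    words = re.findall(r"[a-z0-9]+", main)
    for index, word in enumerate(words):
        # A keyword in first position is kept, as the release notes describe.
        if index > 0 and word in KEYWORDS:
            words = words[:index]
            break
    return " ".join(words)


print(fuzzy_title_key("Foundation 5 - Foundation and Earth"))  # foundation 5
print(fuzzy_title_key("Foundation and Earth - Foundation 5"))  # foundation
print(fuzzy_title_key("Foundation and Earth"))                 # foundation
```

With the series prefix first, the real title is treated as the subtitle and thrown away, so the keys differ.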

Of course it is pretty easy to do a sanity check on your library using "title:-" or the Quality Check plugin to detect such cases and fix them before you do your duplicate run.

Look forward to hearing what you think. My todo list with this is now done - with the possible exception of some slightly improved tag browser
[Attached image: Screenshot_2_Options.png]
[Attached file: Find Duplicates.zip]

Last edited by kiwidude; 04-25-2011 at 04:57 PM. Reason: Updated to 0.6.1
Old 04-25-2011, 04:53 PM   #158
chaley
Grand Sorcerer
Posts: 12,447
Karma: 8012886
Join Date: Jan 2010
Location: Notts, England
Device: Kobo Libra 2
This plugin is fun to play with.

Problem: exception using title:soundex, Author:ignore. Edit: it happens for all title != ignore.
Code:
calibre, version 0.7.57
ERROR: Unhandled exception: NameError: global name 'not_duplicate_of_map' is not defined

Traceback (most recent call last):
  File "calibre_plugins.find_duplicates.action", line 155, in toolbar_button_clicked
  File "calibre_plugins.find_duplicates.action", line 150, in find_duplicates
  File "calibre_plugins.find_duplicates.duplicates", line 79, in run_duplicate_check
  File "calibre_plugins.find_duplicates.algorithms", line 285, in run_duplicate_check
  File "calibre_plugins.find_duplicates.algorithms", line 321, in convert_candidates_to_groups
  File "calibre_plugins.find_duplicates.algorithms", line 383, in partition_using_exemptions
NameError: global name 'not_duplicate_of_map' is not defined
Old 04-25-2011, 04:58 PM   #159
kiwidude
Calibre Plugins Developer
Posts: 4,730
Karma: 2197770
Join Date: Oct 2010
Location: Australia
Device: Kindle Oasis
Oops - didn't do a very good job of pasting in your code now, did I? lol.

New version updated on the previous post.
Old 04-25-2011, 05:13 PM   #160
chaley
Grand Sorcerer
Posts: 12,447
Karma: 8012886
Join Date: Jan 2010
Location: Notts, England
Device: Kobo Libra 2
This is really a lot of fun. I tried soundex title, ignore author, and it finds very surprising things. I have a lot of books in French, and it matches these against the English titles (no surprise). I actually found 2 new duplicates. However, I am mystified why "20,000 Leagues under the Sea" matches "L'Assassin du roi".

@kiwidude: this is really good stuff. I suggest that you make it generally available as soon as you are comfortable with doing so.
Old 04-25-2011, 05:14 PM   #161
kiwidude
Calibre Plugins Developer
Posts: 4,730
Karma: 2197770
Join Date: Oct 2010
Location: Australia
Device: Kindle Oasis
Quote:
Originally Posted by chaley View Post
This plugin is fun to play with.
Yes it does have an element of playing the pokies about it - you pick a combination and pull the handle to see what turns up

The soundex is one for sure that may need some refining. I had to tweak the algorithm off that link - it blew up on titles with non ascii characters in the names so now I ignore those. There is also the question of what length to make the soundex - too short and your buckets are too big, too long and it might not be fuzzy enough. As a starting point I chose a length of 6 for titles and 8 for authors but these were relatively arbitrary based on some random sampling.

You could potentially expose this on the duplicate options dialog I guess if you wanted to allow users to tune to their liking? I guess it depends on how much control we want to offer if any.

When soundex is applied to authors, I try to apply to the surname first and then the rest. So if you had "Robert Cross" and "Robert Ludlum" they shouldn't appear together from a soundex match, but "Nora Roberts" and "N. Roberts" would.
Old 04-25-2011, 05:30 PM   #162
kiwidude
Calibre Plugins Developer
Posts: 4,730
Karma: 2197770
Join Date: Oct 2010
Location: Australia
Device: Kindle Oasis
Quote:
Originally Posted by chaley View Post
However, I am mystified why "20,000 Leagues under the Sea" matches "L'Assassin du roi".
Haha, I'm not hugely surprised by that one. For a start, anything that isn't a-z gets thrown away. So then you are down to:
leaguesunderthesea
lassassinduroi

The algorithm is designed to discard repeating mapped letters with the same "value" for that character. e and a have the same value, so do s and g. etc. etc - and you end up with a soundex match unless we crank up the soundex length
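
For anyone who wants to see the effect for themselves, here is a textbook-style soundex sketch in Python 3. It is not the tweaked version in the plugin, and the `length` parameter here counts the leading letter plus the digits, which is my assumption. Under those assumptions it does reproduce the match at length 6 and separates the two titles at length 8.

```python
import re

# Standard soundex groups; vowels and h/w/y get "0" and are never emitted.
CODES = {}
for digit, letters in enumerate(
        ["aeiouyhw", "bfpv", "cgjkqsxz", "dt", "l", "mn", "r"]):
    for letter in letters:
        CODES[letter] = str(digit)


def soundex(text, length=6):
    """Return a soundex-style code of exactly `length` characters."""
    letters = re.sub(r"[^a-z]", "", text.lower())  # anything not a-z is dropped
    if not letters:
        return ""
    result = letters[0].upper()
    previous = CODES[letters[0]]
    for letter in letters[1:]:
        code = CODES[letter]
        # Skip vowels and immediate repeats of the same code.
        if code != "0" and code != previous:
            result += code
        previous = code
    return result[:length].ljust(length, "0")


print(soundex("20,000 Leagues under the Sea"))     # L22536
print(soundex("L'Assassin du roi"))                # L22536
print(soundex("20,000 Leagues under the Sea", 8))  # L2253632
print(soundex("L'Assassin du roi", 8))             # L2253600
```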

Quote:
@kiwidude: this is really good stuff. I suggest that you make it generally available as soon as you are comfortable with doing so.
Thanks, I am rather chuffed with how it has turned out, obviously the input from yourself and Starson in particular has been awesome to feed into it. The only thing that I might be tempted to add which I didn't finish my sentence on was improving the tag browser integration.

At the moment when you do an author based search, I expand the authors node in the tag browser and make it visible. The next step is to add to that to take the first author from the group under consideration and ensure that node is visible in the tag browser. Otherwise (when you have a lot of authors) you have all those collapsed author groups and still have to do a bit of hunting to actually get to the author node.

The other aspect to that is would some users get grumpy about the tag browser continually popping into view? Perhaps I should add a checkbox on the search options dialog so users could choose not to use that function (like when working on small screens). Although if they need to rename an author it is the "best" way of doing so though I guess they could do it from the bulk metadata dialog.

The third aspect of that is if I made it an option to then support it for title based searches as well. While it would be slightly less used for renaming authors with those searches it still could be.

If I can figure out those questions (plus whether to offer a soundex spinbox) then will stick it out as a 1.0 plugin.
Old 04-26-2011, 04:43 AM   #163
chaley
Grand Sorcerer
Posts: 12,447
Karma: 8012886
Join Date: Jan 2010
Location: Notts, England
Device: Kobo Libra 2
Quote:
Originally Posted by kiwidude View Post
At the moment when you do an author based search, I expand the authors node in the tag browser and make it visible. The next step is to add to that to take the first author from the group under consideration and ensure that node is visible in the tag browser. Otherwise (when you have a lot of authors) you have all those collapsed author groups and still have to do a bit of hunting to actually get to the author node.
Yes, move to the author. Question: should it be boxed? I think so, because it will help make it clear what is happening.
Quote:
The other aspect to that is would some users get grumpy about the tag browser continually popping into view? Perhaps I should add a checkbox on the search options dialog so users could choose not to use that function (like when working on small screens).
I do think you need the checkbox. Why? There are 3 kinds of users involved:
1) Those that don't understand the tag browser and can use some help. They won't change the box. Having the browser select the author might help them understand what one can do with the browser, although it will also be another example of mysterious behavior.
2) Those that understand and use the tag browser. The integration will help.
3) Those that understand but do not use the tag browser. Having it pop open would be a curse.

I also wonder about how you do the positioning. Should you just find the node and position on it, or should you enter something into the tag browser search box and trigger a search? My guess is that the former is what you should do, because a) another find won't find anything different, and b) probably the *vast* majority of people have no clue what that box is for and wouldn't notice the text in it.
Quote:
The third aspect of that is if I made it an option to then support it for title based searches as well. While it would be slightly less used for renaming authors with those searches it still could be.
This I am not sure of. If I am thinking 'titles' but using author to help filter, then having the tag browser select the author could be an annoyance. If I am thinking 'authors' but using the title as a filter, then it is a help. Of course, you can't know what I am thinking.

My guess is that an option is required. If title is set to Ignore, the option defaults 'on'. If title is set to anything else, the option defaults 'off'.

Do you remember the settings of these options? The tag browser checkbox would be a good candidate for remembering. If someone doesn't want it, s/he probably *really* doesn't want it.
Old 04-26-2011, 05:41 AM   #164
kiwidude
Calibre Plugins Developer
Posts: 4,730
Karma: 2197770
Join Date: Oct 2010
Location: Australia
Device: Kindle Oasis
Hi Charles, thx for your feedback again.

Yeah I was thinking of just boxing the node to ensure it is visible, no searches. In a similar way to what my user category plugin does. So it would be a call to "self.gui.tags_view.model().find_item_node()" then "show_item_at_path()" or something like that.

All settings are persisted in the config file as soon as the user clicks ok in the dialog so yes everything is "remembered".

Sounds like we agree the user needs an option to disable the tag browser integration. It is just how that would interoperate with changes to the type of title search that I'm not clear about.

The simplest option would be a single checkbox that says "Show the first author in each group in the tag browser". This would take effect regardless of the type of search.

The second option would be to name it "Show the first author in the tag browser for ignore title searches". As per the name, doing anything but an author duplicate search would never manipulate the tag browser.

The third option would be to offer two checkboxes. One for "ignore title searches" and one for "title searches". We could enable/disable the appropriate checkboxes as the user switches the title match radio buttons so the user knows which is relevant.

I can't think of how a setting which changes each time you click a radio button can successfully work in combination with a "remembered" setting without the user wondering wtf is going on?

What did you think of the soundex - shall we let the user tune it with spinboxes or not?
Old 04-26-2011, 12:11 PM   #165
kiwidude
Calibre Plugins Developer
Posts: 4,730
Karma: 2197770
Join Date: Oct 2010
Location: Australia
Device: Kindle Oasis
After more thought I propose the following:

My original requirement was the most likely scenario of a user doing an ignore title based search (so looking for duplicate authors) and on finding a group wanting to rename the authors involved. Forget about tag browser authors for title based searches. Yes it is "possible" a duplicate author could be found that way but the workflow I would always recommend is to do your "ignore title" searches first which should ensure any author renaming is already handled. Then focus on looking for various title matches with identical authors.

So... a checkbox related to just author based searches should do the trick of "Highlight authors in tag browser for ignore title searches".

I then decided to draw boxes around *all* the authors in the current group in the tag browser. I think that works REALLY well to give visual focus to just the authors under consideration for this group. Obviously it doesn't apply to when viewing one group at a time, as all the authors would be boxed. Though I think it appropriate to have the tag browser visible and the authors node expanded and ready.

One decision left... should I offer people the ability to tune the fuzziness of their soundex matches?
[Attached image: Highlight2.png]