Quote:
Originally Posted by kiwidude
The authors point is an interesting one. One issue which I didn't see you mention of having multiple author cache entries is replication of groups.
|
You are right. Hmmm...
Quote:
I guess the question is whether this is a problem or not. If the user resolves their duplicates in order, the second group if identical would disappear automatically. If they skip through them with highlighting it may jump around a bit but still be valid.
|
Thinking on paper and (I think) agreeing with you: what you are saying is that adding a book to multiple buckets can create a situation where one group is a (possibly improper) subset of another. It seems to me that there isn't much point in showing both groups, at least in author mode. For example, why show a group containing (1,2,3) and another containing (2,3)?
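To make the subset situation concrete, here is a minimal sketch of how putting a book into one bucket per author can yield a group that is wholly contained in another. The book ids, author names, and the `groups_by_author` helper are all hypothetical and are not taken from the plugin code; they only illustrate the (1,2,3) versus (2,3) case.

```python
from collections import defaultdict

# Hypothetical data: book id -> list of authors (not taken from the plugin).
books = {
    1: ["Smith"],
    2: ["Smith", "Jones"],
    3: ["Smith", "Jones"],
}

def groups_by_author(books):
    """Put each book in one bucket per author; buckets with 2+ books become groups."""
    buckets = defaultdict(set)
    for book_id, authors in books.items():
        for author in authors:
            buckets[author].add(book_id)
    return [ids for ids in buckets.values() if len(ids) > 1]

groups = groups_by_author(books)
# Sorted only so the printed result does not depend on dict/set ordering.
print(sorted(sorted(g) for g in groups))  # [[1, 2, 3], [2, 3]]
```

Books 2 and 3 land in both the "Smith" and the "Jones" bucket, so the "Jones" group repeats part of the "Smith" group.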
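Subsets can be removed rather easily. The check compares each group against every group of equal or greater size, so the cost grows roughly with the square of the number of groups; that should be acceptable as long as there aren't thousands of groups. Something like this: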
Code:
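def clean_dup_groups(dups):
    # Work on sets so that subset tests ignore the order of the ids.
    res = [set(d) for d in dups]
    # Smallest groups first: any group containing a given group then sits later in the list.
    res.sort(key=len)
    ans = []
    for i, a in enumerate(res):
        for b in res[i+1:]:
            if a.issubset(b):
                # a is contained in (or equal to) a later group, so drop it.
                break
        else:
            # No later group contains a, so keep it.
            ans.append(a)
    return ans

# print(x) with a single argument behaves the same on Python 2 and Python 3.
dups = [(1,2,3),(4,5)]
print(dups)
print(clean_dup_groups(dups))
print('========================')
dups = [(1,2,3,4,5), (1,6,7)]
print(dups)
print(clean_dup_groups(dups))
print('========================')
dups = [(1,2,3,4,5), (1,6,7), (1,6,7)]
print(dups)
print(clean_dup_groups(dups))
print('========================')
dups = [(1,2,3,4,5), (1,6,7), (3,4), (6,7)]
print(dups)
print(clean_dup_groups(dups))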
with output:
[(1, 2, 3), (4, 5)]
[set([4, 5]), set([1, 2, 3])]
========================
[(1, 2, 3, 4, 5), (1, 6, 7)]
[set([1, 6, 7]), set([1, 2, 3, 4, 5])]
========================
[(1, 2, 3, 4, 5), (1, 6, 7), (1, 6, 7)]
[set([1, 6, 7]), set([1, 2, 3, 4, 5])]
========================
[(1, 2, 3, 4, 5), (1, 6, 7), (3, 4), (6, 7)]
[set([1, 6, 7]), set([1, 2, 3, 4, 5])]
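If the number of groups ever did become a concern, one possible variation (a sketch only, written for Python 3, and not the code posted above) is to sort the largest groups first and compare each group only against the groups already kept. The function name `remove_subset_groups` is made up for this example. The worst case is still quadratic, but fewer comparisons are needed when many groups get discarded.

```python
def remove_subset_groups(groups):
    """Drop every group that is a subset of (or equal to) another group."""
    # frozenset makes groups hashable, so exact duplicates collapse up front.
    unique = {frozenset(g) for g in groups}
    # Largest first: any group that contains a given group is examined before it.
    candidates = sorted(unique, key=len, reverse=True)
    kept = []
    for group in candidates:
        # Checking against kept groups is enough: a discarded group is itself
        # contained in some kept group, which then also contains this one.
        if not any(group <= other for other in kept):
            kept.append(group)
    return kept

dups = [(1, 2, 3, 4, 5), (1, 6, 7), (3, 4), (6, 7), (1, 6, 7)]
result = remove_subset_groups(dups)
# Sorted only to make the printed form independent of set ordering.
print(sorted(sorted(g) for g in result))  # [[1, 2, 3, 4, 5], [1, 6, 7]]
```

The result matches the last test case above: (3,4) and (6,7) are dropped as subsets, and the repeated (1,6,7) appears once.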
Quote:
If they added exemptions using mark all groups it would create some duplication I think but not a major drama.
|
Again, I don't see a reason to keep exemption groups that are subsets of another group. The same set cleanup would fix this, eliminating the subsets.