Old 04-18-2011, 10:58 AM   #1
chaley
Grand Sorcerer
Posts: 11,741
Karma: 6997045
Join Date: Jan 2010
Location: Notts, England
Device: Kobo Libra 2
Thinking about the author_sort tweak ...

As we all know, people post continuously asking about author, author_sort, FN LN, and LN FN. Part of the problem (but by no means all) comes from the fact that people aren't consistent when they enter author names. Both forms (FN LN and LN, FN) are used.

The default author_sort algorithm is 'invert'. This means that the author Joe Blogs will produce an author_sort of Blogs, Joe. It also means that the author Blogs, Joe will produce an author_sort of Joe, Blogs,.

My question: should we change the default to 'comma'? This setting handles both FN LN and LN, FN, producing the right value (LN, FN) in both cases. My suspicion is that making this change would dramatically cut down the questions.

The major negative that I see is that using 'comma' will get names like "Joe Blogs, Jr" wrong. Of course, the invert setting also gets it wrong (Jr, Joe Blogs) unless there is no space after the comma.
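
To make the comparison concrete, here is a minimal Python sketch of the two behaviours as described in this post. It is an illustration only, not calibre's actual code; the function names `invert` and `comma` are made up to mirror the tweak values.

```python
def invert(author: str) -> str:
    """Move the last whitespace-separated token to the front, followed by a comma."""
    tokens = author.split()
    if len(tokens) < 2:
        return author.strip()
    return tokens[-1] + ", " + " ".join(tokens[:-1])


def comma(author: str) -> str:
    """Leave a name that already contains a comma alone; otherwise invert it."""
    return author.strip() if "," in author else invert(author)


for name in ("Joe Blogs", "Blogs, Joe", "Joe Blogs, Jr"):
    print(f"{name!r}: invert={invert(name)!r} comma={comma(name)!r}")
```

In this sketch 'comma' copies "Joe Blogs, Jr" through unchanged, which is the wrong answer mentioned above, while "Blogs, Joe" is left as it is instead of being mangled by the inversion.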

And yes, I am prepared to implement this.

Comments?

(@kiwidude: I know you want a wizard and the like. This change isn't intended to make your wish go away. My hope is that it will reduce confusion in a reasonable number of cases.)
Old 04-18-2011, 11:07 AM   #2
kovidgoyal
creator of calibre
Posts: 43,858
Karma: 22666666
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
I'm ok with it. Why not also make comma smarter so it gets Joe Blogs, (Jr.|Sr) right?

Something along the lines of: if the token after the comma is only two chars long, assume it's a suffix.
Old 04-18-2011, 11:09 AM   #3
kovidgoyal
creator of calibre
Actually two chars won't work (Vietnamese have two char last names). But we can test for Jr/Sr/etc explicitly.
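
A rough Python sketch of what an explicit suffix test could look like. Everything here is an assumption made for illustration: the `SUFFIXES` set, the function name, and the choice to put the suffix at the end of the sort value are not taken from calibre.

```python
# Assumed, deliberately short list; a real one would need many more entries.
SUFFIXES = {"jr", "sr"}


def comma_with_suffixes(author: str) -> str:
    """'comma' behaviour, except that 'FN LN, Jr' style names are inverted too."""
    head, sep, tail = author.rpartition(",")
    suffix = tail.strip()
    if sep and suffix.rstrip(".").lower() in SUFFIXES:
        tokens = head.split()
        if "," in head or len(tokens) < 2:
            # Already 'LN, FN, Jr' or too short to invert: copy unchanged.
            return author.strip()
        return f"{tokens[-1]}, {' '.join(tokens[:-1])}, {suffix}"
    if "," in author:
        return author.strip()
    tokens = author.split()
    if len(tokens) < 2:
        return author.strip()
    return f"{tokens[-1]}, {' '.join(tokens[:-1])}"


print(comma_with_suffixes("Joe Blogs, Jr."))
print(comma_with_suffixes("Blogs, Joe"))
```

Because the test is against a fixed list and not token length, a short token after the comma is still treated as a first name unless it appears in the list.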
Old 04-18-2011, 11:30 AM   #4
kiwidude
Calibre Plugins Developer
Posts: 4,637
Karma: 2162064
Join Date: Oct 2010
Location: Australia
Device: Kindle Oasis
Ahh, my favourite topic

It sounds like it could reduce some of the support questions which is good.

Putting aside the wizard page discussion, I still think it would be nice if Calibre gave users an explicit preferences option to the effect of "Display my author names LN, FN".

I understand it isn't trivial underneath and it has implications. But as a user it is the level I want to understand.

As a plugin developer I currently have the conundrum that every plugin needs its own configuration option to effectively replicate just that, because a three-way tweak (which is about setting the value of author sort, not the value of author) doesn't give me what I need.

With Kovid's permission I had to add a pref for metadata download that a user can check, so now we have yet another setting. And I have another one in the Goodreads sync plugin. And I have more logic in plugins like Search the Internet which attempt to detect LN, FN and order it into FN LN because many websites give best results with that. It is all sorts of variations that are workarounds and fudges because Calibre does not have a central setting that everything could look at.
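
For illustration, the kind of per-plugin fudge described above might look roughly like this in Python. It is a hypothetical sketch, not code from any of the plugins mentioned; `to_fn_ln` is an invented name.

```python
def to_fn_ln(author: str) -> str:
    """Guess that a name with exactly one comma is 'LN, FN' and reorder it to 'FN LN'."""
    parts = [part.strip() for part in author.split(",")]
    if len(parts) == 2 and all(parts):
        return f"{parts[1]} {parts[0]}"
    # No comma, or more than one: leave the name as entered.
    return author.strip()


for name in ("Blogs, Joe", "Joe Blogs", "Joe Blogs, Jr"):
    print(f"{name!r} -> {to_fn_ln(name)!r}")
```

The guess fails for "Joe Blogs, Jr", which comes out as "Jr Joe Blogs". Without a central setting each plugin has to make, and get wrong, the same guess on its own.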
Old 04-18-2011, 12:33 PM   #5
chaley
Grand Sorcerer
@kiwidude: I don't agree with you. I want to enter the authors in exactly the way I want to see them and the way I want to sort them. One consistent way is LN FN (Some Asian names), FN LN (most western names), FN de LN (many European 'nobility' names), and the like. There is no single algorithm to convert these to sort values, or even to convert these into LN-first form. For example, these names in LN-first format will be LN FN (the Asian name), LN, FN (the western name), and de LN, FN (the noble name). For the first two, the sort values are the same as the name. For the last one, the sort value is 'LN, FN de'.

@Kovid, I am not convinced that the benefit of adding suffix support is worth the complexity. We would need to introduce another tweak providing the suffixes, because there are a lot of them, for example Jr, Sr, II, III, Second, Third, PhD, MD, JD, and so on. And this is only in English. My feeling is that consistency is important, even if it gets it wrong. (Yes, I know that "A foolish consistency is the hobgoblin of little minds". I accept the label.)
Old 04-18-2011, 12:41 PM   #6
kovidgoyal
creator of calibre
Since you're writing the code, you get to decide
Old 04-18-2011, 01:57 PM   #7
Starson17
Wizard
Posts: 4,004
Karma: 177841
Join Date: Dec 2009
Device: WinMo: IPAQ; Android: HTC HD2, Archos 7o; Java:Gravity T
Quote:
Originally Posted by chaley View Post
My question: should we change the default to 'comma'?
...
Comments?
Change default to comma.

I'd add a tweak for explicit Jr., Sr. handling, but as Kovid said - you're the hobgoblin writing the code and you get to be as foolishly consistent as you want
Old 04-18-2011, 02:09 PM   #8
chaley
Grand Sorcerer
Quote:
Originally Posted by Starson17 View Post
Change default to comma.

I'd add a tweak for explicit Jr., Sr. handling, but as Kovid said - you're the hobgoblin writing the code and you get to be as foolishly consistent as you want
I happily give the code to you.
Old 04-18-2011, 02:40 PM   #9
Starson17
Wizard
Quote:
Originally Posted by chaley View Post
I happily give the code to you.
I happily take it if you don't mind me putting it on my stagnated ToDo list. I have 3 recipes, AutoMerge for Copy To Library, Merge with greater control over metadata and simplified book review already on that list. It's nowhere near as long as Kovid's or your list, but I'm a lot slower. Perhaps about when continental drift has reformed Pangaea.
Old 04-18-2011, 03:29 PM   #10
Manichean
Wizard
Posts: 3,130
Karma: 91256
Join Date: Feb 2008
Location: Germany
Device: Cybook Gen3
I'm no developer, but I think changing the default to comma is a very good idea. I just now realized that all the questions could've been answered by just telling the users to set the tweak to comma and recalculate the author_sort.
As for the changes in code, I agree that consistency is important. As to what happens consistently, that's entirely up to your enthusiasm for the task, Charles.
Old 04-18-2011, 04:36 PM   #11
chaley
Grand Sorcerer
Random thoughts, without volunteering to build anything...

@kiwidude, my understanding is that you want to be able to store authors in some known, unambiguous way. One possibility is to 'know' what is the FN and what is the LN.

@me: I don't want to have *one* way of displaying names forced on me, either FN LN or LN, FN.

Resolution: Store 'FN' and 'LN' separately in the authors table. In addition, store a flag indicating that *this author* is to be displayed in one of several formats including "LN FN", "FN LN", and "LN, FN" (these are the three I would use). Also store a default for this field that is applied to new authors. Also store a sort value for the authors, for which the default value is derived from the name.

This satisfies me, because I can control how displayed names are constructed from the stored names. All I need to do is decide how the names are separated into their parts. I think it satisfies kiwidude, because he can store the names retrieved from GOK where into the right parts of the name. If the user is as compulsive as I am about such things, then s/he would use manage_authors to control how a given author is to be displayed. The default for that flag would be set in some preference.

To implement this, we would need to change the authors table to have separate fields for FN and LN, or perhaps to have an unambiguous encoding of this information. We would also need to change the rest of calibre to display author names according to the ordering flag. This would affect the GUI, the content server, and who-knows what else.
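
As a sketch of the storage side only, here is what such a table could look like, using Python's sqlite3 module. The table layout, the column names, and `DEFAULT_DISPLAY_FORMAT` (standing in for the preference) are all hypothetical and are not calibre's schema.

```python
import sqlite3

# Stands in for the preference that supplies the flag for new authors (assumed).
DEFAULT_DISPLAY_FORMAT = "FN LN"

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE authors (
        id             INTEGER PRIMARY KEY,
        first_name     TEXT NOT NULL,
        last_name      TEXT NOT NULL,
        display_format TEXT NOT NULL
                       CHECK (display_format IN ('LN FN', 'FN LN', 'LN, FN')),
        sort           TEXT NOT NULL
    )
    """
)


def add_author(conn, first_name, last_name, display_format=None, sort=None):
    """Insert an author, deriving the flag and the sort value when not given."""
    display_format = display_format or DEFAULT_DISPLAY_FORMAT
    sort = sort or f"{last_name}, {first_name}"
    cur = conn.execute(
        "INSERT INTO authors (first_name, last_name, display_format, sort) "
        "VALUES (?, ?, ?, ?)",
        (first_name, last_name, display_format, sort),
    )
    return cur.lastrowid


add_author(conn, "Charles", "Haley")
add_author(conn, "Chun", "Lim", display_format="LN FN")
for row in conn.execute("SELECT * FROM authors ORDER BY sort"):
    print(row)
```

The sort column is stored and not computed on the fly so that a user can override the derived value for a given author.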

The question behind this post: do I understand the issue? And would the solution, if implemented, solve the problem?
Old 04-18-2011, 04:38 PM   #12
chaley
Grand Sorcerer
I have changed the default to 'comma', and pushed the change.
Old 04-18-2011, 05:06 PM   #13
Manichean
Wizard
@chaley: Database aspects aside, how would the method you're proposing differ from the behaviour we'd get when using comma as default? If I understand it correctly, it would just replicate that behaviour and thus would probably not be worth doing.
Old 04-18-2011, 05:34 PM   #14
chaley
Grand Sorcerer
Quote:
Originally Posted by Manichean View Post
@chaley: Database aspects aside, how would the method you're proposing differ from the behaviour we'd get when using comma as default? If I understand it correctly, it would just replicate that behaviour and thus would probably not be worth doing.
No, it doesn't replicate current behavior.

We currently have no real idea how a name is separated into its components. All we know (and this is what I think kiwidude was saying) is how to convert a series of words into another series of words. What I was attempting to describe was a way to apply meaning (semantics) to the words. Having this meaning, calibre could correctly display names in either order, and could maintain sort values regardless of order.

For example, consider the name Lim Chun, a person in Malaysia that I know. His name is written in the correct order, but in fact Lim is his family name and Chun is his personal name. Now consider me, Charles Haley, which is written in the correct order but my family name is Haley. What I described could handle both of these cases. The first one would have a flag indicating it was to be displayed LN FN. My name would have a flag indicating it was to be displayed FN LN. Now add Werner von Braun to the mix. The LN is "von Braun" and the FN is "Werner", and if displayed in LN FN format should be "von Braun, Werner". However, in this case the sort string is "Braun, Werner von", where in my case it is "Haley, Charles", and in Lim Chun's case it is "Lim, Chun".
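
The three names above can be written out as a small Python sketch. The `Author` class and its field names are invented for illustration; the display and sort strings are the ones given in the paragraph above.

```python
from dataclasses import dataclass


@dataclass
class Author:
    first_name: str
    last_name: str
    display_format: str  # one of "LN FN", "FN LN", "LN, FN"
    sort: str            # stored, because it cannot always be derived

    def display(self, fmt=None):
        """Render the name in the author's own format, or in an explicit one."""
        fmt = fmt or self.display_format
        if fmt == "LN FN":
            return f"{self.last_name} {self.first_name}"
        if fmt == "LN, FN":
            return f"{self.last_name}, {self.first_name}"
        return f"{self.first_name} {self.last_name}"


authors = [
    Author("Chun", "Lim", "LN FN", "Lim, Chun"),
    Author("Charles", "Haley", "FN LN", "Haley, Charles"),
    Author("Werner", "von Braun", "FN LN", "Braun, Werner von"),
]

for a in sorted(authors, key=lambda a: a.sort):
    print(f"{a.display():<18} sorts as {a.sort!r}")
```

With the parts known, `authors[2].display("LN, FN")` gives "von Braun, Werner" while the sort value stays "Braun, Werner von", which a purely word-shuffling algorithm cannot produce.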

The 'comma' algorithm does not deal correctly with all these cases.
Old 04-18-2011, 05:57 PM   #15
kovidgoyal
creator of calibre
The problem with doing that is we then have to provide UI for users to tag parts of the author names as being LN or FN and how they should be sorted. I don't think this issue is important enough to devote UI to it.

With the status quo, those who care can spend a little effort and get them just right. Those who don't (and I confess I belong to the latter camp) do not have to deal with it beyond the absolute minimum.
All times are GMT -4.