|
View Poll Results: Do you want sorting as described in the first post? | |||
Yes | 5 | 23.81% | |
No | 6 | 28.57% | |
Don't care | 10 | 47.62% | |
Voters: 21 |
12-03-2010, 10:29 AM | #16 |
Wizard
Posts: 4,004
Karma: 177841
Join Date: Dec 2009
Device: WinMo: IPAQ; Android: HTC HD2, Archos 7o; Java:Gravity T
|
It's not obvious, but sorting by author and title is almost completely independent of what's in the author and title fields. There are actually three sort fields: two author sort fields (one that sorts authors and one that sorts books by author) and one title sort field (for sorting books by title). They can be completely different from whatever is in the corresponding author/title fields.
By default, the sort fields are hidden. I suspect one could use Search and Replace to modify them by changing accented chars to whatever is desired to control sort order. (I'm not sure if S&R has access to the title sort field, but I imagine Charles would add it on request if it doesn't.)
Last edited by Starson17; 12-03-2010 at 10:33 AM. |
12-03-2010, 10:36 AM | #17 | ||
Grand Sorcerer
Posts: 11,742
Karma: 6997045
Join Date: Jan 2010
Location: Notts, England
Device: Kobo Libra 2
|
The Python code that handles sorting is under the spoiler. Most of the work of generating a value is done in the SortKeyGenerator class. Ignoring the Python magic involving iterators and multiple fields, you will see how various types are handled differently; series and dates are good examples.
The multisort function actually does the work. In particular, the call to sort() does the sorting (imagine that). It uses the sort key generator to build a sort key per record, a process that is done once when the record is first touched. The Python sort routine compares these keys using a strict value-based collation. So, to do collated sorts, the strings must be changed so that strict value-based collation generates the right result. Therefore the sort key generator would need to do character equivalence mapping on the strings when generating the key. That would happen inside the 'dt in ['text' ...]' if block.
The trick is to do the scanning and mapping in a way that is sufficiently performant but still sufficiently customizable. I could imagine using a translate table, but I haven't thought much about it.
Spoiler:
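As an illustration only (this is not the calibre source referred to above), here is a minimal sketch of the idea: a key generator that treats each datatype differently, and a sort that builds one key per record and then compares keys by value. The names make_key, multisort and text_map are hypothetical, as is the record layout; the point is only to show where a character mapping for text would plug in.

```python
from datetime import date

def make_key(value, datatype, text_map=str.lower):
    """Return a key that plain value-based comparison orders correctly."""
    if datatype == "text":
        # Character equivalence mapping would be applied here.
        return text_map(value)
    if datatype == "series":
        # Series sort by name first, then by position within the series.
        name, index = value
        return (text_map(name), float(index))
    if datatype == "datetime":
        # Undated records sort before every dated one.
        return value or date.min
    return value

def multisort(records, field, datatype, text_map=str.lower):
    # sorted() calls the key function once per record, then compares the keys.
    return sorted(records, key=lambda r: make_key(r[field], datatype, text_map))

books = [
    {"series": ("Discworld", 10)},
    {"series": ("discworld", 2)},
    {"series": ("Culture", 3)},
]
ordered = [b["series"] for b in multisort(books, "series", "series")]
```

Passing a different text_map is all it would take to change how text compares, which is why the mapping step belongs in the key generator rather than in the comparison itself.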
|
12-03-2010, 11:11 AM | #18 |
creator of calibre
Posts: 43,859
Karma: 22666666
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
This is fairly simple to do. What you have to implement is a Python C extension that defines a cmp function that compares two unicode objects. Given that, integrating it into calibre would be trivial.
In pseudo-code the thing would look like: Code:
function set_collation(lang_code) { ... }
function cmp(a, b) {
    // "a - b" means: return -1 if a < b, +1 if a > b, and 0 if a == b
    return a - b;
} |
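For readers who want to try the cmp idea without writing a C extension, a rough pure-Python stand-in could look like the sketch below. It is an assumption-laden illustration, not the proposed extension: base_letters is a hypothetical helper that ignores accents by stripping combining marks, which is far cruder than real language-aware collation, and functools.cmp_to_key is used to hand the comparison function to sorted().

```python
import unicodedata
from functools import cmp_to_key

def base_letters(s):
    """Strip combining marks and lowercase, so 'É' compares like 'e'."""
    decomposed = unicodedata.normalize("NFD", s)
    return "".join(c for c in decomposed if not unicodedata.combining(c)).lower()

def cmp(a, b):
    """Return -1 if a sorts before b, +1 if it sorts after, 0 if equal."""
    ka, kb = base_letters(a), base_letters(b)
    if ka == kb:
        # Tie on base letters: fall back to code point order for a stable result.
        ka, kb = a, b
    return (ka > kb) - (ka < kb)

names = ["Zola", "Émile", "Emma", "Elodie"]
ordered = sorted(names, key=cmp_to_key(cmp))
```

With a plain sorted(names), "Émile" would land after "Zola" because É has a higher code point than Z; with the comparison function it sorts among the other E names.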
12-03-2010, 11:16 AM | #19 |
Grand Sorcerer
Posts: 11,742
Karma: 6997045
Join Date: Jan 2010
Location: Notts, England
Device: Kobo Libra 2
|
@Kovid: I am working on a tweak (almost done) that will permit user-specified character translation. The tweak would be used in sort key gen.
My test translating 'é' to 'e' and 'ç' to 'c' works fine, and there is almost zero performance cost. I will submit the tweak shortly. |
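A user-specified character translation of this kind can be sketched with Python's built-in str.maketrans and str.translate. This is only a guess at the general shape, not the tweak itself: the name sort_translations and the helper text_sort_key are hypothetical.

```python
# Hypothetical tweak value: the user lists which characters sort as which.
sort_translations = {"é": "e", "ç": "c"}

# Build the table once; each key is then produced in a single pass over the string.
TABLE = str.maketrans(sort_translations)

def text_sort_key(s):
    """Return the string with every listed character replaced by its equivalent."""
    return s.translate(TABLE)

words = ["cz", "ça", "éb", "ea", "ec"]
ordered = sorted(words, key=text_sort_key)
plain = sorted(words)  # code point order, for comparison
```

Here ordered interleaves the accented words with their unaccented neighbours, while plain pushes them to the end of the list.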
12-06-2010, 01:16 AM | #20 |
Grand Sorcerer
Posts: 11,742
Karma: 6997045
Join Date: Jan 2010
Location: Notts, England
Device: Kobo Libra 2
|
@Everyone: Kovid integrated the relevant parts of the IBM ICU (International Components for Unicode) into calibre, after which I modified sorting and case conversion functions to use them. Sorting now works properly.
By default, calibre uses the locale specified by its language. A tweak has been provided to override that locale with a different one. See under the spoiler for the tweak documentation.
I tested the example provided by Man Eating Duck, sorting names beginning with a and å. With the locale set to 'en', the letters are equivalent. With the locale set to 'nb' (Norwegian), å sorts after z.
Tweak documentation: Spoiler:
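To make the a/å behaviour concrete, here is a toy sketch of locale-dependent sort keys. It is not the tweak documentation and it does not use ICU: the TAILORING table and the collation_key function are hypothetical stand-ins that hard-code just enough of the 'nb' rules to reproduce the test described above.

```python
import unicodedata

# Toy tailoring table, NOT ICU data: in 'nb', æ, ø and å are separate letters
# that follow z. '{', '|' and '}' are the ASCII characters right after 'z'.
TAILORING = {"nb": {"æ": "{", "ø": "|", "å": "}"}}

def strip_marks(s):
    """Drop combining marks, so 'å' becomes 'a'."""
    decomposed = unicodedata.normalize("NFD", s)
    return "".join(c for c in decomposed if not unicodedata.combining(c))

def collation_key(s, locale):
    """Build a key whose plain comparison mimics the locale's letter order."""
    s = unicodedata.normalize("NFC", s.lower())
    special = TAILORING.get(locale, {})
    return "".join(special[ch] if ch in special else strip_marks(ch) for ch in s)

names = ["Zoe", "Åse", "Adam", "Aase"]
as_en = sorted(names, key=lambda n: collation_key(n, "en"))
as_nb = sorted(names, key=lambda n: collation_key(n, "nb"))
```

With 'en', "Åse" sorts among the A names as if it were spelled with a plain a; with 'nb', it moves to the end, after "Zoe".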
|
12-11-2010, 07:14 AM | #21 |
Connoisseur
Posts: 67
Karma: 40
Join Date: Aug 2010
Device: iPad, Kindle Paperwhite
|
Although I'll have to wait until Monday to upgrade and check the new feature on my books, I couldn't wait to say that I am thankful. It's amazing how fast the calibre devs keep adding features and fixing things.
|
Tags |
accent, sorting |