#1 |
Zealot
Posts: 126
Karma: 20236
Join Date: May 2014
Device: Kindle PW v1, Kobo H2O, Onyx Boox T68
Change case in a regular expression
Greetings,
Is there a way to capitalize some text with a regex in Calibre? I think I read somewhere that it is possible in Python, but not in Calibre.
My goal: I am trying to improve my "save as" template by extracting the last name from author_sort and capitalizing it, then extracting the first name. For example: from Christie, Agatha to CHRISTIE Agatha.
For now, I have built 2 custom fields. It works, but it would be better to do it with a regex...
Thanks for your help.
#2 |
Guru
Posts: 644
Karma: 1242364
Join Date: May 2009
Location: The Right Coast
Device: PC (Calibre), Nexus 7 2013 (Moon+ Pro), HTC HD2/Leo (Freda)
OK, I'm not the best person to answer, because I definitely cannot build the regex for you. But let's start the ball rolling with some general Q&A that might inspire others to offer some regex examples.
I think it is possible to perform the regex operation that you want, but it is going to require a complex expression, depending on which kinds of last names you want to account for.
Another issue is where in calibre you plan to use this expression. In Add Books? In Edit Metadata? Somewhere else? Does the ebook source follow a distinct, rigidly enforced pattern, for instance a filename like Title - Firstname Lastname.Extension? Consistent source material helps during input.
PS: Take a look at the topic Tyranosaurus Regex, including my post (msg #17), where the efforts of others came together in a wonderful, particularly efficient regex. I didn't create it, I just stuck it all together (and I got lucky!). There also used to be a Regex example topic, but I'm not seeing it ATM.
Last edited by Sabardeyn; 06-24-2014 at 06:47 AM.
#3 |
Ex-Helpdesk Junkie
Posts: 19,421
Karma: 85400180
Join Date: Nov 2012
Location: The Beaten Path, USA, Roundworld, This Side of Infinity
Device: Kindle Touch fw5.3.7 (Wifi only)
Use this in general program mode. You can adapt it to another mode, but general program mode is great for complex work, such as combining multiple fields into one using regexes and function calls.
It will only parse the first author.
Code:
program:
    # FN is the part of author_sort before the comma, LN the part after it
    FN = re(field('author_sort'), '([^,]+),.+', '\1');
    LN = re(field('author_sort'), '[^,]+,(.+)', '\1');
    fixed_author_sort = strcat(uppercase(FN), LN)
Last edited by eschwartz; 06-25-2014 at 03:31 AM.
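Since the original question mentions Python, here is a minimal sketch of the same transformation in plain Python, for comparison only. It is not calibre template code. The function name `uppercase_last_name` is made up for illustration, and like the template above it assumes a single author in `Lastname, Firstname` form.

```python
import re


def uppercase_last_name(author_sort: str) -> str:
    """Turn 'Christie, Agatha' into 'CHRISTIE Agatha' (hypothetical helper)."""
    # Group 1: everything before the first comma (the last name).
    # Group 2: everything after the comma and any following whitespace.
    pattern = re.compile(r"^([^,]+),\s*(.+)$")
    # Passing a function as the replacement lets the case change be applied
    # per group, which a plain replacement string such as r"\1 \2" cannot do.
    return pattern.sub(
        lambda m: f"{m.group(1).upper()} {m.group(2)}", author_sort
    )


print(uppercase_last_name("Christie, Agatha"))
```

A value without a comma does not match the pattern and is returned unchanged.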
#4 |
Grand Sorcerer
Posts: 12,336
Karma: 8012652
Join Date: Jan 2010
Location: Notts, England
Device: Kobo Libra 2
This is an interesting case. It calls for a function that applies a template to each group matched by the search regular expression, giving functionality similar to Python's match groups. This function, taking a variable number of arguments, would look something like:
Code:
re_groups(field, search_expr, template_for_group_1, t_for_g_2, ...)
For the situation being discussed in this thread and using eschwartz's example, one would use something like
Code:
re_groups(field('author_sort'), '([^,]+), (.+)', 'program: uppercase($)', 'program: $')
Code:
re_groups(field('author_sort'), '([^,]+), (.+)', "[[$:uppercase()]]", '[[$]]')
Last edited by chaley; 06-25-2014 at 03:19 AM. Reason: changed syntax
#5 |
Ex-Helpdesk Junkie
Sounds awesome, chaley!
I gave up pretty quickly looking at that and took the easy way.
#6 |
Grand Sorcerer
I have it working and will submit it for the next release.
Examples using your suggested approach:
Code:
program: re_group(field('authors'), '(\S*), (\S*)', '[[$:uppercase()]] ', "[[$]]")
Code:
{authors:'re_group(field('authors'), '(\S*), (\S*)', '[[$:uppercase()]] ', "[[$]]")'}
Code:
list_re_group(src_list, separator, search_re, group_1, group_2, ...)
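To make the idea of "one template per matched group" concrete, here is a rough Python sketch of what such a function does. It is an illustration under assumptions, not calibre's implementation: the name `re_group_sketch` is hypothetical, and ordinary Python callables stand in for the per-group templates.

```python
import re
from typing import Callable


def re_group_sketch(
    value: str, pattern: str, *transforms: Callable[[str], str]
) -> str:
    """Replace each match of pattern with its groups, each passed through
    the transform in the same position (hypothetical stand-in for templates)."""

    def replace(match: re.Match) -> str:
        parts = []
        for index, text in enumerate(match.groups()):
            if text is None:
                # An optional group that did not take part in the match.
                continue
            if index < len(transforms):
                text = transforms[index](text)
            parts.append(text)
        return "".join(parts)

    return re.sub(pattern, replace, value)


result = re_group_sketch(
    "Christie, Agatha",
    r"([^,]+), (.+)",
    lambda last: last.upper() + " ",  # plays the role of '[[$:uppercase()]] '
    lambda first: first,              # plays the role of '[[$]]'
)
print(result)
```

Groups without a matching transform are copied through as they are, which is a choice made for this sketch only.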
#7 |
Zealot
Thanks for your help, guys... but you lost me, lol.
I don't know anything about program mode. I looked at the documentation and it frightened me! I don't want to waste your time, but if you're still motivated to help me, I would need more explanations, beginner-oriented please. Thank you!
Off topic: Calibre can do so many things, it's really impressive.
EDIT: Argh! Because of my English, I confused "capitalize" with "change to uppercase": from Christie, Agatha to CHRISTIE Agatha. Sorry for that!
Last edited by myki; 06-25-2014 at 05:48 AM.
#8 |
Ex-Helpdesk Junkie
Currently you can use my program: block to "fix" the author_sort for your use. I would recommend using it inside {:'program-goes-here'}, e.g.
Code:
{:'program:FN=re(field("author_sort"),"([^,]+),.+","\1");LN=re(field("author_sort"),"[^,]+,(.+)","\1");fixed_author_sort=strcat(uppercase(FN),LN)'}
Either code block works, but:
Also, you can use chaley's new function if you are willing to wait until Friday (and the next calibre release).
Last edited by eschwartz; 06-25-2014 at 06:19 AM. Reason: grammar
#9 | |
Ex-Helpdesk Junkie
Quote:
Also, technically you are capitalizing all letters.
#10 |
Color me gone
Posts: 2,089
Karma: 1445295
Join Date: Apr 2008
Location: Central Oregon Coast
Device: PRS-300
Capitalizing only the first letter of each word is often referred to as title case.
If it makes you feel any better, I don't know anything about Python (the language the program is written in) either.
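For anyone sorting out the terminology, the three case changes discussed in this thread correspond to three built-in Python string methods. This is only a small illustration in plain Python; the variable names are arbitrary.

```python
name = "agatha christie"

upper_case = name.upper()        # every letter uppercased
title_case = name.title()        # first letter of each word uppercased
capitalized = name.capitalize()  # only the first letter of the string uppercased

print(upper_case)
print(title_case)
print(capitalized)
```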
#11 |
Zealot
Thank you very much, this is very instructive!
I hesitate to change the way "author_sort" works, so in the end I found a solution: I built 2 custom columns from another one:
Code:
#author_sort_lastname with {:'uppercase('{author_sort:list_item(0,\,)}')'}
#author_sort_firstname with {author_sort:list_item(1,\,)}
Code:
{#author_sort_lastname} {#author_sort_firstname:re([.],)}/{series:'test($,'{series}/[{series}-{series_index:0>2s}] ','{#author_sort_lastname} {#author_sort_firstname} - ')'}{title}
In the tweaks, I changed:
Code:
save_template_title_series_sorting = 'strictly_alphabetic'
It was very interesting, thanks again.
#12 | |
Grand Sorcerer
Quote:
Your first template is better written as something like
Code:
{author_sort:'uppercase(list_item($, 0,\,))'}
Code:
program:
    as = field('author_sort');
    # list item removes the comma. Note that this doesn't work if there
    # are multiple authors or if the author name doesn't contain a comma
    asfn = list_item(as, 1, ',');
    asln = uppercase(list_item(as, 0, ','));
    # Use the template function as a convenience to avoid calling format_number
    has_series = template('{series}/[{series}-{series_index:0>2s}]');
    no_series = strcat(asln, ' ', asfn, ' -');
    series_val = test(field('series'), has_series, no_series);
    strcat(asln, ' ', asfn, '/', series_val, ' ', field('title'))
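For readers who find program mode hard to follow, the logic of the program above can be sketched in plain Python. This is only an illustration of the branching and concatenation, not something calibre runs. The function name `save_path_sketch` and its parameters are hypothetical, and it assumes an integer series index.

```python
def save_path_sketch(
    author_sort: str, title: str, series: str = "", series_index: int = 1
) -> str:
    """Build 'LASTNAME Firstname/...' in the spirit of the template program."""
    # Split on commas and trim each piece, as list_item does.
    parts = [piece.strip() for piece in author_sort.split(",")]
    last_name = parts[0].upper()
    first_name = parts[1] if len(parts) > 1 else ""
    # Unlike the template, this drops the stray space when there is no first name.
    author = f"{last_name} {first_name}".strip()

    if series:
        # Book in a series: Series/[Series-NN]
        middle = f"{series}/[{series}-{series_index:02d}]"
    else:
        # Standalone book: repeat the author before the title.
        middle = f"{author} -"
    return f"{author}/{middle} {title}"


print(save_path_sketch("Christie, Agatha", "Curtain"))
print(save_path_sketch("Christie, Agatha", "Curtain", "Hercule Poirot", 3))
```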
#13 |
Grand Sorcerer
Not quite true. GPM is close to required when using subtemplates. One *can* use a form of subtemplates in TPM, but it is very tricky and error prone.
#14 | |
Grand Sorcerer
Quote:
Code:
list_re_group(src_list, separator, include_re, search_re, group_1, group_2, ...)
This thread provides an example of this function's use, to uppercase the last name of each author for a book:
Code:
program: list_re_group(field('authors'), ' & ', '.', '([^,]*), (.*)', '{$:uppercase()}, ', '{$}')
Code:
{authors:'list_re_group($, ' & ', '.', '([^,]*), (.*)', '[[$:uppercase()]], ', '[[$]]')'}
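The list version can be pictured as: split the list, pick the items that match `include_re`, rewrite those items group by group, and join everything back together. The Python below is a sketch of that reading, not calibre's code. The name `list_re_group_sketch` is hypothetical, callables stand in for the group templates, and passing non-matching items through unchanged is an assumption.

```python
import re
from typing import Callable


def list_re_group_sketch(
    src_list: str,
    separator: str,
    include_re: str,
    search_re: str,
    *transforms: Callable[[str], str],
) -> str:
    """Apply per-group transforms to each list item that matches include_re."""

    def replace(match: re.Match) -> str:
        parts = []
        for index, text in enumerate(match.groups()):
            if text is None:
                continue  # optional group that did not match
            if index < len(transforms):
                text = transforms[index](text)
            parts.append(text)
        return "".join(parts)

    items = [item.strip() for item in src_list.split(separator) if item.strip()]
    rewritten = []
    for item in items:
        if re.search(include_re, item):
            item = re.sub(search_re, replace, item)
        # Assumption: items that fail include_re are kept as they are.
        rewritten.append(item)
    return separator.join(rewritten)


authors = "Christie, Agatha & Tolkien, J. R. R."
result = list_re_group_sketch(
    authors,
    " & ",
    ".",
    r"([^,]*), (.*)",
    lambda last: last.upper() + ", ",  # plays the role of '{$:uppercase()}, '
    lambda first: first,               # plays the role of '{$}'
)
print(result)
```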
#15 |
Zealot
As I said, I didn't know anything about GPM.
I feel more comfortable with program mode, thanks to the use of variables. And the writing is more vertical, whereas normal mode is more horizontal.
I'm curious: is it possible to use variables in normal mode, and how? Is it possible to increase the font size inside the template editor? Thanks for your help...
@chaley: If I had had your function, it would have been so much easier for me to harmonize my authors the first time I built my database.