Old 12-20-2018, 11:03 AM   #1
BlackCanopus
Junior Member
Posts: 8
Karma: 10
Join Date: Jun 2018
Device: Android Phone, Samsung Galaxy Note 5, Moon + Reader Pro
RegEx question: Phrases starting with a small letter only

Hi

I am new to RegEx, but I want to use 'Tag mapper' and RegEx to put my tags in order (I imagine you are already cursing me).

I want to capitalize all the tags that begin with a small letter.
My ideal conversion is "this test" -> "This Test", but I think that is not possible. So "this test" -> "This test" will also do.

I tried this:
Tag mapper > Capitalize the tag, if it matches pattern:
Code:
^[a-z].+?$
It capitalizes phrases such as:
Code:
this test
but it also capitalizes phrases such as
Code:
John Milton
and converts it to
Code:
John milton
I don't want that to happen. This phrase is already capitalized; it does not begin with a small letter, so why does the RegEx match it?

It seems I'm doing something wrong. Your help would be appreciated.
Old 12-20-2018, 09:54 PM   #2
kovidgoyal
creator of calibre
Posts: 43,860
Karma: 22666666
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
iirc the regex is treated as case-insensitive which is why a-z matches upper case letters as well.
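
A minimal Python sketch of that effect, using the standard re module rather than calibre itself. That the ignore-case flag is passed is taken from the post above; the names PATTERN and matches are made up for the example.

```python
import re

PATTERN = r"^[a-z].+?$"

# Compiled without flags, [a-z] only matches lowercase letters.
case_sensitive = re.compile(PATTERN)
# Compiled with IGNORECASE, [a-z] also matches A-Z.
case_insensitive = re.compile(PATTERN, re.IGNORECASE)


def matches(compiled, tag):
    """Return True if the compiled pattern matches the tag from its start."""
    return compiled.match(tag) is not None


for tag in ("this test", "John Milton"):
    print(tag, matches(case_sensitive, tag), matches(case_insensitive, tag))
```

With the flag set, "John Milton" matches as well, which would explain why it gets capitalized to "John milton".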
Old 12-21-2018, 05:07 AM   #3
BlackCanopus
Junior Member
Posts: 8
Karma: 10
Join Date: Jun 2018
Device: Android Phone, Samsung Galaxy Note 5, Moon + Reader Pro
Quote:
Originally Posted by kovidgoyal View Post
iirc the regex is treated as case-insensitive which is why a-z matches upper case letters as well.
The RegEx cheat sheets I have say that matching is case-sensitive by default and that the i flag is used to make it case-insensitive.

Example:
https://www.rexegg.com/regex-modifiers.html

I searched StackOverflow but all of the questions were about how to make RegEx case-insensitive.

Perhaps the RegEx engine that Calibre uses is case-insensitive? But if so, is there a way to make it case-sensitive?


=== EDIT ===
This text has been quoted from the link above:

Quote:
For several engines, note that there are two ways of turning on case-insensitive matching: as an inline modifier (?i) or as an option in the regex method or function.

Inline Modifier (?i)
In .NET, PCRE (C, PHP, R…), Perl, Python, Java and Ruby (but not JavaScript), you can use the inline modifier (?i), for instance in (?i)cat. See the section on inline modifiers for juicy details about three additional features (unavailable in Python): turning it on in mid-string, turning it off with (?-i), or applying it only to the content of a non-capture group with (?i:foo)

.NET
Apart from the (?i) inline modifier, .NET languages have the IgnoreCase option. For instance, in C# you can use:

var catRegex = new Regex("cat", RegexOptions.IgnoreCase);

Perl
Apart from the (?i) inline modifier, Perl lets you add the i flag after your pattern's closing delimiter. For instance, you can use:

if ($the_subject =~ m/cat/i) { … }

PCRE (C, PHP, R…)
Note that in PCRE, to use case-insensitive matching with non-English letters that aren't part of your locale, you'll have to turn on Unicode mode—for instance with the (*UTF8) special start-of-pattern modifier.

Apart from the (?i) inline modifier, PCRE lets you set the PCRE_CASELESS mode when calling the pcre_compile() (or similar) function:

cat_regex = pcre_compile( "cat", PCRE_CASELESS,
&error, &erroroffset, NULL );
Perhaps the Calibre code calls the RegEx function with the case-insensitive option? If so, would it be possible to turn it off, please? :P

Last edited by BlackCanopus; 12-21-2018 at 05:11 AM.
Old 12-21-2018, 05:58 AM   #4
BetterRed
null operator (he/him)
Posts: 20,575
Karma: 26954694
Join Date: Mar 2012
Location: Sydney Australia
Device: none
Elsewhere in calibre, Title Case uppercases the first character of each word, while Capitalise only uppercases the first character of the first word.

For example, right-click a Series Name cell: Change Case->Title Case will change dog's breakfast to Dog's Breakfast, whereas Change Case->Capitalise will change it to Dog's breakfast.

It seems a bit odd that the Tag Mapper does not offer Title Case.
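
The difference between the two can be sketched in plain Python. This is only an illustration, not calibre's implementation; capitalise and title_case are hypothetical helpers. Python's built-in str.title() is avoided on purpose because it would turn dog's into Dog'S.

```python
def capitalise(text):
    """Uppercase the first character and lowercase everything after it."""
    return text.capitalize()


def title_case(text):
    """Uppercase the first character of each space-separated word.

    The rest of each word is left untouched, so apostrophes are safe.
    """
    return " ".join(word[:1].upper() + word[1:] for word in text.split(" "))


print(capitalise("dog's breakfast"))  # Dog's breakfast
print(title_case("dog's breakfast"))  # Dog's Breakfast
print(capitalise("John Milton"))      # John milton
```

The last line shows why Capitalise is a poor fit for a tag that is already in title case.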

BR
Old 12-21-2018, 07:02 AM   #5
kovidgoyal
creator of calibre
Posts: 43,860
Karma: 22666666
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
It is case-insensitive deliberately, i.e. the ignore-case flag is passed by default. IIRC you can turn it off by prefixing the flag with a minus, something like this:

(?-i:...)

see https://pypi.python.org/pypi/regex
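
A small sketch of what the scoped (?-i:...) group does, using Python's standard re module (3.6 or later) as a stand-in. It is an assumption that the regex module linked above accepts the same syntax, and whether Tag mapper ends up applying it the same way is not checked here. starts_lowercase is a hypothetical helper.

```python
import re

# Ignore-case is switched on for the whole pattern, as in the default described above,
# but the (?-i:...) group switches it off again for everything inside the group.
pattern = re.compile(r"(?-i:^[a-z].+?$)", re.IGNORECASE)


def starts_lowercase(tag):
    """Return True only if the tag begins with a lowercase ASCII letter."""
    return pattern.match(tag) is not None


for tag in ("this test", "John Milton"):
    print(tag, starts_lowercase(tag))
```

In plain Python this matches "this test" and rejects "John Milton", even though the ignore-case flag was passed at compile time.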
Old 12-21-2018, 08:14 AM   #6
BlackCanopus
Junior Member
Posts: 8
Karma: 10
Join Date: Jun 2018
Device: Android Phone, Samsung Galaxy Note 5, Moon + Reader Pro
Quote:
Originally Posted by BetterRed View Post
Elsewhere in calibre, Title Case uppercases the first character of each word, while Capitalise only uppercases the first character of the first word.

For example, right-click a Series Name cell: Change Case->Title Case will change dog's breakfast to Dog's Breakfast, whereas Change Case->Capitalise will change it to Dog's breakfast.

It seems a bit odd that the Tag Mapper does not offer Title Case.

BR
This (adding 'Title Case' to 'Tag mapper') is a good solution.

Quote:
It is case-insensitive deliberately, i.e. the ignore-case flag is passed by default. IIRC you can turn it off by prefixing the flag with a minus, something like this:

(?-i:...)

see https://pypi.python.org/pypi/regex
Sadly I couldn't find it, but what BetterRed said is better. Would you please add Title Case to Tag Mapper?
Old 12-21-2018, 08:53 AM   #7
kovidgoyal
creator of calibre
Posts: 43,860
Karma: 22666666
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
I don't really see the point: tags are not titles, so why would title case be appropriate for them? If you want to turn off case insensitivity, do it like I showed you:


Code:
(?-i:^[a-z].+?$)
Old 12-21-2018, 09:32 AM   #8
BlackCanopus
Junior Member
Posts: 8
Karma: 10
Join Date: Jun 2018
Device: Android Phone, Samsung Galaxy Note 5, Moon + Reader Pro
Quote:
Originally Posted by kovidgoyal View Post
I don't really see the point: tags are not titles, so why would title case be appropriate for them? If you want to turn off case insensitivity, do it like I showed you:


Code:
(?-i:^[a-z].+?$)
The code you gave doesn't work. But it doesn't matter anymore, I will find another solution. Thanks anyway.

As for the reason (for those who may be interested), names such as John Milton, James Joyce, etc. may be used as tags. Note that these are not author names, but tags for books about those authors. This is common in literary criticism, and I can imagine many other scenarios that use titles as tags. For example:

Code:
Title:	Novel Destinations, 2nd Edition
Authors:	Shannon McKenna Schmidt & Joni Rendon
Tags:	Charlotte Bronte, Emily Bronte, Ernest Hemingway, Fiction, Jack Kerouac, James Joyce, Jane Austen, Kafka, Literary Criticism & Collections, Literary Studies, Louisa May Alcott, Mark Twain, Novel, Virginia Woolf
Published:	May 2017
Publisher:	National Geographic Society
Old 12-21-2018, 10:24 AM   #9
theducks
Well trained by Cats
Posts: 29,809
Karma: 54830978
Join Date: Aug 2009
Location: The Central Coast of California
Device: Kobo Libra2,Kobo Aura2v1, K4NT(Fixed: New Bat.), Galaxy Tab A
Quote:
Originally Posted by kovidgoyal View Post
I don't really see the point: tags are not titles, so why would title case be appropriate for them?
I treat/use Tags as Keywords (in what coding I used to do, I used CamelCase for built-in functions).

Some folk only use lowercase, others the hated (shouting) uppercase.
There will always be the rule busters (LITRPG, SF...) that need TLC.
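
One way to give the rule busters their TLC is an explicit exception list checked before any case change. A rough sketch in Python; the KEEP_UPPER set and the normalise_tag helper are hypothetical, and nothing like this is claimed to exist in calibre.

```python
# Hypothetical list of tags that should stay fully uppercase.
KEEP_UPPER = {"LITRPG", "SF"}


def normalise_tag(tag):
    """Title-case a tag unless it is on the keep-uppercase list."""
    if tag.upper() in KEEP_UPPER:
        return tag.upper()
    # Uppercase the first character of each word and lowercase the rest.
    return " ".join(word[:1].upper() + word[1:].lower() for word in tag.split())


for tag in ("litrpg", "sf", "science fiction", "FANTASY"):
    print(tag, "->", normalise_tag(tag))
```

The exception list has to be maintained by hand, which is the price of handling acronyms at all.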
Old 12-21-2018, 02:26 PM   #10
BetterRed
null operator (he/him)
Posts: 20,575
Karma: 26954694
Join Date: Mar 2012
Location: Sydney Australia
Device: none
Another case for using title case on Tags is place names: New York, West Australia, Tunbridge Wells, etc. Novels are often just as much about places as they are about people.

BR
Tags
regex, tag mapper

