#1 |
Wizard
Posts: 1,087
Karma: 447222
Join Date: Jan 2009
Location: Valley Forge, PA, USA
Device: Kindle Paperwhite
|
RegEx Question: H1 ALL CAPS to All Caps
I've been reformatting a number of books whose chapter titles are in all caps.
Sometimes there are other headings in all caps too, like a section. I got it to the point where the tags are pretty consistent, but making text title case in BV is a lot of work.
Is there a good F-R that will convert things like <h1>CHAPTER ONE</h1> into <h1>Chapter One</h1>?
I can use it for other tags, but I'm stuck on the title case part.
There is a TitleCase stored clip in Sigil, but I couldn't seem to get it to work right: \u\1
Thanks |
#2 | |
Grand Sorcerer
Posts: 5,680
Karma: 23983815
Join Date: Dec 2010
Device: Kindle PW2
|
Quote:
Replace: <h1>Chapter \1\L\2\E</h1>
For more examples see this older post of mine. |
|
#3 |
Wizard
Posts: 2,306
Karma: 13057279
Join Date: Jul 2012
Device: Kobo Forma, Nook
|
I actually use this one, and it suits me well. It fails on accented characters, though, so it could be more robust (I just haven't gotten around to updating it). Also, it doesn't work specifically on <h1>; it works on every word that is in ALL CAPS.
Search: (\b)([A-Z])([A-Z]+)
Replace: \1\2\L\3
It is really helpful when I want to change the text that appears in the Sigil auto-generated TOC, but want to leave the displayed text as ALL CAPS.
Blue, (\b): \b is called a "word boundary". For more info on this, see: http://www.regular-expressions.info/wordboundaries.html
Green, ([A-Z]): This captures the first capital letter of the word. It makes sure it stays capital in the replace.
Red, ([A-Z]+): This captures all the other capital letters, and replaces them with lowercase.
(NEVER Replace All while using this; always replace one by one.)
Example, before:
Code:
<h1 title="1. THIS IS CHAPTER ONE">THIS IS CHAPTER ONE</h1>
After:
Code:
<h1 title="1. This Is Chapter One">THIS IS CHAPTER ONE</h1>
http://grammar.about.com/od/grammarf...italstitle.htm |
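For anyone who wants to try the search/replace above outside Sigil, here is a minimal Python sketch. Python's `re` module has no `\L` in replacement strings, so the lowercasing is done in a callback. The function name `title_case_caps` is my own, `count=4` merely stands in for replacing one match at a time, and `re.ASCII` is only an assumption used to approximate an engine whose `\b` is not Unicode-aware.

```python
import re

# Same pattern as the Search box above: boundary, first capital, remaining capitals.
CAPS_WORD = re.compile(r"(\b)([A-Z])([A-Z]+)")
# Assumption: re.ASCII approximates an engine whose \b ignores Unicode letters.
CAPS_WORD_ASCII = re.compile(r"(\b)([A-Z])([A-Z]+)", re.ASCII)


def title_case_caps(text, count=0, pattern=CAPS_WORD):
    """Emulate Replace: \\1\\2\\L\\3 (count=0 means replace every match)."""
    def repl(m):
        return m.group(1) + m.group(2) + m.group(3).lower()
    return pattern.sub(repl, text, count=count)


before = '<h1 title="1. THIS IS CHAPTER ONE">THIS IS CHAPTER ONE</h1>'
# Only the four words inside the title attribute are converted.
after = title_case_caps(before, count=4)
print(after)

# [A-Z] does not cover accented capitals, so this word is left alone...
print(title_case_caps("ÉCOLE"))
# ...or mangled when the boundary logic treats É as a non-word character.
print(title_case_caps("ÉCOLE", pattern=CAPS_WORD_ASCII))
```

The last two calls show why the accented-character limitation matters: depending on how the engine defines a word boundary, an accented heading is either skipped or converted from its second letter onward.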
#4 |
Wizard
Posts: 1,087
Karma: 447222
Join Date: Jan 2009
Location: Valley Forge, PA, USA
Device: Kindle Paperwhite
|
@Tex2002ans -- thanks, I tried something similar, but the document had a lot of all-caps acronyms that were being found. I tried 'Replace/Find Next', but that was taking a lot of time.
@Doitsu -- that looks similar to something I Googled (which didn't work), except for the :upper:'s. The Google result used \u, which didn't seem to work after I bracketed it with the H1 tags. Is there a difference?
I'll try your suggestion when I get home.
Paul |
#5 |
Grand Sorcerer
Posts: 28,358
Karma: 203720150
Join Date: Jan 2010
Device: Nexus 7, Kindle Fire HD
|
\u and \l only affect one character.
\U and \L keep changing the case of characters until \E is encountered.
@Tex2002ans: The \b word boundary is handy, but it can also result in some confusing behavior if you run into any typographic quotes, apostrophes, or other Unicode characters contained in the text. In ebooks, I always try to use the Unicode switch to force \b to accommodate such things.
Last edited by DiapDealer; 02-05-2014 at 09:11 AM. |
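The difference between the one-character and the span operators can be sketched in Python. Python's `re` has no `\u`, `\L` or `\E` in replacement strings, so the hypothetical helpers below (`u_one`, `lower_span`) only imitate what those operators do to a piece of matched text.

```python
import re


def u_one(s):
    """Imitate \\u: uppercase only the first character, leave the rest alone."""
    return s[:1].upper() + s[1:]


def lower_span(s):
    """Imitate \\L...\\E: lowercase every character in the span."""
    return s.lower()


word = re.compile(r"[A-Za-z]+")

# \u alone cannot fix ALL CAPS text: the first letter is already uppercase
# and nothing lowercases the letters after it.
only_u = word.sub(lambda m: u_one(m.group(0)), "CHAPTER ONE")

# Keeping the first letter and lowercasing the rest gives title case.
u_then_l = word.sub(
    lambda m: m.group(0)[:1] + lower_span(m.group(0)[1:]), "CHAPTER ONE"
)

print(only_u)    # CHAPTER ONE
print(u_then_l)  # Chapter One
```

This is also why a clip built only on `\u\1` appears to do nothing on text that is already in all caps.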
#6 |
Grand Sorcerer
Posts: 13,306
Karma: 78876004
Join Date: Nov 2007
Location: Toronto
Device: Libra H2O, Libra Colour
|
My minor suggestion would be
Code:
Find: (<h[1-9]>)([A-Z])([A-Z ]+)
Replace: \1\2\L\3
Additional punctuation symbols could be added by adding them to the [A-Z ] character class, e.g. [A-Z ,;] (no | separators are needed inside the brackets). |
#7 | |
Wizard
Posts: 1,087
Karma: 447222
Join Date: Jan 2009
Location: Valley Forge, PA, USA
Device: Kindle Paperwhite
|
@Doitsu --
Quote:
However, when I followed the suggestion, it didn't quite accomplish what I was hoping for, since there could be anything between the H1 (or other) tags, not literally "CHAPTER".
What I think I was looking for was a more general-purpose F-R that would take any text between tags and Title Case it. The <h1>CHAPTER ONE</h1> was really just a (poor) example.
A general example might be <tag>TEXT TEXT TEXT</tag> becoming <tag>Text Text Text</tag>, since some of the H#'s are for "PART 2" and "CHAPTER 3" and sometimes just "ONE", "TWO", ...
In the best of all possible worlds, I'd have 2 or 3 stored clips to title case H1, H2, and H3. When I had an oddball set of tags, I'd do that F&R manually, but I'd be smarter.
Thanks |
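One way to read the request above is: match a whole heading element, then title-case only the text inside it. A minimal Python sketch of that idea follows. The function name `title_case_headings`, the default tag list, and the assumption that each heading holds plain text with no attributes or nested tags are mine, not something from the thread.

```python
import re


def title_case_headings(html, tags=("h1", "h2", "h3")):
    """Title-case the plain text inside <tag>...</tag> for the given tags."""
    tag_alt = "|".join(re.escape(t) for t in tags)
    # Group 1: tag name, group 2: text with no nested markup (assumption).
    element = re.compile(r"<(%s)>([^<]+)</\1>" % tag_alt)
    caps_word = re.compile(r"([A-Z])([A-Z]+)")

    def fix_element(m):
        inner = caps_word.sub(
            lambda w: w.group(1) + w.group(2).lower(), m.group(2)
        )
        return "<%s>%s</%s>" % (m.group(1), inner, m.group(1))

    return element.sub(fix_element, html)


sample = "<h1>PART 2</h1><h2>CHAPTER 3</h2><p>NASA stays as it is.</p>"
print(title_case_headings(sample))
```

Because the match is anchored to the heading tags, acronyms in body paragraphs are left alone; acronyms inside a heading would still be converted and need a manual touch-up.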
|
#8 |
Wizard
Posts: 1,087
Karma: 447222
Join Date: Jan 2009
Location: Valley Forge, PA, USA
Device: Kindle Paperwhite
|
@PeterT -- thanks, sort of close
I tried it with Minimal Match on and off just to see.
Code:
Original
<h1>AAAAAAAAAAAAAAAAAA BBBBBBBBBBBBBBBBBBBB BBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB</h1>
Minimal match unchecked
<h1>Aaaaaaaaaaaaaaaaaa bbbbbbbbbbbbbbbbbbbb bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb</h1>
Minimal match checked
<h1>AaAAAAAAAAAAAAAAAA BBBBBBBBBBBBBBBBBBBB BBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB</h1>
Do you or any RegEx gurus have some tweaks?
Hoped-for result
<h1>Aaaaaaaaaaa Bbbbbbbbbbbb Cccccccccccc</h1>
Thanks again |
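Both results can be reproduced outside Sigil, which helps show what is going on. In the Python sketch below (a callback takes the place of `\L`, which Python's `re` lacks, and all names are illustrative), the greedy third group swallows every remaining letter and space, so everything after the first letter is lowercased, while the lazy version stops after a single character. I am assuming that Minimal Match behaves like making the quantifier lazy.

```python
import re

GREEDY = re.compile(r"(<h[1-9]>)([A-Z])([A-Z ]+)")
LAZY = re.compile(r"(<h[1-9]>)([A-Z])([A-Z ]+?)")  # assumed Minimal Match


def run_replace(pattern, text):
    """Emulate Replace: \\1\\2\\L\\3 with a callback."""
    return pattern.sub(
        lambda m: m.group(1) + m.group(2) + m.group(3).lower(), text
    )


heading = "<h1>CHAPTER ONE</h1>"
print(run_replace(GREEDY, heading))  # <h1>Chapter one</h1>
print(run_replace(LAZY, heading))    # <h1>ChAPTER ONE</h1>
```

Getting a capital on every word needs a pattern that matches word by word, so that the "keep the first letter" group is captured once per word rather than once per heading.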
#9 |
Grand Sorcerer
Posts: 5,680
Karma: 23983815
Join Date: Dec 2010
Device: Kindle PW2
|
|
#10 |
Wizard
Posts: 3,720
Karma: 1759970
Join Date: Sep 2010
Device: none
|
If you compromise a little, then:
1. With a single pass you can change everything to lower case.
2. With a 2nd pass you can easily capitalise the 1st letter of the 1st word, so you've gone from CHAPTER ONE to Chapter one.
3. A 3rd (harder to write) pass could then probably find and capitalise the 2nd word.
It comes down to how much time you want to invest in fancy coding compared with just manually retyping the offending titles. |
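The passes described above can be sketched in Python as follows. This is only an illustration under my own assumptions (plain-text `<h1>` headings, hypothetical function names); the third pass is written as a simple capitalise-every-later-word step.

```python
import re

HEADING = re.compile(r"(<h1>)([^<]*)(</h1>)")  # assumes plain-text <h1> content


def pass_lower(html):
    """Pass 1: lowercase everything inside the heading."""
    return HEADING.sub(
        lambda m: m.group(1) + m.group(2).lower() + m.group(3), html
    )


def pass_first_word(html):
    """Pass 2: capitalise the first letter of the first word."""
    return HEADING.sub(
        lambda m: m.group(1) + m.group(2)[:1].upper() + m.group(2)[1:] + m.group(3),
        html,
    )


def pass_other_words(html):
    """Pass 3: capitalise the first letter of each later word."""
    def fix(m):
        text = re.sub(r"(?<=\s)[a-z]", lambda c: c.group(0).upper(), m.group(2))
        return m.group(1) + text + m.group(3)
    return HEADING.sub(fix, html)


step1 = pass_lower("<h1>CHAPTER ONE</h1>")
step2 = pass_first_word(step1)
step3 = pass_other_words(step2)
print(step1, step2, step3, sep="\n")
```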
#11 | |
Grand Sorcerer
Posts: 5,680
Karma: 23983815
Join Date: Dec 2010
Device: Kindle PW2
|
Quote:
Find: ([[:upper:]])([[:upper:]]+\s*)
Replace: \1\L\2\E
This will find an uppercase letter followed by one or more uppercase letters and zero or more white-space characters anywhere in the text, and it can be used to convert several uppercase words in a row. |
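For comparison, here is a rough Python equivalent of that Find/Replace pair. Python's `re` supports neither POSIX classes such as `[[:upper:]]` nor `\L...\E`, so this sketch swaps in `[A-Z]` (ASCII only, an assumption that drops accented capitals) and a callback; `caps_run_to_title` is a hypothetical name.

```python
import re

# [A-Z] stands in for [[:upper:]]; \s* lets one pass walk through several words.
CAPS_RUN = re.compile(r"([A-Z])([A-Z]+\s*)")


def caps_run_to_title(text):
    """Emulate Replace: \\1\\L\\2\\E."""
    return CAPS_RUN.sub(lambda m: m.group(1) + m.group(2).lower(), text)


print(caps_run_to_title("<h1>THIS IS CHAPTER ONE</h1>"))
# Acronyms elsewhere in the text are matched too, so review before replacing all.
print(caps_run_to_title("<p>The NASA report</p>"))
```

Since the pattern is not tied to any tag, it converts all-caps words wherever they occur, which is convenient for headings but also catches acronyms in running text.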
|
#12 | |
Wizard
Posts: 3,720
Karma: 1759970
Join Date: Sep 2010
Device: none
|
Quote:
but for fewer than 20 chapter headers, I could probably edit them manually faster than I could write and debug all of the above code |
|
#13 | |
Well trained by Cats
Posts: 30,909
Karma: 60358908
Join Date: Aug 2009
Location: The Central Coast of California
Device: Kobo Libra2,Kobo Aura2v1, K4NT(Fixed: New Bat.), Galaxy Tab A
|
Quote:
Search
Look (and maybe replace or edit)
Search ...
If it can't be done with a Quick and Dirty |
|
#14 |
Wizard
Posts: 1,087
Karma: 447222
Join Date: Jan 2009
Location: Valley Forge, PA, USA
Device: Kindle Paperwhite
|
@all -- got some really good ideas.
I added a little CSS (this is only a fragment from my test book):
Code:
h1,h2,h3,h4,h5 { text-transform:capitalize; }
Find ![]()
Replace: \1\L\2>
The 'tweaking' is making acronyms all upper, but that is a lot less effort.
Still very open to ideas and suggestions. |
#15 | ||
Wizard
Posts: 2,306
Karma: 13057279
Join Date: Jul 2012
Device: Kobo Forma, Nook
|
Quote:
I have a toc.ncx already generated, and/or I have Preview open on the left half of my screen. I quickly jump to where I need to be using TOC/Preview, and then Find/Replace what I need in Code View.
I don't use this Find/Replace throughout the entire book (unless this particular book has some odd ALL CAPS floating around that need to be checked/fixed).
It isn't the quickest/most automated way to do it, but while I am fixing this I am also quickly scanning around to see if I can catch any other typos/errors.
Quote:
Last edited by Tex2002ans; 02-05-2014 at 07:19 PM. |
||