#1 |
Chief Bohemian Misfit
Posts: 571
Karma: 462964
Join Date: May 2013
Device: iPad, ADE
|
Need RegEx help (if it's even at all possible to do!)
Ugh! I don't know the first thing about RegEx, and although it's definitely something I'd love to learn, right now my problem is a little more immediate -- not to mention probably rather complicated (RegEx-wise) -- and I'm really hoping someone can help me out.
I'm working on a book with a few plays of Shakespeare, and have them all nicely coded and stuff. Shakespeare is well known for writing in verse, of course, but many of the characters' lines are also in prose. In that regard, I have two ways of coding up those lines, depending on whether it's in verse or in prose. I think you should probably be able to see what I've done here, without getting into all the CSS and everything (the only real difference has to do with what happens if/when the line wraps, whether it gets indented or not). Here's an example of a bit of verse...
Code:
<p class="speaker">1 Witch</p>
<p class="verse">When shall we three meet againe?</p>
<p class="verse">In Thunder, Lightning, or in Raine?</p>
Code:
<p class="speaker">Hamlet</p>
<p class="prose">Get thee to a Nunnerie. Why would’st thou be a breeder of Sinners? I am my selfe indifferent honest, but yet I could accuse me of such things, that it were better my Mother had not borne me. I am very prowd, reuengefull, Ambitious, with more offences at my becke, then I haue thoughts to put them in imagination, to giue them shape, or time to acte them in. What should such Fellowes as I do, crawling betweene Heauen and Earth. We are arrant Knaues all, beleeue none of vs. Goe thy wayes to a Nunnery. Where’s your Father?</p>
Does that make sense, what I'm doing, and why I want to do it? So basically, to use that first example of some verse, it would then look like this...
Code:
<div class="pageavoid">
<p class="speaker">1 Witch</p>
<p class="verse">When ſhall we three meet againe?</p>
</div>
<p class="verse">In Thunder, Lightning, or in Raine?</p>
The problem is that I've got almost 3000 of those "speaker" lines -- and each one I would have to actually look at and see if the next line is verse or prose, and either way I then have to add (or not) the "pageavoid" div. Is there ANY way to do a RegEx search & replace that could do this for me automagically (and save me several days of utter, unbearable drudgery)?
I hope I've explained the problem okay, and what it is that I'm trying to accomplish -- and thanks SO much in advance, if this is possible!
Last edited by Psymon; 06-25-2016 at 02:32 AM.
#2 | |
Connoisseur
Posts: 89
Karma: 185923
Join Date: May 2015
Device: iPad 1/2/Air, K3/PW2/Fire1, Kobo Touch, Samsung Tab, Nook Color/Touch
|
Quote:
|
#3 |
Chief Bohemian Misfit
|
|
#4 |
Grand Sorcerer
Posts: 24,905
Karma: 47303824
Join Date: Jul 2011
Location: Sydney, Australia
Device: Kobo:Touch,Glo, AuraH2O, GloHD,AuraONE, ClaraHD, Libra H2O; tolinoepos
|
You should be able to do this in the calibre editor. I'm not sure about Sigil or other text editors.
So, in the calibre editor... Start by tidying/reformatting the code, which makes sure the layout is the same throughout the book. Put the search/replace into regex mode. Then select a sample of what needs to be changed and press CTRL-F. This takes the selected text and puts it into the search field, escaping any characters in the selection that need to be escaped and including the line feeds that are in the selected text. Then change both the speaker's name and the spoken text to ".*" (without the quotes) and add open and close parentheses around the whole text. You will end up with something like the following, but on only one line in the search field:
Code:
(<p class="speaker">.*</p> <p class="verse">.*</p>)
Replace:
Code:
<div class="pageavoid">\1</div>
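For anyone who wants to try the single-group idea outside an editor, here is a minimal Python sketch of it. It is not from the thread: the sample text, the function name `wrap_first_verse`, and the use of `\s*` in place of the literal line feeds the editor would insert are all assumptions made for illustration.

```python
import re

# Hypothetical sample: one speech in verse, one in prose,
# with each paragraph on its own line.
SAMPLE = (
    '<p class="speaker">1 Witch</p>\n'
    '<p class="verse">When shall we three meet againe?</p>\n'
    '<p class="verse">In Thunder, Lightning, or in Raine?</p>\n'
    '<p class="speaker">Hamlet</p>\n'
    '<p class="prose">Get thee to a Nunnerie.</p>\n'
)

# One capture group around the speaker line plus the first verse line.
# \s* stands in for whatever whitespace separates the two paragraphs
# (an assumption; the editor would put the real line feeds here).
PATTERN = re.compile(r'(<p class="speaker">.*</p>\s*<p class="verse">.*</p>)')


def wrap_first_verse(html):
    """Wrap each speaker line and the verse line right after it in a div."""
    return PATTERN.sub(r'<div class="pageavoid">\1</div>', html)


if __name__ == "__main__":
    print(wrap_first_verse(SAMPLE))
```

The greedy `.*` is only safe here because `.` does not match a line feed by default, so each `.*` stays within its own line. If several paragraphs shared one line, it would swallow more than intended.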
#5 |
Chief Bohemian Misfit
|
Oh, awesome! At least, if it worked -- which it doesn't. I get a message saying "No replace function with the name: <div class="pageavoid">\1</div> exists." I think I can sorta see what you were getting at, though, and how that might work (if it did), even without knowing RegEx at all. Perhaps I'm just missing something in there?
EDIT: I tried it just now in Sigil, too (which also has a RegEx search & replace), thinking that maybe it might work differently somehow -- and, hopefully, that this would work! -- but it didn't. No error message in there, though, it just says "No replacements made." :/
Last edited by Psymon; 06-25-2016 at 03:01 AM.
#6 |
Interested in the matter
Posts: 421
Karma: 426094
Join Date: Dec 2011
Location: Spain, south coast
Device: Pocketbook InkPad 3
|
Search:
Code:
\s\s<p class="speaker">(.+?)</p>\n\n\s\s<p class="verse">(.+?)</p>\n
Replace:
Code:
<div class="pageavoid">\n\n <p class="speaker">\1</p>\n\n <p class="verse">\2</p>\n\n </div>\n
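As a rough way to check this search/replace before running it over a whole book, the same two-group pattern can be tried in Python. This is only a sketch under assumptions: the sample text, the name `wrap_speeches`, and the exact indentation inside the replacement string are mine, and it presumes the tidied layout the pattern expects (two-space indent, one blank line between paragraphs).

```python
import re
from typing import Tuple

# Hypothetical sample in the assumed tidied layout.
SAMPLE = (
    '  <p class="speaker">1 Witch</p>\n\n'
    '  <p class="verse">When shall we three meet againe?</p>\n\n'
    '  <p class="verse">In Thunder, Lightning, or in Raine?</p>\n\n'
    '  <p class="speaker">Hamlet</p>\n\n'
    '  <p class="prose">Get thee to a Nunnerie.</p>\n\n'
    '  <p class="speaker">3 Witch</p>\n\n'
    '  <p class="verse">That will be ere the set of Sunne.</p>\n\n'
)

# The search pattern as posted: group 1 is the speaker, group 2 the verse line.
SEARCH = r'\s\s<p class="speaker">(.+?)</p>\n\n\s\s<p class="verse">(.+?)</p>\n'

# Replacement with \1 and \2 put back inside the wrapper div.
# The indentation here is an assumption, not taken from the thread.
REPLACE = (
    r'<div class="pageavoid">\n\n'
    r'  <p class="speaker">\1</p>\n\n'
    r'  <p class="verse">\2</p>\n\n'
    r'</div>\n'
)


def wrap_speeches(html: str) -> Tuple[str, int]:
    """Return the rewritten text and the number of replacements made."""
    return re.subn(SEARCH, REPLACE, html)


if __name__ == "__main__":
    new_html, count = wrap_speeches(SAMPLE)
    print(count, "changes made")
    print(new_html)
```

Because `.+?` cannot cross a line feed, a speaker followed by a prose paragraph never matches, so only verse speeches get wrapped. The replacement count from `re.subn` is a quick sanity check against the number of verse speeches expected.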
#7 | |
Chief Bohemian Misfit
|
Quote:
Omigod, thank you SO SO SO very much!!! 2060 changes made -- all perfectly (as far as I can see) -- in the blink of an eye! I can't tell you how incredibly grateful I am! If you're ever in Ottawa (Canada), drop me a line and I'll buy you a few beers!!!
|
#8 |
Chief Bohemian Misfit
|
PS. @davidfor... You get a beer, too, for taking a shot at it!
#9 |
Interested in the matter
|
|
#10 | |
Wizard
Posts: 3,821
Karma: 19162882
Join Date: Nov 2012
Location: Te Riu-a-Māui
Device: Kobo Glo
|
Quote:
Code:
p.speaker + p.verse {
  page-break-before: avoid;
  page-break-inside: avoid;
}
|
#11 | |
Chief Bohemian Misfit
|
Quote:
Oh, that's interesting -- something like that never even occurred to me. Would that wrap the page-break:avoid around both items, together, though? Or would it add it to each individually/separately? If the latter, then it wouldn't accomplish what I'm trying to, of course. |
|
#12 | |
Wizard
|
Quote:
p.speaker+p.verse selects any p.verse that immediately follows a p.speaker (that is, the first verse line of a speech). So page-break-before:avoid; prevents a break between p.speaker and p.verse, and page-break-inside:avoid; prevents a break inside the p.verse immediately following the p.speaker. Unlike with the wrapper div, though, it is still possible to break inside the p.speaker, so you would also need to add:
Code:
p.speaker {
  page-break-inside: avoid;
}
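To see which paragraphs a selector like `p.speaker + p.verse` would pick out, here is a small Python sketch that mimics the adjacent-sibling rule by hand. It is an illustration only: the fragment, the `<body>` wrapper, and the function name `adjacent_verse` are assumptions, and it compares the class attribute by exact string match, which is simpler than real CSS class matching.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment, wrapped in a root element so it parses as XML.
FRAGMENT = """<body>
<p class="speaker">1 Witch</p>
<p class="verse">When shall we three meet againe?</p>
<p class="verse">In Thunder, Lightning, or in Raine?</p>
<p class="speaker">Hamlet</p>
<p class="prose">Get thee to a Nunnerie.</p>
</body>"""


def adjacent_verse(fragment):
    """Return the text of each p.verse whose previous sibling is a p.speaker."""
    children = list(ET.fromstring(fragment))
    matches = []
    # Walk consecutive sibling pairs, as the "+" combinator does.
    for prev, cur in zip(children, children[1:]):
        prev_is_speaker = prev.tag == "p" and prev.get("class") == "speaker"
        cur_is_verse = cur.tag == "p" and cur.get("class") == "verse"
        if prev_is_speaker and cur_is_verse:
            matches.append(cur.text)
    return matches


if __name__ == "__main__":
    print(adjacent_verse(FRAGMENT))
```

Only the first verse line after the speaker is returned; the second verse line and the prose paragraph are left alone. That mirrors how the rule applies to one element at a time rather than wrapping the pair together.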
|
#13 | |
Chief Bohemian Misfit
|
Quote:
Thanks for the clarification on that, Geoff!
Back to the whole RegEx thing, can anyone recommend a good tutorial "for dummies" on that subject? I'm not asking anyone to do my googling for me, because I did do that even before posting my question here to the forum, but it seemed like I just couldn't wrap my head around the tutorials I found -- it was all just too "programmy" (so to speak) for me.
This was something about myself that I discovered when I first got into web design back in the mid-1990s, too. I had no problem with HTML and stuff, and CSS as well when that eventually came around (I'm old!)
But it would be really handy to be able to understand RegEx, for sure! I just couldn't find any that were "easy," for semi-stupid people like myself.
Any recommendations on any tutorial sites out there?
|
#14 |
Interested in the matter
|
http://www.regular-expressions.info/tutorialcnt.html
This tutorial is the best I know. I think that to solve 90% of the problems that arise in our hobby, we only need to know the material up to the Grouping chapter. Good luck!
#15 | |
Chief Bohemian Misfit
|
Quote:
I don't know why I have such a hard time with "programmy"-type stuff like this. I do just fine with HTML/CSS, but it's as though something short-circuits in my brain and I become a complete moron with this kind of thing. :/ But I'll try my best to learn it -- no promises that I won't get stuck again and be back here for more help down the road, though! Forgive me... in advance... if that happens.
Thanks again to everyone for all the help!
|