#1 |
Evangelist
Posts: 479
Karma: 41524
Join Date: Sep 2011
Device: Kobo Libra 2 & Clara BW
|
Replace <br/> in <h1>...<h1/>
I'd like to replace <br/>s with a space when they're between opening and closing h1 tags. However, something simple such as
Code:
<h1>(.*?)<br/>(.*?)<h1/>
<h1>\1 \2<h1/>
How to?
Last edited by foosion; 12-07-2024 at 11:10 AM. Reason: Should have posted <br/> not <br>
#2 |
Guru
Posts: 776
Karma: 1538394
Join Date: Sep 2013
Device: Kobo Forma
|
Odd. In my Calibre editor that works just fine.
A couple of thoughts, though (could just be typos). First, you should close your <br> tags: <br/>, not just <br>. Second, are you sure all your <h1></h1> tags will be on the same line? As written, your search string won't find multi-line <h1>s. And, third (possibly related to the problem you're seeing), do you have the "Dot All" box checked at the bottom of the editor screen? With it unchecked, the search works as you want for single-line <h1>s. With it checked, it'll pick up multi-line <h1>s, but it will also start matching at <h1>s without a <br/> and stretch the selection to the next </h1>.
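The Dot All behaviour described above can be reproduced outside the editor. Here is a minimal sketch using Python's standard re module, on the assumption that it treats this pattern the same way the editor's regex engine does. The sample markup is invented for illustration, and the closing tag is written as </h1>.

```python
import re

# Hypothetical sample: a one-line heading without <br/>, then a multi-line one with it.
html = (
    "<h1>Plain heading</h1>\n"
    "<p>Body text</p>\n"
    "<h1>Chapter\n<br/>One</h1>\n"
)

pattern = r"<h1>(.*?)<br/>(.*?)</h1>"

# Dot All off: "." stops at newlines, so the multi-line heading is not found.
without_dotall = re.search(pattern, html)

# Dot All on: "." crosses newlines, so the match starts at the first <h1>
# and runs over the body text until it reaches the first <br/>.
with_dotall = re.search(pattern, html, flags=re.DOTALL)

print(without_dotall)
print(with_dotall.group(0))
```

The lazy quantifier only keeps each group as short as possible once a starting point is fixed; it does not stop the match from beginning at an earlier <h1>.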
![]() |
#3 |
Evangelist
Posts: 479
Karma: 41524
Join Date: Sep 2011
Device: Kobo Libra 2 & Clara BW
|
<br> was a typo - I meant <br/>
The code I posted finds all of my <h1>s; the problem is that it also finds large areas of text not within <h1>...<h1/> if there isn't a <br/> inside. I do have Dot All checked, otherwise it doesn't find the <h1>...<h1/>s; none are single line. So the question is how to find just the <br/>s within <h1>...<h1/> with Dot All checked. IOW, when there are <h1>s without a <br/>, how do I avoid stretching the selection to the next </h1>?
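One possible way to keep Dot All on without running past the end of a heading is to forbid </h1> inside the captured groups with a negative lookahead. The sketch below uses Python's re module; the same syntax should carry over to the editor, but that is an assumption to verify there. The sample markup and the name INSIDE_H1 are made up for illustration.

```python
import re

# Each "." may only match when it is not the start of "</h1>",
# so neither group can run past the end of the heading it started in.
INSIDE_H1 = r"(?:(?!</h1>).)*?"
pattern = re.compile(
    r"<h1>(" + INSIDE_H1 + r")\s*<br\s*/?>\s*(" + INSIDE_H1 + r")</h1>",
    flags=re.DOTALL,
)

html = (
    "<h1>Plain heading</h1>\n"
    "<p>Body text</p>\n"
    "<h1>Chapter\n<br/>One</h1>\n"
)

# Only the heading that really contains a break is rewritten.
result = pattern.sub(r"<h1>\1 \2</h1>", html)
print(result)
```

Each pass replaces one break per heading, so a heading with several breaks would need the replace run again until nothing more is found.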
#4 |
Well trained by Cats
Posts: 31,047
Karma: 60358908
Join Date: Aug 2009
Location: The Central Coast of California
Device: Kobo Libra2,Kobo Aura2v1, K4NT(Fixed: New Bat.), Galaxy Tab A
|
<h1/> is also a typo; h1 is not a self-closing tag.
Code:
<h1>(.*?)\s*<br />\s*(.*?)</h1>
FWIW I usually do the reverse. I insert a trailing space after \1.
Code:
\1 <br />\2
#5 |
Evangelist
Posts: 479
Karma: 41524
Join Date: Sep 2011
Device: Kobo Libra 2 & Clara BW
|
#6 |
Well trained by Cats
Posts: 31,047
Karma: 60358908
Join Date: Aug 2009
Location: The Central Coast of California
Device: Kobo Libra2,Kobo Aura2v1, K4NT(Fixed: New Bat.), Galaxy Tab A
|
Did you have Dot All checked? With it on, . also matches newlines, so the match can stretch further than you want.
#7 |
Still reading
Posts: 14,016
Karma: 105092227
Join Date: Jun 2017
Location: Ireland
Device: All 4 Kinds: epub eink, Kindle, android eink, NxtPaper
|
Sometimes a heading tag holds a multi-line heading, where a separate heading tag for each line would result in an erroneous auto-generated TOC.
Also, a <br /> is often used for a heading that is logically two lines, or one that is too long for a single line and would look strange with the word wrap at an arbitrary position in a big font or on a smaller screen. So while it's perfectly possible to find only the line breaks embedded in a heading and put a space there instead, it may not be what is really needed. Though, in an ebook, I'd only ever have <br /> in a heading, if anywhere, as extra space elsewhere is better done with paragraph CSS.
#8 |
Well trained by Cats
Posts: 31,047
Karma: 60358908
Join Date: Aug 2009
Location: The Central Coast of California
Device: Kobo Libra2,Kobo Aura2v1, K4NT(Fixed: New Bat.), Galaxy Tab A
|
I also do it as an attempt at (anti-ugly) control:
Chapter One Hundred Twenty-Seven
I would rather have, than an arbitrary break because it will not fit on the line.
Chapter
One Hundred Twenty-Seven
#9 |
Guru
Posts: 776
Karma: 1538394
Join Date: Sep 2013
Device: Kobo Forma
|
I, too, am always (well, often) adding <br/>s to my chapter headings just to make sure they fit decently on the page. And, sorry I forgot to mention the ending </h1> tag. I saw it, but forgot.
#10 | |
Evangelist
Posts: 479
Karma: 41524
Join Date: Sep 2011
Device: Kobo Libra 2 & Clara BW
|
Quote:
I'm starting to think I need a function that first finds <h1>(.*?)</h1>, then searches \1 for <br/> and does the replace. I'm just not sure how to do this. |
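The two-step idea above (find each heading first, then replace breaks only inside it) can be sketched in plain Python with a callback passed to re.sub. This is an illustration under assumptions, not the editor's own mechanism: the names fix_headings and join_heading_lines are hypothetical, and the sample markup is invented.

```python
import re

H1 = re.compile(r"<h1>(.*?)</h1>", flags=re.DOTALL)
BR = re.compile(r"\s*<br\s*/?>\s*")

def join_heading_lines(match):
    # Step 2: within one heading's content, turn every break into a space.
    return "<h1>" + BR.sub(" ", match.group(1)) + "</h1>"

def fix_headings(html):
    # Step 1: visit each <h1>...</h1> pair; text outside headings is untouched.
    return H1.sub(join_heading_lines, html)

sample = "<h1>Part\n<br/>One<br />Two</h1>\n<p>Line<br/>break</p>\n"
print(fix_headings(sample))
```

Because the outer pattern always ends at the nearest </h1>, a heading without a break simply comes back unchanged instead of dragging in the text that follows it.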
#11 |
Evangelist
Posts: 479
Karma: 41524
Join Date: Sep 2011
Device: Kobo Libra 2 & Clara BW
|
#12 |
Well trained by Cats
Posts: 31,047
Karma: 60358908
Join Date: Aug 2009
Location: The Central Coast of California
Device: Kobo Libra2,Kobo Aura2v1, K4NT(Fixed: New Bat.), Galaxy Tab A
|
I do this kind of search all the time (e.g. to add a <span> around \2 to style it a bit differently).
There is something hidden in the text that is throwing the find off. One thing I found can help: Beautify all HTML. That gets rid of stray spaces (or use \s*).
#13 |
Evangelist
Posts: 479
Karma: 41524
Join Date: Sep 2011
Device: Kobo Libra 2 & Clara BW
|
This seems to work as a regex function:
Code:
def replace(match, number, file_name, metadata, dictionaries, data, functions, *args, **kwargs):
    # str.replace() already replaces every occurrence, so a single call is enough
    return '<h1>' + match.group(1).replace('<br/>', ' ') + '</h1>'
I used Create TOC from headings and it's added ids throughout, including blank ones, e.g.:
Code:
<h1 id="toc_21">Security measures adopted by Atlantis/Shanghai.</h1><h1 id="toc_22"></h1>
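Once headings carry attributes such as id="toc_21", a search for a bare <h1> no longer matches them. A possible variant of the function, sketched here as an assumption rather than a tested calibre recipe, captures the opening tag as its own group and puts it back unchanged. The search expression SEARCH is hypothetical and would go in the Find box with Dot All checked; the lines after the function only exercise it outside calibre.

```python
import re

# Assumed search expression for the Find box (Dot All checked).
SEARCH = r"(<h1[^>]*>)(.*?)(</h1>)"

def replace(match, number, file_name, metadata, dictionaries, data, functions, *args, **kwargs):
    # Group 1 is the opening tag with any attributes, group 3 the closing tag.
    # Breaks written as <br>, <br/> or <br /> all become a single space.
    inner = re.sub(r"\s*<br\s*/?>\s*", " ", match.group(2))
    return match.group(1) + inner + match.group(3)

# Stand-alone check outside calibre: the extra arguments are unused by this
# function, so None is passed for each of them.
sample = '<h1 id="toc_21">Security measures<br/>adopted</h1>'
m = re.search(SEARCH, sample, flags=re.DOTALL)
print(replace(m, 1, None, None, None, None, None))
```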
#14 |
Still reading
Posts: 14,016
Karma: 105092227
Join Date: Jun 2017
Location: Ireland
Device: All 4 Kinds: epub eink, Kindle, android eink, NxtPaper
|