Regex Help: Find page number & Replace+Remove 2x Line Breaks in Sigil
Hi everyone! I went through the forum posts on using Regex in Sigil to find and replace characters in an ebook and I tried the methods suggested, resulting in abject failure. Please help. I have an e-book with 400+ page numbers appearing like this: Code:
<p>I want to keep this text</p>
I need to remove the page number line and the empty lines above and below it. I did this in Sigil (Mode: Regex):
Find: <p>([0-9]+)</p>
Replace: /1 ------> (space - slash - one)
It found the page number but only removed the <p></p> tags plus the first 2 digits, leaving the last digit in between intact (in this example, "0"). It also did not remove the empty lines before and after it. Please could someone help to correct my code so that I will end up with this: Code:
<p>I want to keep this text</p>
Contre-jour |
generally there's no reason to remove the blank lines in html, apart from just aesthetics.
if you're just trying to remove the <p>number</p> then this should work: Code:
find: |
As mzmm mentioned, the blank line between paragraphs in code view is mostly irrelevant. If there's a blank line between paragraphs of the rendered html that you want to eliminate, then that's a styling/css issue. Removing the blank line in code view won't affect the rendered text at all (and Tidy/Pretty Print will just put the blank line back unless you have it turned completely off).
|
Are the empty lines hard coded? It doesn't look like it in your example, but if they are...
Is there a <p><br /></p> or a <p> </p> or something like that? If there is, you could use a "\s*" between the groups to match any whitespace in between. Something along these lines (assuming the blank lines are hard coded as "<p> </p>"):
find: <p> </p>\s*<p>\d+</p>\s*<p> </p>
replace: {nothing - empty}
That will find a blank line before, the line with the number, and a blank line after. Cheers! |
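For anyone who wants to try that pattern outside Sigil first, here is a rough sketch. It uses Python's `re` module as a stand-in for Sigil's regex engine (an assumption; the pattern only uses syntax common to both), and the sample markup and the page number 120 are made up for illustration.

```python
import re

# Hypothetical sample: a page number wrapped by hard-coded "blank" paragraphs.
sample = (
    "<p>I want to keep this text</p>\n"
    "<p> </p>\n"
    "<p>120</p>\n"
    "<p> </p>\n"
    "<p>I want to keep this too</p>\n"
)

# Blank paragraph, optional whitespace, page-number paragraph,
# optional whitespace, blank paragraph.
pattern = re.compile(r"<p> </p>\s*<p>\d+</p>\s*<p> </p>")

# Replace the whole match with nothing, as in the post above.
cleaned = pattern.sub("", sample)
print(cleaned)
```

The `\s*` is what lets the match span the line breaks between the three paragraphs; the newlines just outside the match are left in place.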
As an addition to what the others have said, once you remove the <p>page#</p>, Sigil will remove the extra blank line when you click save. You won't end up with 2 blank lines if you just remove the page number line, so there is no need to try to remove them with find/replace. And of course blank lines in Code View do not show up when reading.
If, however, you want to remove the spaces between paragraphs that you see when reading, then you need to set the paragraph margins in your CSS sheet:
p { margin-top: 0; margin-bottom: 0; }
The above will affect ALL <p> tags, so if you need spacing in a few paragraphs (scene changes) you need to add a scene change class. I use:
.scenechange { margin-top: 0.25em; margin-bottom: 0.25em; }
and then:
<p class="scenechange"> </p> |
it just occurred to me that if you're cleaning up an epub that's been generated by Pages that you might end up seeing the blank spaces in the html being rendered in the reader.
i come across Code:
* {white-space: pre;} |
Solved!
While waiting for a reply, I dug deep, played around with the code and got it.
Find: <p>[0-9]+</p>
Replace: (nothing)
In the code view, it looks like there are line breaks in between my paragraphs, but in the book view those lines are not visible, so I didn't have to put in anything to remove line breaks such as \n. Not sure why the other posts were going on and on about \1 and all that. It confused me. I apologise if I wasted your time. These may look easy peasy to many but it is a struggle for me without any programming knowledge. Thanks! |
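A quick way to sanity-check that Find/Replace before running it over 400+ pages is sketched below. It assumes Python's `re` module behaves like Sigil's regex mode for this simple pattern, and the sample text (page number 120, the "Chapter 12" line) is invented for illustration.

```python
import re

# Hypothetical markup; the page number and chapter line are made up.
sample = (
    "<p>I want to keep this text</p>\n"
    "\n"
    "<p>120</p>\n"
    "\n"
    "<p>Chapter 12</p>\n"
)

# Find: <p>[0-9]+</p>   Replace: (nothing)
cleaned = re.sub(r"<p>[0-9]+</p>", "", sample)

# The digits-only paragraph is removed; "Chapter 12" survives because
# [0-9]+ has to account for everything between <p> and </p>.
print(cleaned)
```

The newlines around the removed paragraph stay in the source, which fits the point above that they are not visible in book view anyway.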
glad you worked it out.
\1 would be for reinserting (re-placing) a group that you've captured in the find field. great reference here: http://www.regular-expressions.info/brackets.html happy epub-ing |
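To make the \1 point concrete, here is a small sketch using Python's `re` module (an assumption; Sigil's engine may treat an odd replacement such as " /1" differently). The sample line and page number are hypothetical.

```python
import re

line = "<p>120</p>"  # hypothetical page-number paragraph

# The parentheses capture the digits as group 1.
find = r"<p>([0-9]+)</p>"

# \1 (backslash) re-inserts the captured digits, so only the tags are dropped.
kept_number = re.sub(find, r"\1", line)

# " /1" (forward slash) is not a backreference in Python; it is literal text.
literal = re.sub(find, " /1", line)

print(repr(kept_number))
print(repr(literal))
```

So a capture group plus \1 is only needed when part of the match should be kept; to delete the page number outright, no group and an empty Replace field are enough.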
MobileRead.com is a privately owned, operated and funded community.