02-01-2013, 08:26 AM | #1 |
Junior Member
Posts: 3
Karma: 10
Join Date: Nov 2012
Device: Kindle
|
Regex Help: Find page number & Replace+Remove 2x Line Breaks in Sigil
Hi everyone! I went through the forum posts on using Regex in Sigil to find and replace characters in an ebook and I tried the methods suggested resulting in abject failure. Please help.
I have an e-book with 400+ page numbers appearing like this:
Code:
<p>I want to keep this text</p>

<p>100</p>

<p>I want to keep this text</p>

I need to remove the page number line and the empty lines above and below it. I did this in Sigil (Mode: Regex):
Find: <p>([0-9]+)</p>
Replace: /1 ------> (space - slash - one)
It found the page number but only removed the <p></p> tags plus the first 2 digits, leaving the last digit in between intact. (In this example, "0"). It also did not remove the empty lines before and after it. Please could someone help to correct my code so that I will end up with this:
Code:
<p>I want to keep this text</p>
<p>I want to keep this text</p>
<p>I want to keep this text</p>
Contre-jour |
02-01-2013, 08:38 AM | #2 |
Groupie
Posts: 171
Karma: 86271
Join Date: Feb 2012
Device: iPad, Kindle Touch, Sony PRS-T1
|
generally there's no reason to remove the blank lines in html, apart from just aesthetics.
if you're just trying to remove the <p>number</p> then this should work: Code:
find: <p>\d+</p> |
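For anyone who wants to try the pattern outside Sigil first, here is a minimal sketch in Python. It assumes Python's `re` module treats this simple pattern the same way Sigil's regex mode does; the sample text and the function name `strip_page_numbers` are made up for illustration.

```python
import re

# Hypothetical sample mirroring the markup from the first post.
SAMPLE = (
    "<p>I want to keep this text</p>\n"
    "\n"
    "<p>100</p>\n"
    "\n"
    "<p>I want to keep this text</p>\n"
)


def strip_page_numbers(html: str) -> str:
    """Delete every paragraph whose only content is a run of digits."""
    # An empty replacement string is the same as leaving Replace blank.
    return re.sub(r"<p>\d+</p>", "", html)


if __name__ == "__main__":
    print(strip_page_numbers(SAMPLE))
```

The blank lines around the removed paragraph stay in the source, which is harmless because they are not rendered.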
02-01-2013, 08:54 AM | #3 |
Grand Sorcerer
Posts: 27,550
Karma: 193191846
Join Date: Jan 2010
Device: Nexus 7, Kindle Fire HD
|
As mzmm mentioned, the blank line between paragraphs in code view is mostly irrelevant. If there's a blank line between paragraphs of the rendered html that you want to eliminate, then that's a styling/css issue. Removing the blank line in code view won't affect the rendered text at all (and Tidy/Pretty Print will just put the blank line back unless you have it turned completely off).
|
02-01-2013, 08:56 AM | #4 |
A Hairy Wizard
Posts: 3,095
Karma: 18727053
Join Date: Dec 2012
Location: Charleston, SC today
Device: iPhone 11/X/6/iPad 1,2,Air & Air Pro/Surface Pro/Kindle PW & Fire
|
Are the empty lines hard coded? It doesn't look like it in your example, but if they are...
Is there a <p><br /></p> or a <p> </p> or something like that? If there is, you could use a "\s*" between the groups to match any whitespace between them. Something along these lines (assuming the blank lines are hard coded as "<p> </p>"):
find: <p> </p>\s*<p>\d+</p>\s*<p> </p>
replace: {nothing - empty}
That will find a blank line before, the line with the number, and a blank line after. Cheers! |
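A rough Python sketch of the pattern suggested above, assuming the blank lines really are hard coded as "<p> </p>" and that Python's `re` handles `\s*` the way Sigil does. The sample text and the name `strip_page_block` are hypothetical.

```python
import re

# Assumption: blank lines are hard coded as "<p> </p>", as in the post above.
PATTERN = re.compile(r"<p> </p>\s*<p>\d+</p>\s*<p> </p>")

# Hypothetical sample: a page number wrapped in two hard-coded blank paragraphs.
SAMPLE = (
    "<p>Before</p>\n"
    "<p> </p>\n"
    "\n"
    "<p>100</p>\n"
    "<p> </p>\n"
    "<p>After</p>\n"
)


def strip_page_block(html: str) -> str:
    """Remove blank paragraph + page number paragraph + blank paragraph."""
    # \s* absorbs any spaces or newlines sitting between the three paragraphs.
    return PATTERN.sub("", html)


if __name__ == "__main__":
    print(strip_page_block(SAMPLE))
```

A page number that is not surrounded by both blank paragraphs is left untouched by this pattern, so it only helps when the markup is that regular.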
02-01-2013, 09:34 AM | #5 |
Evangelist
Posts: 490
Karma: 1665031
Join Date: Nov 2010
Location: Vancouver Island, Nanaimo
Device: K2 (retired), Kobo Touch (passed to the wife), KGlo, Galaxy TabPro
|
In addition to what the others have said, once you remove the <p>page#</p> Sigil will remove the extra blank line when you click save. You won't end up with 2 blank lines if you just remove the page number line, so there is no need to try to remove them with find/replace. And of course blank lines in Code View do not show up when reading.
If however you want to remove the spaces between paragraphs that you see when reading, then you need to set the paragraph margins in your CSS sheet:
p { margin-top: 0; margin-bottom: 0; }
The above will affect ALL <p> tags, so if you need spacing in a few paragraphs (scene changes) you need to add a scene change class. I use:
.scenechange { margin-top: 0.25em; margin-bottom: 0.25em; }
and then:
<p class="scenechange"> </p>
Last edited by Danger; 02-01-2013 at 09:41 AM. |
02-01-2013, 10:29 AM | #6 |
Groupie
Posts: 171
Karma: 86271
Join Date: Feb 2012
Device: iPad, Kindle Touch, Sony PRS-T1
|
it just occurred to me that if you're cleaning up an epub that's been generated by Pages that you might end up seeing the blank spaces in the html being rendered in the reader.
i come across Code:
* {white-space: pre;} |
02-01-2013, 10:35 AM | #7 |
Junior Member
Posts: 3
Karma: 10
Join Date: Nov 2012
Device: Kindle
|
Solved!
While waiting for a reply, I dug deep, played around with the code and got it:
Find: <p>[0-9]+</p>
Replace: Nothing
In the code view, it looks like there are line breaks in between my paragraph but in the book view those lines are not visible so I didn't have to put in anything to remove line breaks such as \n. Not sure why the other posts were going on and on about \1 and all that. It confused me. I apologise if I wasted your time. These may look easy peasy to many but it is a struggle for me without any programming knowledge. Thanks! |
02-01-2013, 10:41 AM | #8 |
Groupie
Posts: 171
Karma: 86271
Join Date: Feb 2012
Device: iPad, Kindle Touch, Sony PRS-T1
|
glad you worked it out.
\1 would be for reinserting (re-placing) a group that you've captured in the find field. great reference here: http://www.regular-expressions.info/brackets.html happy epub-ing |
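To make the \1 point concrete, here is a small Python sketch. It uses Python's `re` module, which is an assumption: Sigil's own engine may treat a replacement like " /1" differently, so this only illustrates the general idea of a captured group versus literal replacement text.

```python
import re

FIND = r"<p>([0-9]+)</p>"   # the parentheses capture the digits as group 1
LINE = "<p>100</p>"

# \1 reinserts whatever group 1 captured, so only the tags are dropped.
kept = re.sub(FIND, r"\1", LINE)

# " /1" (space, forward slash, one) is not a backreference, just literal text.
literal = re.sub(FIND, " /1", LINE)

# An empty replacement deletes the whole match, digits included.
removed = re.sub(FIND, "", LINE)

if __name__ == "__main__":
    print(repr(kept), repr(literal), repr(removed))
```

Since the goal in this thread was to delete the page number rather than keep it, no group or backreference is needed at all.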
02-01-2013, 10:45 AM | #9 | |
Grand Sorcerer
Posts: 27,550
Karma: 193191846
Join Date: Jan 2010
Device: Nexus 7, Kindle Fire HD
|
Quote:
So naturally, everyone wanted you to know that /1 was syntactically incorrect. It should have been \1. |
|
02-01-2013, 10:47 AM | #10 |
Well trained by Cats
Posts: 29,804
Karma: 54830978
Join Date: Aug 2009
Location: The Central Coast of California
Device: Kobo Libra2,Kobo Aura2v1, K4NT(Fixed: New Bat.), Galaxy Tab A
|
|
Tags |
line breaks, regex, regular expressions, sigil |