Quote:
Quote:
Yes, that is my method, although I am making epub3, so my replace is – \1 –. Previously I was searching for – (.*) –, but very often you can have several dashes in the same paragraph but not the same sentence and, as I said, I frequently work on books with hundreds of dashes, so the more efficient and refined my search is, the better. I search first for a complete set followed by a comma, then a complete set with no comma, then one dash plus a comma, and then one dash alone. I might be able to combine some of those steps now that I have the more refined search; we'll see. |
Quote:
Oops, just noticed that my code got eaten... my replace is –# 160 ;\1# 160 ;– only without the spaces, obviously. :p Just did a book today with 2313 dashes in it (x_x) so the new and improved regex was greatly appreciated. |
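For anyone who wants to try the paired-dash replacement outside the editor, here is a minimal Python sketch. The replacement string is the one described above (a dash, `&#160;`, the captured text, `&#160;`, a dash). The find pattern is an assumption, since the refined regex itself is not shown in this part of the thread: it uses a group that cannot cross another dash, so two separate asides in one paragraph are not merged into one match. The name `protect_dashes` is made up for the example.

```python
import re

# Assumption: the captured text may not contain another en dash, so each
# pair of dashes is matched on its own (unlike the greedy "– (.*) –").
PAIRED_DASHES = re.compile(r"– ([^–]*?) –")

def protect_dashes(text: str) -> str:
    """Swap the plain spaces inside a dash pair for &#160; entities."""
    return PAIRED_DASHES.sub(r"–&#160;\1&#160;–", text)

sample = "He came – late, as usual – and left. She stayed – for once – until dawn."
result = protect_dashes(sample)
print(result)
```

Single, unpaired dashes are left alone by this pattern and would still need the separate passes described above.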
Quote:
Code:
Hey, we've got code tags!! |
Quote:
Quote:
Code:
–&[ noparse]#160;[/noparse]\1&[ noparse]#160;[/noparse]–
– \1 –
The forum "helpfully" decides to substitute characters, but in this case, we don't want them, so we tell it "not to parse". :D |
Jeez, even the code tag eats stuff. :eek: Wasn't aware of that. It shouldn't, should it?
Hi!
I have 200 ".xhtml" files inside an epub and I need to delete line 14 of each one of them. Is it possible to do this with regex? |
Quote:
The simplest way is to select by the text and replace with 'x' ('x' can be nil). Can you post the offending :rofl: line? Is it always the same? If not, what part is different? |
If "line 14" means the fourteenth line of code (as opposed to the book's textual content), the regex could be:
Code:
Find: ((?:.*?\R){13}).*?\R
That's a risky operation, since newlines can be easily added or removed by many automatic formatting processes, but if one is absolutely certain that the text between the thirteenth and the fourteenth newline in the code must be deleted (along with the fourteenth newline), that's one way to do it. If, instead, "line 14" is not a reference to the lines of xhtml code, you can ignore this contribution and should provide some meaningful pattern as the others already said. |
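To see what that find does, here is a rough Python sketch of the same idea. Two things are assumptions: the replacement is taken to be `\1` (the post only shows the Find field, but the capture group suggests the first thirteen lines are meant to be put back), and the line endings are taken to be `\n`, because Python's `re` module does not support `\R`. The `\A` anchor and the function name `delete_line_14` are additions for the example.

```python
import re

# Group 1 holds the first 13 lines; the 14th line and its newline follow it.
# \A pins the match to the start of the file, so only the real line 14 goes.
LINE_14 = re.compile(r"\A((?:.*\n){13}).*\n")

def delete_line_14(text: str) -> str:
    """Return text with its 14th line removed; shorter files are unchanged."""
    return LINE_14.sub(r"\1", text, count=1)

sample = "".join(f"line {n}\n" for n in range(1, 21))
result = delete_line_14(sample)
print(result)
```

The same caveat applies here: this deletes whatever happens to sit on line 14, so it is only safe if every file really has the unwanted text there.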
This is what I want to delete (marked in red):
https://i.imgur.com/5fcJ9mQ.gif As you can see I have a duplicate title in every file, so what I want is to delete that specific line (14) in all of my xhtml files. |
You can delete it by searching for (</h1>)\s+<p>.+?</p>
and replacing with just \1. I never perfected doing this without a capture, so I just put back the trigger. You can replace the .+? with the precise text. |
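A small Python sketch of that replacement, run on a made-up sample (the chapter title and paragraph text are invented for illustration). The second pattern is one possible way to do it without a capture, using a lookbehind; that variant is an addition and was not suggested in the thread.

```python
import re

sample = (
    "<h1>Chapter One</h1>\n"
    "  <p>Chapter One</p>\n"
    "  <p>It was a dark night.</p>\n"
)

# Capture version: match the closing </h1> plus the paragraph that follows,
# then put the </h1> back with \1.
with_capture = re.sub(r"(</h1>)\s+<p>.+?</p>", r"\1", sample, count=1)

# Lookbehind version (an alternative): </h1> is required before the match
# but is not part of it, so the replacement can be empty.
with_lookbehind = re.sub(r"(?<=</h1>)\s+<p>.+?</p>", "", sample, count=1)

print(with_capture)
```

Both versions remove only the first paragraph after the heading, because `.+?` stops at the first `</p>`.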
I want to be able to copy text and reuse it in the replacement, kind of like how (\d+) saves a number (one or more digits in a row) and then you can output it with \1, \2, etc. Is there a way to do this with all the text in between tags? An example would be:
Code:
<p><b>20</b> Words and stuff. Why are there words?<br/><b>20</b> Words and stuff. Why are there words?</p>
Find Code:
<p><b>(\d+)</b>\s … <br/><b>\d+</b>\s … </p>
Code:
<h4>\1</h4></br><p> … </p><p> … </p>
Find Code:
<p><b>(\d+)</b>\s(.*?)<br/><b>\d+</b>\s(.*?)</p>
Code:
<h4>50:\1</h4><p>\2</p><p>\3</p> |
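Here is a quick Python check of that last find and replace pair against the sample paragraph, as a sketch. Each `(.*?)` saves the text up to the next literal tag, and `\2` and `\3` write it back out. The variable names are made up for the example.

```python
import re

sample = (
    "<p><b>20</b> Words and stuff. Why are there words?"
    "<br/><b>20</b> Words and stuff. Why are there words?</p>"
)

# Group 1: the number; groups 2 and 3: the text after each bold number.
find = r"<p><b>(\d+)</b>\s(.*?)<br/><b>\d+</b>\s(.*?)</p>"
replace = r"<h4>50:\1</h4><p>\2</p><p>\3</p>"

result = re.sub(find, replace, sample)
print(result)
```

Because `.` does not match a newline by default, this only works when the whole paragraph sits on one line, as it does in the sample.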
Is it possible to make a regex to turn a phrase with "fake small caps" into a sentence-case phrase, whilst also handling the occasional capitalised proper name in the middle? It must:
1. remove the spans;
2. put all the text inside the SmallCap spans into lower case, leaving the letters in the Cap spans in upper case;
3. (this is the tricky part) there may be one span on the whole phrase OR there may be several on different parts of the phrase, so it may be necessary to do a multi-part regex.
Example: Find this:
Code:
<span class="Cap">F</span><span class="SmallCap">IRST WORD OF THE SENTENCE IS ALWAYS CAPITALISED,</span> <span class="Cap">O</span><span class="SmallCap">THER</span> <span class="Cap">W</span><span class="SmallCap">ORDS IN THE SENTENCE MAY OR MAY NOT BE CAPITALISED</span>
Replace with this:
Code:
First word of the sentence is always capitalised, Other Words in the sentence may or may not be capitalised
If there is just one span I can manage it, but since there can be two or three (or potentially more) spans I am not sure how to manage those possibilities. |
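One way to sidestep the "how many spans" problem is to handle each span on its own instead of matching the whole phrase at once. The Python sketch below does that in two passes, under a few assumptions: the class names are exactly `Cap` and `SmallCap`, the spans are not nested, and the small-cap spans do not repeat the capital letter (the sample uses `THER` and `ORDS` so that no letter is doubled). The function name `unfake_small_caps` is made up for the example.

```python
import re

SMALL_CAP = re.compile(r'<span class="SmallCap">(.*?)</span>')
CAP = re.compile(r'<span class="Cap">(.*?)</span>')

def unfake_small_caps(html: str) -> str:
    """Lower-case every SmallCap span, then unwrap every Cap span."""
    # Pass 1: a function replacement lower-cases each span's contents.
    html = SMALL_CAP.sub(lambda m: m.group(1).lower(), html)
    # Pass 2: drop the Cap wrapper and keep the capital letter as it is.
    return CAP.sub(r"\1", html)

sample = (
    '<span class="Cap">F</span><span class="SmallCap">IRST WORD OF THE '
    'SENTENCE IS ALWAYS CAPITALISED,</span> <span class="Cap">O</span>'
    '<span class="SmallCap">THER</span> <span class="Cap">W</span>'
    '<span class="SmallCap">ORDS IN THE SENTENCE MAY OR MAY NOT BE '
    'CAPITALISED</span>'
)
result = unfake_small_caps(sample)
print(result)
```

Since every span is replaced independently, the same two passes cover one span, three spans, or more without changing the patterns.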
MobileRead.com is a privately owned, operated and funded community.