02-22-2012, 06:03 PM | #1 |
Member
Posts: 14
Karma: 475352
Join Date: Feb 2012
Device: Kindle Paperwhite 2
|
Replace multiple matching instances within paragraph?
I'm trying to use a regex to replace all instances of a given match (in this case line breaks) within paragraph markings. I cannot simply replace all <br/> instances in my document because some are important to have.
The document layout looks something like this:
Code:
<p>some text here<br/> more text here<br/> a lot more words<br/> final part</p>
and I would like it to end up like this:
Code:
<p>some text here more text here a lot more words final part</p>
Could somebody here show me how to do this, or has there been a thread about this already? So far I have started with:
Code:
<p>(([a-zA-Z',;. ])+<br/>)+
Thanks, Chris |
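For what it's worth, a single search-and-replace pattern struggles here because a repeated group only remembers its last repetition. One possible workaround, outside the editor, is a two-step substitution: match each paragraph, then replace the breaks inside that match. The Python sketch below is only an illustration; the function name `join_paragraph_breaks` is made up, and it assumes plain `<p>` tags with no attributes and no nested paragraphs.

```python
import re

# Matches one paragraph; DOTALL lets the body span several lines.
# Assumption: <p> tags carry no attributes and are not nested.
PARAGRAPH = re.compile(r"<p>(.*?)</p>", re.DOTALL)
# Matches <br/>, <br /> or <br>, plus any whitespace around it.
LINE_BREAK = re.compile(r"\s*<br\s*/?>\s*")


def join_paragraph_breaks(html: str) -> str:
    """Replace every line break inside <p>...</p> with a single space."""
    def fix(match: re.Match) -> str:
        # Only the text captured inside this one paragraph is rewritten.
        body = LINE_BREAK.sub(" ", match.group(1))
        return "<p>" + body.strip() + "</p>"
    return PARAGRAPH.sub(fix, html)


sample = (
    "<p>some text here<br/> more text here<br/> a lot more words<br/> "
    "final part</p><br/>"
)
print(join_paragraph_breaks(sample))
```

Breaks outside a paragraph, like the trailing `<br/>` in the sample, are left alone because the inner substitution only ever sees the text captured between `<p>` and `</p>`.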
02-23-2012, 04:53 AM | #2 |
Wizard
Posts: 1,337
Karma: 123455
Join Date: Apr 2009
Location: Malaysia
Device: PRS-650, iPhone
|
You're fighting a losing battle with this sort of thing - my suggestion would be to figure out a pattern for the 'important' breaks instead of what you're trying to do. I assume these are most likely scene breaks. Then replace the important breaks with some sort of scene break marker, e.g. *** or <p> </p><p> </p>.
Then convert to text using Markdown or Textile, and finally convert back to ePub with the heuristic 'unwrap lines' option. That won't be perfect, but it will get you about as far as Calibre can take you by itself, and you can do final edits using Sigil. |
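The marking step described above could look roughly like the following Python sketch. It covers only the first part (tagging the important breaks and dropping the rest), not the Markdown/Textile round trip. The rule for what counts as an "important" break is purely an assumption for illustration: here a doubled `<br/><br/>` is treated as a scene break, and the function name `mark_scene_breaks` is hypothetical.

```python
import re

# Assumption for illustration only: an "important" break is a doubled <br/>.
IMPORTANT_BREAK = re.compile(r"<br/>\s*<br/>")
# Any remaining single break, with the whitespace around it.
ANY_BREAK = re.compile(r"\s*<br/>\s*")


def mark_scene_breaks(html: str, marker: str = "***") -> str:
    """Turn important breaks into a marker line, then drop all other breaks."""
    # Step 1: replace the breaks worth keeping with a scene break marker.
    marked = IMPORTANT_BREAK.sub("\n" + marker + "\n", html)
    # Step 2: every break that is left is unimportant, so join the text.
    return ANY_BREAK.sub(" ", marked)


sample = "first scene<br/> continues<br/><br/>second scene<br/> continues"
print(mark_scene_breaks(sample))
```

The order matters: the important breaks have to be converted first, otherwise the second substitution would swallow them along with the rest.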
02-23-2012, 09:53 AM | #3 |
Member
Posts: 14
Karma: 475352
Join Date: Feb 2012
Device: Kindle Paperwhite 2
|
Thanks for the suggestion.
I did look further, and noticed that every break I wanted to remove in the document was '<br/> ' (with a trailing space), whereas the ones to keep had no space after them. So that sufficed to clean it up. But it sounds like this sort of thing is hard to do in general. I will try some of the programs you mentioned. Thanks, Chris |
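For anyone finding this later, the trailing-space rule can be sketched in Python as below. It relies entirely on the quirk of this particular document (unwanted breaks are followed by a space, wanted ones are not), so it is not a general solution, and the function name `remove_spaced_breaks` is made up for the example.

```python
import re

# Matches a break only when a literal space follows it.
SPACED_BREAK = re.compile(r"<br/> ")


def remove_spaced_breaks(html: str) -> str:
    """Replace '<br/> ' (break plus space) with a space; keep bare '<br/>'."""
    return SPACED_BREAK.sub(" ", html)


sample = (
    "<p>some text here<br/> more text here<br/> final part</p>"
    "<br/><p>next paragraph</p>"
)
print(remove_spaced_breaks(sample))
```

The `<br/>` between the two paragraphs survives because it is followed directly by `<p>` and not by a space.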