Old 02-22-2012, 06:03 PM   #1
murphycc
Member
Posts: 14
Karma: 475352
Join Date: Feb 2012
Device: Kindle Paperwhite 2
Replace multiple matching instances within paragraph?

I'm trying to use a regex to replace every instance of a given match (in this case line breaks) inside paragraph tags. I can't simply replace every <br/> in my document because some of them need to stay.

The document layout looks something like this:
Code:
<p>some text here<br/> more text here<br/> a lot more words<br/> final part</p>
I would like it to become:
Code:
<p>some text here more text here a lot more words final part</p>
While I am pretty good at advanced regular expressions, I am thinking that what I'm asking for here would require back references to replace the <br/>'s with nothing.

Could somebody here show me how to do this, or has there been a thread about this already?

So far I have started with:
Code:
<p>(([a-zA-Z',;. ])+<br/>)+
and that seems to work in this regex tester program I am using, but what would I enter into the replace field in Calibre for this? Do I need back-references?
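A note on the back-reference idea: in Python's `re` module, a repeated capture group such as `(...)+` keeps only its last repetition, so a single back-reference cannot rebuild the whole paragraph. One way around that, sketched here as a standalone script rather than something to paste into the Calibre replace field, is to match each paragraph and then strip the breaks inside it with a replacement function. The function name `strip_breaks_in_paragraphs` and both patterns are illustrative assumptions, not part of Calibre.

```python
import re

# Matches one whole paragraph; the middle group is its inner content.
PARAGRAPH = re.compile(r"(<p\b[^>]*>)(.*?)(</p>)", re.DOTALL)
# Matches <br>, <br/> or <br />.
BREAK = re.compile(r"<br\s*/?>")


def strip_breaks_in_paragraphs(html: str) -> str:
    """Remove break tags found inside <p>...</p>; leave all others alone."""
    def clean(match: re.Match) -> str:
        opening, body, closing = match.groups()
        return opening + BREAK.sub("", body) + closing

    return PARAGRAPH.sub(clean, html)


sample = "<p>some text here<br/> more text here<br/> a lot more words<br/> final part</p>"
print(strip_breaks_in_paragraphs(sample))
# -> <p>some text here more text here a lot more words final part</p>
```

Breaks outside a paragraph are untouched because the inner substitution only ever sees the text between `<p>` and `</p>`.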

Thanks,
Chris
Old 02-23-2012, 04:53 AM   #2
ldolse
Wizard
Posts: 1,337
Karma: 123455
Join Date: Apr 2009
Location: Malaysia
Device: PRS-650, iPhone
You're fighting a losing battle with this sort of thing - my suggestion would be to figure out a pattern for the 'important' breaks instead of what you're trying to do. I assume these are most likely scene breaks. Then replace the important breaks with some sort of scene break marker, e.g. *** or <p>&nbsp;</p><p>&nbsp;</p>.

Then convert to text using Markdown or Textile, and finally convert back to ePub with the heuristic 'unwrap lines' option. That won't be perfect, but it will get you about as far as Calibre can take you by itself, and you can do final edits using Sigil.
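The "mark the important breaks first" step might look roughly like the sketch below. It rests on a loud assumption: that an important break shows up as two or more consecutive `<br/>` tags. That pattern is hypothetical and would need to be replaced with whatever actually distinguishes scene breaks in a given book. The name `mark_scene_breaks` and the `***` marker layout are likewise just illustrative.

```python
import re

# ASSUMPTION: an "important" break is two or more consecutive <br/> tags.
# Swap this pattern for whatever really identifies scene breaks in your file.
IMPORTANT_BREAK = re.compile(r"(?:<br\s*/?>\s*){2,}")
ANY_BREAK = re.compile(r"<br\s*/?>")
SCENE_MARKER = "\n***\n"


def mark_scene_breaks(html: str) -> str:
    """Turn important breaks into a marker, then drop the remaining breaks."""
    marked = IMPORTANT_BREAK.sub(SCENE_MARKER, html)
    return ANY_BREAK.sub("", marked)


sample = "end of scene<br/><br/>new scene starts<br/> and continues"
print(repr(mark_scene_breaks(sample)))
# -> 'end of scene\n***\nnew scene starts and continues'
```

The order matters: the marker has to go in before the generic removal, otherwise the important breaks are gone before they can be recognised.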
Old 02-23-2012, 09:53 AM   #3
murphycc
Thanks for the suggestion.

I did look further, and did notice that every break I wanted to remove in the document was '<br/> ' (with a space), whereas the ones to keep did not have a space after them. So that sufficed to clean it up.
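For anyone with a similarly formatted file, the space-based rule can be sketched in Python as below. The lookahead removes a `<br/>` only when a space follows it and leaves that space in place as the word separator. The name `unwrap` and the sample text are made up for illustration; in a plain search and replace field the rough equivalent would be searching for `<br/> ` and replacing it with a single space.

```python
import re

# A <br/> that is immediately followed by a space; the lookahead keeps the
# space itself out of the match so the words stay separated.
WRAP_BREAK = re.compile(r"<br/>(?= )")


def unwrap(html: str) -> str:
    """Drop only the breaks that are followed by a space."""
    return WRAP_BREAK.sub("", html)


sample = "<p>line one<br/> line two<br/>kept break</p>"
print(unwrap(sample))
# -> <p>line one line two<br/>kept break</p>
```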

But it sounds like this kind of thing is hard to do in general. In future I will try some of the programs you mentioned.

Thanks,
Chris