02-24-2013, 06:21 PM   #13
Dybbuk
Junior Member
Posts: 9
Karma: 10
Join Date: Feb 2013
Device: Iphone 4
Quote:
Originally Posted by theducks
(?sm)</p>\s+(.+?)\s+<p>
Should work to remove things outside those tags.
Don't try this on any copy you want to be usable after you are done.

BUT YOU WERE WARNED: there are other valid things between the closing </p> and the next <p> that should not be removed. The list is big, so I am not wasting my time typing it.
Cool! I've tried it on several epubs, but it often selects p-tag stuff as well. Maybe I'm doing it wrong? I'm using Sigil 7 and have tried various search settings. Earlier I tried exploiting the fact that angle brackets within paragraphs usually have a space before them, using this regex:

\s<[^p/]

That works, but only occasionally. It assumes the epub is well-formed, with lines between non-p tags and spaces between other tags.
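
For anyone who wants to see why \s<[^p/] is hit-or-miss, here is a minimal sketch in Python. It uses Python's re module rather than Sigil's own regex engine, so treat it as an approximation, and the sample markup is made up purely for illustration.

```python
import re

# The pattern from the post: one whitespace character, then "<",
# then any single character that is not "p" or "/".
pattern = re.compile(r"\s<[^p/]")

# Hypothetical snippet, invented for this example (not from a real epub).
sample = (
    "<p>Some <i>italic</i> text.</p>\n"
    "<div>stray block</div><span>no space before me</span>\n"
    "<pre>starts with p</pre>\n"
)

# Collect the exact text of every match.
hits = [m.group(0) for m in pattern.finditer(sample)]
print(hits)
```

On this sample the pattern picks up the inline <i> inside the paragraph and the <div> on its own line, but it skips the <span> (no whitespace in front of it) and the <pre> (its name starts with "p"). So the result depends heavily on how the file happens to be laid out.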

You're my hero! I'll definitely tinker with this regex some more!
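
For tinkering with the quoted pattern outside Sigil, a small Python sketch like the one below may help. It assumes Python's re module behaves like Sigil's engine for this pattern, which should be checked; the two sample strings and the captured() helper are hypothetical and exist only for the demonstration.

```python
import re

# theducks' pattern from the quote. (?sm) enables DOTALL, so "." also
# matches newlines, and MULTILINE.
pattern = re.compile(r"(?sm)</p>\s+(.+?)\s+<p>")

# Hypothetical inputs: one with a stray block between paragraphs,
# one with nothing but back-to-back paragraphs.
stray = "<p>One.</p>\n<div>stray</div>\n<p>Two.</p>\n"
adjacent = "<p>One.</p>\n<p>Two.</p>\n<p>Three.</p>\n"

def captured(text):
    """Return what group 1 captures for each match in text."""
    return [m.group(1) for m in pattern.finditer(text)]

print(captured(stray))     # the stray <div> block
print(captured(adjacent))  # a whole paragraph gets captured
```

In the second case the capture group needs at least one character, so when nothing sits between two paragraphs it swallows the next <p>...</p> whole. That may be one reason the search keeps selecting paragraph content.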