Old 03-20-2011, 11:48 AM   #7
Faster
Hi again. First, the expression I gave you can be cleaned up by removing the backslashes within the character class. I realized this two minutes after posting, but I knew it wouldn't do any harm. Here's the change:
Code:
([^.:\?\!"”'’0-9])<br class="calibre1" />
becomes
([^.:?!"”'’0-9])<br class="calibre1" />
The reason is that only ] \ ^ and - ever need to be escaped within a character class; ? and ! are ordinary characters there.
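A quick way to convince yourself the two expressions behave the same is to run both over a few lines. This is only a sketch using Python's re module (an assumption, since your editor's regex engine may differ slightly), and the sample lines are made up rather than taken from the book:

```python
import re

# Expression as first posted, with redundant backslashes inside the class.
escaped = re.compile(r'''([^.:\?\!"”'’0-9])<br class="calibre1" />''')
# Cleaned-up expression: ? and ! need no backslash inside [...].
cleaned = re.compile(r'''([^.:?!"”'’0-9])<br class="calibre1" />''')

# Made-up sample lines, not taken from the actual file.
samples = [
    'the line was broken<br class="calibre1" />',
    'End of sentence.<br class="calibre1" />',
    'Really?<br class="calibre1" />',
    'He said "no"<br class="calibre1" />',
]

def matches(pattern, text):
    """Return True if the pattern is found anywhere in the text."""
    return pattern.search(text) is not None

# One (escaped, cleaned) pair of booleans per sample line.
results = [(matches(escaped, s), matches(cleaned, s)) for s in samples]
for s, (a, b) in zip(samples, results):
    print(a, b, s)
```

Both expressions match only the first sample, where the break follows a plain letter, and skip the lines that end in punctuation or a quote.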

Toxaris, I think JustinD has already run the regex and lost the original, so he wants to know how to deal with the result.
My first suggestion would be to download the original file again if possible and follow Toxaris' suggestion.
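If that's not an option, you will need to decide at which punctuation characters you want to create a paragraph break. For example, if you want breaks wherever a full stop is followed by a space (type an actual space character where the expressions say "space"; the full stop is escaped because a bare . matches any character):
Find:
Code:
(\.)space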
Replace:
Code:
\1<p>
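To see what this find/replace pair does, here is a minimal sketch in Python's re module (again an assumption, your editor may differ). The "space" is typed as a real space, and the sample text is invented for illustration. It also shows why the dot should be escaped: a bare . matches any character, not just a full stop.

```python
import re

# Invented sample text; the first full stop sits just before a quote mark.
text = '"That\'s my dog." he said. Then he left. The end.'

# Escaped full stop followed by a literal space.
result = re.sub(r'(\.) ', r'\1<p>', text)
print(result)

# With a bare dot, every character followed by a space is matched.
unescaped = re.sub(r'(.) ', r'\1<p>', 'a b c')
print(unescaped)
```

The full stop before the closing quote and the one at the very end of the text are left alone, because neither is followed by a space.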
The reason for the space is that you don't want to match a full stop that sits just before a closing quote mark, for example:
Code:
"That's my dog." he said.
If you want to match more than just the full stop, put all the characters you want in a character class [...].
Example find:
Code:
([.!?])space
- but nothing is going to make it look correct, as you'll be turning every sentence into a paragraph.
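If you want to experiment with different sets of punctuation before touching the book, something like the following might help. It is a sketch only: break_after is a hypothetical helper name of mine, it uses Python's re module rather than your editor, and the sample sentence is made up.

```python
import re

def break_after(text, marks):
    """Insert <p> after any of the given punctuation marks when a space follows."""
    # re.escape keeps characters such as ] or ^ from changing the class meaning.
    char_class = '[' + ''.join(re.escape(m) for m in marks) + ']'
    return re.sub('(' + char_class + ') ', r'\1<p>', text)

# Made-up sample sentence for illustration.
sample = 'Stop! Who goes there? A friend. "Pass." he said.'
broken = break_after(sample, '.!?')
print(broken)
print(broken.count('<p>'), 'breaks inserted')
```

On this sample it inserts three breaks, one after each sentence, which illustrates the problem: every sentence ends up as its own paragraph.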