Old 04-15-2015, 10:04 AM   #1
ColMac
Search regex problem

My wife has a number of books with masses of extra carriage returns inserted in them, which makes them difficult to read.

A sample is shown below:


Code:
<p class="calibre1">And  Alandra  looking  at  him  as  he  came  into  the  room,  found  that </p>
<p class="calibre1">although  she  had  not  wanted  to  allow  her  grandfather  one  courtesy, </p>
<p class="calibre1">she was getting to her feet. </p>
<p class="calibre1">Silently,  she  watched  and  waited  as  he  came  closer.  And  when  he </p>
<p class="calibre1">stopped and for long seconds stared at her, she saw deep frown lines </p>
<p class="calibre1">groove  on  his  forehead.  But  she  had  no  word  to  say  to  him,  and  he </p>
<p class="calibre1">none for her as he turned to the man who, keeping his eyes steady on </p>
<p class="calibre1">the two of them, had now moved from his position by the door, and </p>
<p class="calibre1">was coming in their direction. </p>
<p class="calibre1">And  it  was  left  to  Matt  Carstairs  to  introduce  the  two—the  elderly </p>
<p class="calibre1">man  who  still  had  the  gait  of  a  man  years  younger,  and  the  young </p>
<p class="calibre1">woman whose solemn face was giving nothing away of the very low </p>
<p class="calibre1">regard in which she held the other. </p>
<p class="calibre1">'This,'  said  Matt  Carstairs,  pausing  only  marginally  as  if  to  assess </p>
<p class="calibre1">how  the  older  man  would  take  it,  'this  woman  claims  to  be  your </p>
<p class="calibre1">granddaughter, sir—she says she is Edward's child.' </p>
I managed to find a saved search from Zajora that partly solves the problem:

Code:
(?<![".!?>*”“…~’])</(?P<tag>\w+)>\s*<(?P=tag) [^/>]+>

The replacement is left empty ("Null").
However, it also removes the genuine end-of-paragraph returns (in the example above, the ones after a full stop or a single quote).

Is there any way to amend this search so that it excludes the more obvious genuine carriage returns? I know that a sentence can end with something other than a full stop, and that any such search will miss some cases and get others wrong, but it would still be better than what she currently has.
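
One thing that stands out in the sample: every line has a space before the closing tag, so the lookbehind in the saved search sees that space rather than the full stop, and the straight single quote (') is not in its character list at all. Below is a minimal sketch of one possible amendment, written against Python's re module; that calibre's search accepts the same expression is an assumption. The lookbehind list now includes whitespace and the straight quote, the trailing spaces are consumed by the match, and the replacement is a single space instead of nothing. The names JOIN_BROKEN_PARAGRAPHS and join_broken_paragraphs are made up for the illustration.

```python
import re

# Hypothetical amendment of the saved search (an assumption, not a tested
# calibre recipe). Changes from the original expression:
#   * the lookbehind class also contains \s and the straight single quote
#   * \s* before the closing tag swallows the trailing spaces
JOIN_BROKEN_PARAGRAPHS = re.compile(
    r"""(?<![\s.!?'"”’…])\s*</(?P<tag>\w+)>\s*<(?P=tag) [^/>]+>"""
)


def join_broken_paragraphs(html: str) -> str:
    """Merge a paragraph into the next one unless it ends in closing punctuation."""
    # Replace with one space so the words on either side stay separated.
    return JOIN_BROKEN_PARAGRAPHS.sub(" ", html)


# Four lines taken from the sample above.
sample = (
    '<p class="calibre1">although  she  had  not  wanted  to  allow  her  grandfather  one  courtesy, </p>\n'
    '<p class="calibre1">she was getting to her feet. </p>\n'
    '<p class="calibre1">Silently,  she  watched  and  waited  as  he  came  closer.  And  when  he </p>\n'
    '<p class="calibre1">stopped and for long seconds stared at her, she saw deep frown lines </p>\n'
)

result = join_broken_paragraphs(sample)
print(result)
```

In the search dialog the equivalent would be the expression alone, with a single space in the Replace field. It will still join a genuine paragraph that ends without punctuation, and it will keep a break after a line that happens to end in something like "Mr.", so it is only a rough filter.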

Thanks

Last edited by ColMac; 04-15-2015 at 10:12 AM. Reason: spelling