01-07-2011, 05:15 AM   #6
cybmole
Is it safe to wildcard the calibre2 part?

E.g., would this work?

([\w",])</p>\s+<p class="calibre\d+">([\w"“…])

Or will that cause it to mess with titles and chapter headers?

I see that different books have different class names. Some do not even use calibre plus digit(s); they have a different naming structure, e.g. I have seen class="MsoPlainText". So maybe find:
([\w",])</p>\s+<p class="[A-Za-z2-9]*">([\w"“…])

Will that exclude calibre1?
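
To check both patterns outside the editor, here is a rough Python sketch. The sample markup, the idea that chapter headers use class="calibre1", and the replacement `\1 \2` are all assumptions for illustration, not taken from a real book. Note that `[A-Za-z2-9]*` rejects any class name containing a 0 or a 1, so it skips calibre1 but also calibre10, calibre21 and so on.

```python
import re

# The two candidate patterns from this post.
BROAD = re.compile(r'([\w",])</p>\s+<p class="calibre\d+">([\w"“…])')
NARROW = re.compile(r'([\w",])</p>\s+<p class="[A-Za-z2-9]*">([\w"“…])')

# Hypothetical sample: two body paragraphs, then a header assumed to use calibre1.
SAMPLE = (
    '<p class="calibre2">He turned and said,</p>\n'
    '<p class="calibre2">"not today"</p>\n'
    '<p class="calibre1">Chapter Two</p>'
)


def class_fits(name):
    """True if the whole class name is accepted by [A-Za-z2-9]*."""
    return re.fullmatch(r'[A-Za-z2-9]*', name) is not None


def join_paragraphs(pattern, html):
    """Join split paragraphs; the replacement r'\\1 \\2' is an assumed choice."""
    return pattern.sub(r'\1 \2', html)


if __name__ == "__main__":
    for name in ["calibre1", "calibre2", "calibre10", "MsoPlainText"]:
        print(name, class_fits(name))
    print(join_paragraphs(BROAD, SAMPLE))
    print(join_paragraphs(NARROW, SAMPLE))
```

On this sample the broad pattern merges the header into the preceding paragraph, while the narrow one leaves the calibre1 line alone. Whether that holds for a given book depends on which class its headers really use.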

On a related issue, I have a book with far too much space between the chapter header and the start of the text.

The code uses 3 consecutive instances of:
<p class="MsoPlainText">&nbsp;</p>

How do I test for 3 consecutive instances of that line and replace them with only 1, or maybe 2, instances?
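
One possible approach, sketched in Python: match the spacer line followed by two more copies of itself, allowing whitespace in between, and substitute the number of copies to keep. The names `SPACER`, `TRIPLE` and `collapse_spacers` are made up for this sketch, and it assumes only whitespace separates the three lines.

```python
import re

SPACER = '<p class="MsoPlainText">&nbsp;</p>'

# One spacer followed by exactly two more, with optional whitespace between them.
TRIPLE = re.compile(
    re.escape(SPACER) + r'(?:\s*' + re.escape(SPACER) + r'){2}'
)


def collapse_spacers(html, keep=1):
    """Replace each run of three spacer paragraphs with `keep` copies."""
    replacement = '\n'.join([SPACER] * keep)
    # A function is used as the replacement so the text is inserted literally.
    return TRIPLE.sub(lambda match: replacement, html)
```

The same idea as a plain search expression would be `(<p class="MsoPlainText">&nbsp;</p>\s*){3}`, replaced with one or two copies of the line. Changing `{2}` to `{2,}` in the sketch would also catch runs longer than three.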

Last edited by cybmole; 01-07-2011 at 05:32 AM.