is it safe to wildcard the calibre2 bit ?
e.g. would this work ?
([\w",])</p>\s+<p class="calibre\d+">([\w"“…])
or will that cause it to mess with titles & chapter headers ?
I see that different books have different class names. Some do not even use calibre+digit(s); they have a different naming structure, e.g. I have seen class="MsoPlainText", so maybe find
([\w",])</p>\s+<p class="[A-Za-z2-9]*">([\w"“…])
that will exclude calibre1 ?
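One way to check both patterns before running them on a real book is a small Python sketch like the one below. It assumes Python's re module treats these constructs the same way as the editor's regex engine, and the sample markup (including the idea that calibre1 marks a chapter header) is made up for illustration.

```python
import re

# Made-up sample markup; class names and text are illustrative only.
SAMPLE = (
    '<p class="calibre2">He stopped,</p>\n'
    '<p class="calibre2">then went on</p>\n'
    '<p class="calibre1">Chapter Two</p>\n'
    '<p class="calibre10">It was late</p>\n'
    '<p class="MsoPlainText">and dark.</p>\n'
)

# Pattern 1: the next paragraph may have any calibre<digits> class.
ANY_CALIBRE = re.compile(r'([\w",])</p>\s+<p class="calibre\d+">([\w"“…])')

# Pattern 2: the class may contain only letters and the digits 2-9,
# so any class name with a 0 or a 1 in it is rejected.
NO_ZERO_ONE = re.compile(r'([\w",])</p>\s+<p class="[A-Za-z2-9]*">([\w"“…])')

# findall returns (last char of one paragraph, first char of the next).
any_calibre_hits = ANY_CALIBRE.findall(SAMPLE)
no_zero_one_hits = NO_ZERO_ONE.findall(SAMPLE)

print(any_calibre_hits)   # [(',', 't'), ('n', 'C'), ('o', 'I')]
print(no_zero_one_hits)   # [(',', 't'), ('e', 'a')]
```

On this sample the \d+ version also joins into the calibre1 and calibre10 paragraphs, while the [A-Za-z2-9]* version skips calibre1 but skips calibre10 as well, and it accepts any all-letter class such as MsoPlainText.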
on a related issue, I have a book with far too much space between chapter header & start of text.
the code uses 3 consecutive instances of
<p class="MsoPlainText"> </p>
how do I test for 3 consecutive instances of that line, and replace with only 1, or maybe 2 instances ?
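A possible approach, sketched in Python: wrap the blank paragraph in a group and repeat it with {3}, then replace the whole match with however many copies should remain. The function name collapse_blank_runs and the keep parameter are hypothetical, and the sketch assumes the "empty" paragraphs contain only whitespace or &nbsp;.

```python
import re

# One "empty" paragraph: only whitespace or &nbsp; between the tags.
BLANK = r'<p class="MsoPlainText">(?:\s|&nbsp;)*</p>'

# Exactly three blank paragraphs in a row, allowing whitespace between them.
THREE_BLANKS = re.compile(r'(?:' + BLANK + r'\s*){3}')

def collapse_blank_runs(html, keep=1):
    """Replace each run of three blank paragraphs with `keep` blank paragraphs."""
    replacement = '<p class="MsoPlainText"> </p>\n' * keep
    # A function replacement avoids any escape processing of the text.
    return THREE_BLANKS.sub(lambda match: replacement, html)

# Made-up sample: a chapter header, three spacer paragraphs, then text.
sample = (
    '<p class="MsoPlainText">Chapter 1</p>\n'
    '<p class="MsoPlainText"> </p>\n'
    '<p class="MsoPlainText"> </p>\n'
    '<p class="MsoPlainText"> </p>\n'
    '<p class="MsoPlainText">It began at dawn.</p>\n'
)

print(collapse_blank_runs(sample))          # one spacer paragraph left
print(collapse_blank_runs(sample, keep=2))  # two spacer paragraphs left
```

In a regex find/replace dialog the same idea would be a find of (<p class="MsoPlainText">\s*</p>\s*){3} with one or two copies of the blank line typed into the replace box, assuming the editor supports {n} repetition.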
Last edited by cybmole; 01-07-2011 at 05:32 AM.