Quote:
Originally Posted by kiwidude
@rifka - you can specify regular expressions in the Search ePub feature of this plugin. So for instance something like this would find those cases above:
Code:
[A-Za-z,]</p>\s+<p[^>]*>[a-z]
Of course you may get a lot of false positives as well, depending on the type of book (things like inline poetry, etc.) - you really need to open the book in Sigil and apply similar expressions to step through each case and decide whether it is really a problem or not...
Thank you for your help.
Kiwidude, can I suggest this as an additional check? I can send you a bunch of examples. I think they are mostly the result of bad conversions from PDF. Some of them are not that bad; they just have the page numbers and/or author and/or publisher every few pages. Others are completely unreadable. Let me know if you would like the samples.
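
For anyone who wants to try the quoted expression outside the plugin, here is a rough Python sketch. It only shows what the pattern matches on a string of HTML; it is not how the plugin itself works. The helper name find_suspect_breaks and the sample text are made up for illustration.

```python
import re

# The expression quoted above: a paragraph that ends in a letter or comma,
# followed (after some whitespace) by a paragraph that starts in lowercase.
BROKEN_PARA = re.compile(r"[A-Za-z,]</p>\s+<p[^>]*>[a-z]")


def find_suspect_breaks(html, context=20):
    """Return a short snippet of text around each suspected broken paragraph.

    Hypothetical helper, not part of any plugin.
    """
    snippets = []
    for m in BROKEN_PARA.finditer(html):
        start = max(0, m.start() - context)
        end = min(len(html), m.end() + context)
        snippets.append(html[start:end])
    return snippets


# Made-up sample: the first paragraph is split mid-sentence, the others are fine.
sample = (
    "<p>The rain kept falling and the</p>\n"
    '<p class="calibre1">road turned to mud.</p>\n'
    "<p>She stopped.</p>\n"
    "<p>Nothing moved.</p>\n"
)

hits = find_suspect_breaks(sample)
for snippet in hits:
    print(repr(snippet))
```

Note that \s+ needs at least one whitespace character between </p> and <p, so paragraphs that sit right next to each other with no line break in between will not be reported. As said in the quote, each hit still needs checking by eye.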