Need help with regex
Hi,
Firstly let me say that I am a very rudimentary user of regex. Most of it is beyond my comprehension.
I have some eBooks that were clearly produced by less than spectacular OCR software.
Accordingly, the formatting ranges from quite good to really bad.
One of the main problems is line breaks in the wrong places (e.g. in the middle of a sentence), which make the text very difficult to follow.
In F&R I have used "[a-z]</p><p class="calibre_1">" (or similar) to find these instances quite successfully. The problem is that the entire match is selected, including the letter matched by [a-z]. I cannot for the life of me work out how to get the replace function to leave that letter alone and replace only the tags after it. Without that, fixing all the errors can mean hundreds of manual interventions.
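To make the goal concrete, here is a minimal sketch in Python of the result I am after. It assumes that wrapping [a-z] in a capture group and writing it back with \1 is the right approach; the sample text is made up for illustration. Presumably the same idea would work in F&R with the parenthesised pattern in the Find box and "\1 " in the Replace box, but I have not confirmed that.

```python
import re

# Hypothetical sample: a paragraph break wrongly inserted mid-sentence.
html = (
    '<p class="calibre_1">The quick brown fox jumped over</p>'
    '<p class="calibre_1">the lazy dog.</p>'
)

# ([a-z]) captures the lowercase letter before the bad break
# so that it can be written back in the replacement.
pattern = re.compile(r'([a-z])</p><p class="calibre_1">')

# \1 restores the captured letter; the trailing space joins
# the two halves of the sentence.
fixed = pattern.sub(r'\1 ', html)
print(fixed)
```

A break that follows a full stop is left alone, because the pattern only matches when the character before </p> is a lowercase letter.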
Any assistance is gratefully accepted.
Thanks,
Paul
Last edited by jordy1955; 06-17-2022 at 09:02 PM.