09-11-2021, 03:16 PM   #3
retiredbiker
Evangelist
Posts: 450
Karma: 3886916
Join Date: May 2013
Location: Ontario, Canada
Device: Kindle KB, Oasis, Pop_Os!, Kobo Forma
I would try something simpler:

chapter.*?\d+
and
report.*?•.*?discuss

But I have found that headers and footers in OCR'd PDFs often come across with strange spacing, scannos, and all sorts of cruft, so a regex often catches some of them but misses others. So I do this in the Editor, after conversion, where I have a chance to find the exceptions. Doing it that way also gives you a chance to reconnect text that was split by the header or footer at the page break.
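
If you want to see what those two patterns catch before running them on a whole book, here is a rough sketch in Python. The sample lines are made up for illustration, and the case-insensitive flag is my assumption (the patterns are written in lowercase, so without it "Chapter 12" would not match). It is only a quick test bench, not what the converter itself does.

```python
import re

# Hypothetical lines as they might come out of an OCR'd PDF.
lines = [
    "Chapter 12",
    "CHAPTER   7",                   # odd spacing still matches
    "Report Title • Discuss",
    "Chaptcr 12",                    # scanno: "e" read as "c"
    "Report Title . Discuss",        # bullet came through as a period
    "This is ordinary body text.",
]

# The two patterns from the post; IGNORECASE is an assumption.
header = re.compile(r"chapter.*?\d+", re.IGNORECASE)
footer = re.compile(r"report.*?•.*?discuss", re.IGNORECASE)

def is_header_or_footer(line):
    """Return True if either pattern is found anywhere in the line."""
    return bool(header.search(line) or footer.search(line))

matched = [ln for ln in lines if is_header_or_footer(ln)]
missed = [ln for ln in lines if not is_header_or_footer(ln)]

print("matched:", matched)
print("missed: ", missed)
```

The two scanno lines land in the missed list along with the body text, which is the kind of exception I end up hunting for by hand in the Editor.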