I would try something more simple:
chapter.*?\d+
and
report.*?•.*?discuss
But I have found that headers and footers in OCR'd pdfs often come across with strange spacing, text scannos, and all sorts of cruft. Often you get some, but not others. So I do this in the Editor, after conversion, where I have a chance to find the exceptions. Doing it that way also gives you a chance to re-connect text that was separated by the header or footer at the page break.
|