Old 09-11-2021, 06:17 AM   #1
marcio_oliveira
PDF to EPUB regex: Unicode character match not working

Hello, I'm trying to convert a PDF book to EPUB. The book has a header and a footer I'd like to remove. The header contains the chapter name, the symbol •, and the page number, for example "Chapter 3. Interfacing with Humans • 41", and the footer is "report erratum • discuss".

I've tried a few ways to match this header and footer:
/.+ • [0-9]+$/g
report erratum • discuss

/.+ \u2022 [0-9]+$/g
report erratum \u2022 discuss

/.+ \W [0-9]+$/g
report erratum \W discuss

but none of these work. I would be glad if someone could help, thanks!

I'm using the --sr1-search and --sr2-search options of the ebook-convert CLI.
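
To narrow the problem down, here is a minimal Python sketch of the check. It rests on assumptions: that the search expressions are interpreted as Python regular expressions, and that the header ends up on a line of its own in the converted text. The sample string is made up from the example above and is not real conversion output. It suggests that the /.../g wrapper may be the issue, since Python has no such delimiter syntax.

```python
import re

# Sample text made up from the header/footer quoted above (an assumption,
# not the real intermediate output of the conversion).
sample = (
    "Chapter 3. Interfacing with Humans \u2022 41\n"
    "Some body text that should stay.\n"
    "report erratum \u2022 discuss\n"
)

# Python's re has no /pattern/flags literal syntax: the slashes and the
# trailing "g" are matched as literal characters, so this finds nothing.
js_style = re.compile(r"/.+ \u2022 [0-9]+$/g")

# Same expression without the delimiters; (?m) lets ^ and $ match at
# the start and end of each line instead of only the whole string.
header = re.compile(r"(?m)^.+ \u2022 [0-9]+$")
footer = re.compile(r"report erratum \u2022 discuss")


def strip_header_footer(text):
    """Remove the running header and footer from the given text."""
    text = header.sub("", text)
    return footer.sub("", text)


cleaned = strip_header_footer(sample)
print(js_style.search(sample))  # no match with the /.../g wrapper
print(repr(cleaned))
```

In this sketch the version without slashes matches both the header and the footer, while the /.../g version matches nothing. Whether the same holds inside ebook-convert depends on how the header appears in the intermediate text, which I have not verified.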