MobileRead Forums - pdf to epub regex unicode character match not working

marcio_oliveira · 09-11-2021, 07:17 AM

Hello, I'm trying to convert a PDF book to EPUB. The PDF has a header and a footer on every page that I'd like to remove. The header has the chapter name, the symbol • and the page number, for example "Chapter 3. Interfacing with Humans • 41", and the footer is "report erratum • discuss".

I've tried a few ways to match this header and footer:
/.+ • [0-9]+$/g
report erratum • discuss

/.+ \u2022 [0-9]+$/g
report erratum \u2022 discuss

/.+ \W [0-9]+$/g
report erratum \W discuss

but none of these work. I would be glad if someone could help, thanks!

I'm using the --sr1-search and --sr2-search options of the ebook-convert CLI.
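For anyone wanting to reproduce this outside calibre, here is a minimal Python sketch of the patterns above. It rests on one assumption: that the search expressions are read as Python-style regular expressions, where the JavaScript-style `/…/g` wrapper is not a delimiter but literal text. The sample strings come from the post; the helper name `strip_header_footer` is made up for illustration, and the text calibre actually sees after PDF conversion may differ (extra markup, or different whitespace around the bullet).

```python
import re

# Sample lines taken from the post. The real converted text may differ.
HEADER = "Chapter 3. Interfacing with Humans \u2022 41"
FOOTER = "report erratum \u2022 discuss"

# Python-style patterns: no /.../g wrapper, bullet written as \u2022.
header_re = re.compile(r".+ \u2022 [0-9]+$", re.MULTILINE)
footer_re = re.compile(r"report erratum \u2022 discuss")

# The same header pattern with the JavaScript-style wrapper left in.
# In Python syntax the slashes and the trailing "g" are literal characters.
wrapped_re = re.compile(r"/.+ \u2022 [0-9]+$/g", re.MULTILINE)


def strip_header_footer(text: str) -> str:
    """Hypothetical helper: remove header and footer text from a string."""
    text = header_re.sub("", text)
    return footer_re.sub("", text)


if __name__ == "__main__":
    print(header_re.search(HEADER) is not None)   # unwrapped pattern
    print(wrapped_re.search(HEADER) is not None)  # wrapped pattern
    print(repr(strip_header_footer("body\n" + HEADER + "\n" + FOOTER)))
```

If the unwrapped pattern matches here but still fails during conversion, the converted text probably does not contain the header in exactly this form.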
