This happens when someone scans a book, OCRs it, and then converts it without removing page numbers. (I've seen it on Open Library ePubs; while they don't provide downloads on most books anymore, I always used to go for the PDFs because of this.)
As mentioned, a regex search-and-replace should help strip them out.
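For what it's worth, here is a rough sketch of the kind of regex I mean, in Python. It assumes the leftover page numbers sit on their own line as 1 to 4 digits, optionally written like "Page 12" or "- 12 -", and that you are working on plain text rather than the book's HTML. The names `PAGE_NUMBER_LINE` and `strip_page_numbers` are just made up for this example; adjust the pattern to whatever your book actually contains.

```python
import re

# Assumption: a stray page number is a line containing only 1-4 digits,
# optionally preceded by "Page" and/or surrounded by hyphens and spaces.
# A line of real content that is just a number (e.g. a year) would also match.
PAGE_NUMBER_LINE = re.compile(
    r"^[ \t]*(?:page[ \t]+)?[- \t]*\d{1,4}[- \t]*$\n?",
    flags=re.IGNORECASE | re.MULTILINE,
)


def strip_page_numbers(text: str) -> str:
    """Remove lines that look like leftover page numbers (hypothetical helper)."""
    return PAGE_NUMBER_LINE.sub("", text)


sample = "the end of a\n12\nsentence here.\n- 13 -\nNext paragraph.\n"
cleaned = strip_page_numbers(sample)
print(cleaned)
```

Check the matches before replacing everything, since a pattern this loose can also hit legitimate number-only lines.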
Last edited by ownedbycats; 12-12-2022 at 03:31 PM.