MobileRead Forums - View Single Post - Crop all pages, and remove footnote added in some?

Shohreh · 12-11-2021, 10:04 AM

As a quick solution, I downloaded the trial version of Adobe Acrobat Pro DC and manually removed all occurrences of the offending string. Unless I missed it, that application doesn't seem to have a search-and-replace feature.

I'll keep investigating, though, since it could come in handy.

For some reason, PyPDF2 fails to find the string:

Code:

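# pip install PyPDF2
import re

import PyPDF2

INPUT_FILE = "input.pdf"
SEARCH_STRING = "Offending string"

# PyPDF2 2.x/3.x names; older releases used PdfFileReader, getPage and extractText
reader = PyPDF2.PdfReader(INPUT_FILE)  # "reader" avoids shadowing the built-in "object"

for page_number, page in enumerate(reader.pages):
    print("this is page " + str(page_number))
    text = page.extract_text()
    # re.escape so the string is matched literally, not as a regex
    match = re.search(re.escape(SEARCH_STRING), text)
    print(match)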
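
One possible cause, which is only a guess and not verified against this file: extracted text often comes back with line breaks or odd spacing in the middle of a phrase, so a literal search misses it. A minimal sketch of a whitespace-tolerant search follows; `find_loose` is a hypothetical helper name, not part of PyPDF2, and it works on any string, such as the text returned for a page.

```python
import re


def find_loose(needle, text):
    """Search for needle in text while ignoring whitespace layout.

    Hypothetical helper: the words of needle may be separated in text
    by any amount of whitespace (including none), and case is ignored.
    """
    words = [re.escape(word) for word in needle.split()]
    pattern = r"\s*".join(words)
    return re.search(pattern, text, flags=re.IGNORECASE)


# Example: the phrase is split over two lines in the extracted text.
sample = "Some body text\nOffending\nstring\nmore text"
print(find_loose("Offending string", sample) is not None)
```

If this still finds nothing, printing `repr()` of a page's text shows what the extractor actually returned, which may not contain the string at all.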
