Old 12-11-2021, 09:04 AM   #4
Shohreh
As a quick solution, I downloaded the trial version of Adobe Acrobat Pro DC and manually removed all occurrences of the offending string. Unless I missed it, that application doesn't seem to have a search-and-replace feature.

I'll keep investigating, though, since it could come in handy.

For some reason, PyPDF2 fails to find the string:
Code:
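# pip install PyPDF2
# Note: this uses the newer PyPDF2 names (PdfReader, .pages, extract_text()).
# The camelCase names (PdfFileReader, getNumPages, getPage, extractText)
# were removed in PyPDF2 3.0.
import re

from PyPDF2 import PdfReader

INPUT_FILE = "input.pdf"
SEARCH_STRING = "Offending string"

reader = PdfReader(INPUT_FILE)

for i, page in enumerate(reader.pages):
    print("this is page " + str(i))
    text = page.extract_text()
    # re.escape() makes the string match literally rather than as a regex pattern
    match = re.search(re.escape(SEARCH_STRING), text)
    print(match)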
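
One possible explanation, which is only a guess and not confirmed here: text extracted from a PDF often contains line breaks or irregular spacing inside a phrase, so an exact match can miss it. Below is a minimal sketch of a search that tolerates any run of whitespace between words. The helper name find_phrase and the sample text are hypothetical and not part of PyPDF2; in practice the text would come from the page extraction above.

```python
import re


def find_phrase(text, phrase):
    """Search for phrase in text, allowing any whitespace run between its words."""
    words = phrase.split()
    if not words:
        return None
    # Escape each word so regex metacharacters match literally, then allow
    # any run of whitespace (spaces, tabs, newlines) between consecutive words.
    pattern = r"\s+".join(re.escape(word) for word in words)
    return re.search(pattern, text)


# Hypothetical extracted text in which the phrase is split across two lines.
sample = "Some header\nOffending\n  string appears here"

print(re.search("Offending string", sample))   # plain search on the raw text
print(find_phrase(sample, "Offending string"))  # whitespace-tolerant search
```

If the tolerant search still finds nothing, printing repr(text) for one page may show how the phrase actually comes out of the extraction.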