Old 09-16-2025, 02:53 AM   #2140
Shohreh
Thanks. That did the trick to remove headers.

As for keeping footnotes at the bottom: when running pymupdf, I notice they are displayed in a smaller font size, possibly starting with a superscript (but not always: in this book, some footnotes start with a star followed by a superscript). So checking the font size would be a "simple" way to grab everything down to the end of the page. The alternative is to move them all to the end of the chapter/book like Abbyy does and just include hyperlinks so the user can easily go back and forth.

Code:
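import pymupdf  # PyMuPDF; older releases use "import fitz" instead

FOOTNOTE_SIZE = 4.8  # footnote size in this book; the raw value is 4.800000190734863

doc = pymupdf.open("book.pdf")  # placeholder file name
for page in doc:
	blocks = page.get_text("dict", flags=11)["blocks"]
	for b in blocks:  # iterate through the text blocks
		for l in b["lines"]:  # iterate through the text lines
			stuff = ""
			for s in l["spans"]:  # iterate through the text spans
				if round(s["size"], 1) == FOOTNOTE_SIZE:
					print("Found footnote", s["text"])
				stuff += s["text"]
			print(stuff)  # print the line once, after all its spans are joined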
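
One catch with testing span by span is that the footnote markers inside the body text are small too, so they would be flagged as footnotes. Below is a rough sketch of doing the check per line instead, assuming a line counts as a footnote when more than half of its characters are at or below the footnote size. The function name split_footnotes, the tolerance, and that "more than half" rule are my own assumptions, not something pymupdf provides; the 4.8 default is just the size seen in this book.

```python
def split_footnotes(page_dict, footnote_size=4.8, tol=0.05):
    """Split a page's text lines into (body, footnotes) by font size.

    page_dict is the structure returned by page.get_text("dict").
    footnote_size and tol are assumptions to tune per book.
    """
    body, footnotes = [], []
    for block in page_dict["blocks"]:
        # Image blocks carry no "lines" key, so fall back to an empty list.
        for line in block.get("lines", []):
            spans = line["spans"]
            text = "".join(s["text"] for s in spans)
            if not text.strip():
                continue  # skip empty lines
            # Count characters set at or below the footnote size.
            small = sum(
                len(s["text"]) for s in spans
                if s["size"] <= footnote_size + tol
            )
            if small > len(text) / 2:
                footnotes.append(text)
            else:
                body.append(text)
    return body, footnotes
```

With a real document this would be called as split_footnotes(page.get_text("dict", flags=11)) for each page. The body lines can then be written out as usual, and the footnote lines either kept at the bottom of the page or collected for the end of the chapter.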