MobileRead Forums - Is there a way to detect buggy pdfs without manually checking each pdf?

MarjaE · 03-27-2020, 03:24 PM

Some pdfs have corrupt text encoding to begin with. I have to pre-process pdfs for my Kindle. Some pdfs end up with corrupt text encoding after pre-processing in Ghostscript.

If I select text from these pdfs, I get either gibberish, or blank spaces punctuated with ... well, occasional punctuation.

I usually find this out by trying to search in a pdf, or by selecting text in a pdf. Is there an easy way to detect pdfs with malformed or missing text, without manually opening and selecting passages from each pdf?
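
One possible approach, as a rough sketch rather than a tested solution: extract each pdf's text layer from the command line and run a crude sanity check on it. The example below assumes poppler's `pdftotext` is installed and on the PATH, and that the pdfs are in English. The function names, the thresholds, and the short common-word list are all made up for illustration and would need tuning. It flags a file when the extracted text is nearly empty, is mostly punctuation, or contains almost no common English words.

```python
import subprocess

# Assumption: English-language pdfs. A tiny stopword list is enough for a rough check.
COMMON_WORDS = {"the", "of", "and", "to", "a", "in", "is", "that", "it", "for"}


def looks_corrupt(text, min_chars=200, min_letter_ratio=0.5, min_common_ratio=0.05):
    """Heuristic check on extracted text; all thresholds are guesses."""
    visible = [c for c in text if not c.isspace()]
    if len(visible) < min_chars:
        return True  # missing or nearly missing text layer
    letters = sum(c.isalpha() for c in visible)
    if letters / len(visible) < min_letter_ratio:
        return True  # mostly punctuation or symbols
    words = [w.strip(".,;:!?\"'()").lower() for w in text.split()]
    common = sum(w in COMMON_WORDS for w in words)
    return common / len(words) < min_common_ratio  # letters, but not real words


def extract_text(path):
    """Run pdftotext and return its output; '-' sends the text to stdout."""
    result = subprocess.run(
        ["pdftotext", path, "-"],
        capture_output=True, text=True, errors="replace",
    )
    return result.stdout


def find_suspect_pdfs(paths):
    """Return the paths whose extracted text fails the heuristic."""
    return [p for p in paths if looks_corrupt(extract_text(p))]
```

Calling `find_suspect_pdfs` on a list of file paths (for example from `glob.glob("*.pdf")`) returns the ones worth opening by hand. Scanned pdfs with no text layer at all will also be flagged, and a short pdf may be flagged simply for having little text, so treat the result as a shortlist rather than a verdict.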

Last edited by MarjaE; 03-27-2020 at 04:07 PM.