View Full Version : Determining if a PDF File is encrypted - the hard way


Jim Lester
02-05-2009, 02:05 AM
Because daesdaemar asked me to....

At the simplest level a PDF file is a collection of numbered objects. The objects themselves tend to be streams of data (such as text runs, graphics, or fonts) or dictionaries (structures that contain formatted data describing the document).

If you look at a PDF file you'll tend to see a lot of the following:
1 0 obj ...object data...
2 0 obj ...object data...
3 0 obj ...object data...

The objects are organized by a cross-reference table, which lists the byte offset of each object in the file, and a trailer dictionary, which lists the object numbers of important objects. One of those objects is the encryption dictionary, which must exist if the document is encrypted.

So, open the PDF file in a good text editor (on the Mac I really like Bare Bones' TextWrangler) and search for the string "/Encrypt". If the file is encrypted, that should take you to the trailer dictionary, which will look something like


<</Size 1910/Root 1426 0 R/Encrypt 1908 0 R/Info 1421 0 R/ID[<5081abdd2bdafc8f4748b307de4c8eff><29c91bdce6e8e5478905ab5d6890c7eb>]>>
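The same search can be scripted. What follows is only a minimal sketch in Python, assuming the /Encrypt entry is an indirect reference like the one above (a trailer that embeds the encryption dictionary directly would not be matched). The name find_encrypt_ref is made up for illustration and is not part of any library.

```python
import re

# "/Encrypt" followed by an indirect reference such as "1908 0 R".
# The required whitespace keeps names like /EncryptMetadata from matching.
ENCRYPT_REF = re.compile(rb"/Encrypt\s+(\d+)\s+(\d+)\s+R")


def find_encrypt_ref(pdf_bytes):
    """Return (object number, generation) of the /Encrypt reference, or None."""
    matches = ENCRYPT_REF.findall(pdf_bytes)
    if not matches:
        return None
    # Assumption: if the file has several trailers (incremental updates),
    # the last one in the file is the one that counts.
    obj_num, gen = matches[-1]
    return int(obj_num), int(gen)


trailer = (
    b"<</Size 1910/Root 1426 0 R/Encrypt 1908 0 R/Info 1421 0 R"
    b"/ID[<5081abdd2bdafc8f4748b307de4c8eff><29c91bdce6e8e5478905ab5d6890c7eb>]>>"
)
print(find_encrypt_ref(trailer))  # (1908, 0)
```

For a real file you would pass in the whole file read in binary mode, e.g. open(path, "rb").read().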


If all you want to know is whether the PDF file is encrypted, you can stop here: it is encrypted. If you want to know more about how it is encrypted, you'll need to find the encryption dictionary. Note the object number from the trailer dictionary (1908 in the case above) and search from the beginning of the file for "<object number> 0 obj" (i.e. "1908 0 obj").


1908 0 obj<</Filter/EBX_HANDLER/V 1/EBX_TITLE(Book Title)/>>
endobj


Of most interest in the encryption dictionary is the /Filter value, which tells you which security handler was used to encrypt the file. Common values are:


Standard - Password Security
EBX_HANDLER - Adobe Content Server (aka SecurePDF - most eBooks)
Adobe.PubSec - Certificate Security
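Looking up the encryption dictionary and reading its /Filter value can be sketched the same way. This is a rough illustration rather than a real PDF parser: it assumes the object sits uncompressed in the file and that the first /Filter entry inside it is the handler name. The function name security_handler is hypothetical.

```python
import re


def security_handler(pdf_bytes, obj_num, gen=0):
    """Return the /Filter name from object "<obj_num> <gen> obj", or None."""
    # The lookbehind stops "908 0 obj" from matching inside "1908 0 obj".
    header = re.compile(rb"(?<!\d)%d\s+%d\s+obj" % (obj_num, gen))
    starts = [m.end() for m in header.finditer(pdf_bytes)]
    if not starts:
        return None
    start = starts[-1]  # assumption: the last definition in the file wins
    end = pdf_bytes.find(b"endobj", start)
    body = pdf_bytes[start:end if end != -1 else None]
    # A PDF name runs until whitespace or a delimiter character.
    m = re.search(rb"/Filter\s*/([^\s/<>\[\]()]+)", body)
    return m.group(1).decode("latin-1") if m else None


sample = b"1908 0 obj<</Filter/EBX_HANDLER/V 1/EBX_TITLE(Book Title)/>>\nendobj\n"
print(security_handler(sample, 1908))  # EBX_HANDLER
```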


For more information about the PDF File format get the PDF File Reference from http://www.adobe.com/devnet/pdf

You can look at section 3.5 in particular, which contains the information about encryption in PDFs.

caterpillar
08-28-2009, 06:25 PM
Sorry for wrong thread!

CeramicWeasel
07-05-2010, 12:10 AM
Apologies for resurrecting this old topic, but I've found it extremely useful for some software development I've been doing and I had a question if anyone is able to answer it.

Is there a way to tell if a PDF document has been given an owner password but not a user password? I don't care what the passwords are, or about the content of the PDF, but I do need to know what *type* of password protections are on it.

It seems that encrypted PDFs get both an /O and /U entry in the /Filter section regardless of the type of password in use.

Jellby
07-05-2010, 06:10 AM
Maybe pdftk or pdfinfo can help.

Jim Lester
07-12-2010, 04:20 PM
If there is no user (/U) password, the pad string is used in its entirety. So if you generate the /U value using just the pad string (see algorithms 3.2 and 3.4 in the PDF spec, or 3.5 in place of 3.4 for revision 3 and later of the standard handler), and it is the same as the /U value in the encrypt dict, then there is no user password.

Note that the algorithms will be slightly different for AES-256 encrypted documents (which came along with Acrobat 9).
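The check described above can be sketched in Python as follows. Treat it as a hedged illustration, not a tested implementation against real files: it covers only revisions 2 to 4 of the Standard handler (not the AES-256 case), and it assumes you have already pulled /O, /U, /P, /R, /Length and the first /ID string out of the file as raw bytes and integers. All function and parameter names here are made up for the example.

```python
import hashlib
import struct

# The 32-byte padding string from the PDF Reference (Algorithm 3.2).
PAD = bytes.fromhex(
    "28bf4e5e4e758a4164004e56fffa0108"
    "2e2e00b6d0683e802f0ca9fe6453697a"
)


def rc4(key, data):
    """Plain RC4, written out so the sketch needs only the standard library."""
    s = list(range(256))
    j = 0
    for i in range(256):
        j = (j + s[i] + key[i % len(key)]) % 256
        s[i], s[j] = s[j], s[i]
    out = bytearray()
    i = j = 0
    for byte in data:
        i = (i + 1) % 256
        j = (j + s[i]) % 256
        s[i], s[j] = s[j], s[i]
        out.append(byte ^ s[(s[i] + s[j]) % 256])
    return bytes(out)


def empty_password_key(o, p, id0, revision, key_bits=40, encrypt_metadata=True):
    """Algorithm 3.2 for an empty password: the padded password is PAD itself."""
    md5 = hashlib.md5(PAD + o + struct.pack("<I", p & 0xFFFFFFFF) + id0)
    if revision >= 4 and not encrypt_metadata:
        md5.update(b"\xff\xff\xff\xff")
    digest = md5.digest()
    n = 5 if revision == 2 else key_bits // 8
    if revision >= 3:
        for _ in range(50):
            digest = hashlib.md5(digest[:n]).digest()
    return digest[:n]


def expected_u(key, id0, revision):
    """Algorithm 3.4 (revision 2) or 3.5 (revision 3 and later)."""
    if revision == 2:
        return rc4(key, PAD)
    data = rc4(key, hashlib.md5(PAD + id0).digest())
    for i in range(1, 20):
        data = rc4(bytes(b ^ i for b in key), data)
    return data  # 16 bytes; the rest of /U is arbitrary padding


def has_empty_user_password(o, u, p, id0, revision, key_bits=40,
                            encrypt_metadata=True):
    """True if /U matches what an empty user password would produce."""
    key = empty_password_key(o, p, id0, revision, key_bits, encrypt_metadata)
    expected = expected_u(key, id0, revision)
    return u[:len(expected)] == expected
```

If has_empty_user_password returns True the document opens without a user password, so any restrictions come from the owner password alone.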

joblack
07-12-2010, 05:37 PM
The ineptpdf scripts can also decrypt password-protected PDFs - no biggy

Jim Lester
07-16-2010, 11:37 AM
Yes, but that doesn't help much unless CeramicWeasel is programming in Python.

joblack
07-19-2010, 07:22 AM
Even if he programs in another programming language he can transfer the knowledge and routines from the Python scripts. He doesn't have to re-invent the wheel ^^.