Old 02-05-2009, 02:05 AM   #1
Jim Lester
Determining if a PDF File is encrypted - the hard way

Because daesdaemar asked me to....

At the simplest level, a PDF file is a collection of numbered objects. The objects themselves tend to be either streams of data (such as text runs, graphics, or fonts) or dictionaries (structures containing formatted data that describes the document).

If you look at a PDF file, you'll tend to see a lot of the following:
1 0 obj ...object data...
2 0 obj ...object data...
3 0 obj ...object data...
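
To make the object layout concrete, here is a minimal Python sketch that scans raw bytes for object headers of the form shown above. The function name and the sample data are my own illustration, not part of any PDF library. It is a plain pattern search rather than a real parser, so binary stream data could produce false matches, and objects packed inside compressed object streams will not show up.

```python
import re

# Matches an object header such as b"12 0 obj":
# object number, generation number, then the keyword "obj".
OBJ_HEADER = re.compile(rb"(\d+)\s+(\d+)\s+obj\b")

def list_object_headers(data: bytes) -> list[tuple[int, int]]:
    """Return (object number, generation number) pairs in file order."""
    return [(int(m.group(1)), int(m.group(2))) for m in OBJ_HEADER.finditer(data)]

# Hypothetical sample mirroring the listing above.
sample = (b"1 0 obj ...object data...\n"
          b"2 0 obj ...object data...\n"
          b"3 0 obj ...object data...\n")
print(list_object_headers(sample))  # [(1, 0), (2, 0), (3, 0)]
```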

The objects are organized by a cross-reference table, which lists the byte offset of each object in the file. It is followed by a trailer dictionary, which lists the object numbers of important objects. One of those objects is the encryption dictionary, which must exist if the document is encrypted.

So, open the PDF file in a good text editor (on the Mac I really like BareBones' TextWrangler) and search for the string "/Encrypt". If the file is encrypted, that should take you to the trailer dictionary, which will look something like

Code:
<</Size 1910/Root 1426 0 R/Encrypt 1908 0 R/Info 1421 0 R/ID[<5081abdd2bdafc8f4748b307de4c8eff><29c91bdce6e8e5478905ab5d6890c7eb>]>>
If all you want to know is whether the PDF file is encrypted, you can stop here - it is encrypted. If you want to know more about how it is encrypted, you'll need to find the "Encryption Dictionary". Note the object number from the trailer dictionary (1908 in the case above), and search for that object by starting at the beginning of the file and looking for "<object number> 0 obj" (i.e. "1908 0 obj").
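
If you'd rather not do the search by hand, the same check can be sketched in a few lines of Python. This assumes the /Encrypt entry is written as an indirect reference ("/Encrypt N G R"), as in the trailer above, and it simply takes the first match in the file. The function name is made up for illustration.

```python
import re

# An indirect reference following the /Encrypt key, e.g. b"/Encrypt 1908 0 R".
ENCRYPT_REF = re.compile(rb"/Encrypt\s+(\d+)\s+(\d+)\s+R")

def find_encrypt_ref(data: bytes):
    """Return (object number, generation) of the encryption dictionary, or None."""
    m = ENCRYPT_REF.search(data)
    return (int(m.group(1)), int(m.group(2))) if m else None

# The trailer dictionary from the example above.
trailer = (b"<</Size 1910/Root 1426 0 R/Encrypt 1908 0 R/Info 1421 0 R"
           b"/ID[<5081abdd2bdafc8f4748b307de4c8eff><29c91bdce6e8e5478905ab5d6890c7eb>]>>")
print(find_encrypt_ref(trailer))  # (1908, 0)
```

For a real file you would read the bytes in binary mode (open(path, "rb")) and pass them in. A result of None only means this simple pattern found no reference, so treat it as a hint rather than proof that the file is unencrypted.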

Code:
1908 0 obj<</Filter/EBX_HANDLER/V 1/EBX_TITLE(Book Title)/>>
endobj
Of most interest in the encryption dictionary is the /Filter value, which tells you which security handler was used to encrypt the file. Common values are:
  • Standard - Password Security
  • EBX_HANDLER - Adobe Content Server (aka SecurePDF - most eBooks)
  • Adobe.PubSec - Certificate Security
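
The lookup of the encryption dictionary and its /Filter value can be sketched the same way. This is only an illustration under a few assumptions: the object is stored as plain text in the file, the first matching header is the one you want, and the helper name and handler table are my own. As far as I know the PDF Reference spells the certificate handler Adobe.PubSec, so the table accepts both that and the short form.

```python
import re

# Descriptions for the common handlers; anything else is reported as unknown.
HANDLERS = {
    b"Standard": "Password Security",
    b"EBX_HANDLER": "Adobe Content Server",
    b"Adobe.PubSec": "Certificate Security",
    b"PubSec": "Certificate Security",  # short form, kept as a fallback
}

def encryption_filter(data: bytes, obj_num: int, gen: int = 0):
    """Return the /Filter name found in object 'obj_num gen obj', or None."""
    # (?<!\d) keeps "908 0 obj" from matching inside "1908 0 obj".
    header = re.compile(rb"(?<!\d)%d\s+%d\s+obj" % (obj_num, gen))
    start = header.search(data)
    if start is None:
        return None
    end = data.find(b"endobj", start.end())
    body = data[start.end(): end if end != -1 else len(data)]
    # A PDF name runs until whitespace or a delimiter character.
    m = re.search(rb"/Filter\s*/([^\s/<>\[\]()]+)", body)
    return m.group(1) if m else None

# The encryption dictionary from the example above.
sample = b"1908 0 obj<</Filter/EBX_HANDLER/V 1/EBX_TITLE(Book Title)/>>\nendobj\n"
name = encryption_filter(sample, 1908)
print(name, "-", HANDLERS.get(name, "unknown handler"))
```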

For more information about the PDF file format, get the PDF Reference from http://www.adobe.com/devnet/pdf

Section 3.5 in particular covers encryption in PDFs.

Last edited by Jim Lester; 02-05-2009 at 02:18 AM.