Old 11-06-2015, 01:08 PM   #6
PHC
Member
Posts: 21
Karma: 15000
Join Date: Feb 2014
Device: iPhone, iPad, Macbook Pro, Mac Pro
Quote:
Originally Posted by willus
You might take a look at this post in the Briss thread, which talks about how to use Ghostscript to permanently remove cropped areas from a PDF.
I'll tell you, using GS for this purpose can be a nightmare. You have to fiddle with a lot of parameters to get a high-quality result, especially for images, and even then it will not be lossless.

A far simpler procedure, one that gives a lossless result, is to split the document into pieces small enough for Acrobat to handle without hanging. I have had it hang when OCRing large documents, even though I use an 8-core Xeon Mac Pro with 16GB of memory. Adobe really doesn't seem to understand resource management. For OCR, I simply process 100 pages at a time and save incrementally, which works fine. Unfortunately, there is no way to choose a page range when removing hidden information: it's all or nothing. So you have to split the PDF into manageably sized documents first.

The best tool I have found for this is cpdf. It is free, multiplatform, and works very well.

Assuming you have successfully cropped the entire document and saved it, here is how to split it into, say, 100-page documents:

Code:
cpdf -split "My cropped PDF.pdf" 1 -chunk 100 -o "@F %%%.pdf"
This splits it into files of at most 100 pages each, named 'My cropped PDF 001.pdf', 'My cropped PDF 002.pdf', …
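
If you want to check that the split produced everything it should, a small sketch like the one below lists the file names to expect. The helper name expected_chunks and the 250-page count are made up for illustration; plug in your own page count and -chunk value.

```shell
# Hypothetical helper: how many chunk files should a split produce?
expected_chunks() {
  pages=$1
  chunk=$2
  # Ceiling division: a final partial chunk still gets its own file.
  echo $(( (pages + chunk - 1) / chunk ))
}

# List the names to expect for an assumed 250-page document split by 100.
n=$(expected_chunks 250 100)
i=1
while [ "$i" -le "$n" ]; do
  printf 'My cropped PDF %03d.pdf\n' "$i"
  i=$((i + 1))
done
```

With those example numbers the last file holds only the 50 leftover pages.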

Now you can open each of them in Acrobat and remove the hidden information. After saving all the files, you can merge them:

Code:
cpdf -merge "My cropped PDF 001.pdf" "My cropped PDF 002.pdf" "My cropped PDF 003.pdf" -o "My cropped PDF clean.pdf"
'My cropped PDF clean.pdf' contains the whole document, cropped, with hidden information removed.
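
With more than a handful of pieces, typing every file name gets tedious. Here is a rough sketch that lets the shell expand the names instead. The function name merge_chunks and the CPDF variable are my own inventions, not part of cpdf, and it assumes the pieces keep the three-digit names produced by the split and that you saved the cleaned files under those same names.

```shell
# Hypothetical helper: merge every numbered chunk of a given base name.
merge_chunks() {
  base=$1   # e.g. "My cropped PDF"
  # The glob expands in sorted order, so 001, 002, 003, ... stay in sequence.
  # It only matches three-digit suffixes, so "<base> clean.pdf" is never
  # picked up as an input on a second run.
  # ${CPDF:-cpdf} lets you substitute another command (such as echo) to preview.
  ${CPDF:-cpdf} -merge "$base "[0-9][0-9][0-9].pdf -o "$base clean.pdf"
}

# Usage:
#   merge_chunks "My cropped PDF"             # runs cpdf
#   CPDF=echo merge_chunks "My cropped PDF"   # only prints the arguments
```

Run it from the folder that holds the pieces. The echo preview is worth doing once, to confirm the files come out in the right order before cpdf writes anything.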

You can change the argument to -chunk to whatever number of pages Acrobat can process and save without hanging. You will have to experiment.