Old 09-23-2008, 11:21 AM   #6
ashkulz
sjvr767: I was going to work on this very idea, but you beat me to it.

First, there are two very good projects which already implement this: pdfcrop and pdfcrop.pl (the latter has a very good fork at pdfcrop2). All of them share the same disadvantage: they detect the bounding box using Ghostscript (which is very good and accurate), but they don't update the PDF in place. Instead, they re-create the PDF using pdftex or other software.

I'd already done a proof of concept showing that this works using pyPdf (I've contributed to it in the past), but other projects (notably ebookutils) took up my time. Would you be interested in taking it further using gs? The command line to generate a bbox is:
Code:
gs -dBATCH -dSAFER -dNOPAUSE -dUseCropBox -sDEVICE=bbox <input.pdf>
You can capture the output using the subprocess module (note that the bbox device writes its results to stderr) and then use it for setting the cropbox.
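
A rough sketch of the capture step, in case it helps. It assumes gs is on the PATH and that the bbox device prints one %%HiResBoundingBox line per page; the names parse_bboxes and detect_bboxes are made up for illustration and are not part of pyPdf or Ghostscript.

```python
import re
import subprocess

# Matches lines such as "%%HiResBoundingBox: 12.5 20.0 580.1 800.7".
HIRES_BBOX = re.compile(
    r"^%%HiResBoundingBox:\s+(\S+)\s+(\S+)\s+(\S+)\s+(\S+)", re.MULTILINE
)


def parse_bboxes(gs_output):
    """Return one (llx, lly, urx, ury) tuple of floats per reported page."""
    return [
        tuple(float(value) for value in match.groups())
        for match in HIRES_BBOX.finditer(gs_output)
    ]


def detect_bboxes(pdf_path, gs="gs"):
    """Run the bbox device on pdf_path (hypothetical helper) and parse its output."""
    cmd = [
        gs, "-dBATCH", "-dSAFER", "-dNOPAUSE", "-dUseCropBox",
        "-sDEVICE=bbox", pdf_path,
    ]
    # The bbox device reports on stderr, so stderr is merged into stdout here.
    proc = subprocess.run(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        text=True,
        check=True,
    )
    return parse_bboxes(proc.stdout)
```

Calling detect_bboxes("input.pdf") should then give a list with one tuple per page, which could be written to each page's crop box with a PDF library such as pyPdf.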

EDIT: I just saw that you posted this much earlier. My apologies.