PDF Trimmer


MichalMoskal
06-21-2009, 04:00 PM
Hi, I didn't do my homework, and instead of looking at this forum first, I implemented my own program for cropping PDFs. I'm not sure if any of the other programs can do what I needed.

New in version r3246: The margins are now detected automatically, and the file is split at the blank space between lines of text. So typically you just need to run it on a PDF file without any options.
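
To give a rough idea of what automatic margin detection can look like, here is a minimal sketch. It is not the code from Main.java: the class name MarginSketch, the method contentBox, and the assumption that the page has already been rendered to a grid of "has ink" flags are all made up for illustration.

```java
import java.util.Arrays;

/** Hypothetical sketch: find the bounding box of the non-blank area of a rendered page. */
class MarginSketch {
    /**
     * @param ink ink[y][x] is true where the rendered page has a non-white pixel
     *            (how the page gets rendered is assumed, not shown)
     * @return {left, top, right, bottom} with right and bottom exclusive,
     *         or null if the page is completely blank
     */
    static int[] contentBox(boolean[][] ink) {
        int left = Integer.MAX_VALUE, top = Integer.MAX_VALUE;
        int right = -1, bottom = -1;
        for (int y = 0; y < ink.length; y++) {
            for (int x = 0; x < ink[y].length; x++) {
                if (ink[y][x]) {
                    left = Math.min(left, x);
                    top = Math.min(top, y);
                    right = Math.max(right, x);
                    bottom = Math.max(bottom, y);
                }
            }
        }
        if (right < 0) {
            return null; // no ink anywhere, nothing to trim to
        }
        return new int[] { left, top, right + 1, bottom + 1 };
    }

    public static void main(String[] args) {
        boolean[][] page = new boolean[6][8]; // 8 columns wide, 6 rows tall
        page[2][3] = true;
        page[4][5] = true;
        System.out.println(Arrays.toString(contentBox(page))); // prints [3, 2, 6, 5]
    }
}
```

Everything outside the returned box would be margin. A real tool would probably also want some tolerance for specks and page numbers, which this sketch ignores.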

It rotates the input, so you can read while holding the device in landscape, even though the device thinks it is in portrait. This is useful with the Sony Reader, which in landscape mode insists on having two virtual pages per physical page.

It is implemented in pure Java, so it should work on any system, and it comes with source code.

You can get the program at: http://nemerle.org/~malekith/pdftrim/

Example input: http://nemerle.org/~malekith/pdftrim/rocket.pdf
And output: http://nemerle.org/~malekith/pdftrim/chunked.pdf

From README:


Pdftrim is a hack to overcome problems with the Sony PDF reading software in
the PRS505 e-book reader (in particular, when reading in landscape mode, the
PRS505 might miss some text in the middle of the document, as landscape mode
insists on exactly two virtual pages per page of the document).

Pdftrim will trim margins of a PDF file, rotate it 90 degrees, and split into
chunks that will fit into an ebook reader screen. It will try to detect where
there is empty space in the text, so it is safe to split.
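
The "safe to split" idea can be sketched roughly as below. This is an illustration only, not the actual Main.java logic: the class SplitSketch, the method splitPoints, and the per-row "blank" flags as input are assumptions. The sketch greedily cuts at the lowest blank row that still fits into one screen, and falls back to a hard cut when no blank row is available.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: pick cut positions that fall on blank rows of a page. */
class SplitSketch {
    /**
     * @param blank    blank[y] is true if row y of the rendered page has no ink
     * @param maxChunk maximum number of rows that fit on one screen
     * @return row indices at which a new chunk starts (excluding row 0)
     */
    static List<Integer> splitPoints(boolean[] blank, int maxChunk) {
        if (maxChunk <= 0) {
            throw new IllegalArgumentException("maxChunk must be positive");
        }
        List<Integer> cuts = new ArrayList<>();
        int start = 0;
        while (blank.length - start > maxChunk) {
            int cut = start + maxChunk; // fallback: hard cut through the text
            // Prefer the lowest blank row that still keeps the chunk within maxChunk.
            for (int y = start + maxChunk; y > start; y--) {
                if (blank[y]) {
                    cut = y;
                    break;
                }
            }
            cuts.add(cut);
            start = cut;
        }
        return cuts;
    }

    public static void main(String[] args) {
        boolean[] blank = new boolean[12];
        blank[3] = true;
        blank[4] = true;
        blank[9] = true;
        System.out.println(splitPoints(blank, 5)); // prints [4, 9]
    }
}
```

With 12 rows and a screen height of 5, the cuts land on the blank rows 4 and 9 instead of slicing through a line of text at rows 5 and 10.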

Pdftrim was motivated by a need to read scientific papers, which usually
do not withstand reflowing very well.

License: public domain for the source (Main.java)

Author: Michal Moskal <michal.moskal@gmail.com>

iText ( http://www.lowagie.com/iText/ ) comes under the MPL.
PDF Renderer ( https://pdf-renderer.dev.java.net/ ) comes under the LGPL.
jopt-simple ( http://jopt-simple.sourceforge.net/ ) comes under the MIT license.

Usage:

Run pdftrim.sh or pdftrim.bat without arguments for command-line help.

Typical usage:

pdftrim.sh file.pdf

If you want to specify the dimensions by hand (for example to cut a header or
a footer), use:

pdftrim.sh file.pdf -l 64 -r 74 -t 60 -b 60 -f

Inspect the trimmed.pdf file and make sure the red box is where you want the
document to be trimmed; modify -l, -r, -t and -b if it's not. For LNCS-formatted
papers use "-s lncs" or "-s lncs2".

If even and odd pages have different margins, use the -even or -odd options to
set shifts, e.g.:

pdftrim.sh file.pdf -l 64 -r 74 -t 60 -b 60 -odd 10 -f

Once you're done, remove the -f option and possibly add an author and title:

pdftrim.sh file.pdf -l 64 -r 74 -t 60 -b 60 -odd 10 -Author "Joe SixPack" -Title "Budvar"

The default output size is for the Sony PRS505; it can be changed with -w/-h.

Enjoy!
Michal

frabjous
06-22-2009, 10:34 AM
Great. I'll have to try this out. Have you compared your source code with that of SoPDF and similar?

MichalMoskal
06-23-2009, 02:50 AM
> Great. I'll have to try this out. Have you compared your source code with that of SoPDF and similar?

Not really. Before doing this, I did a general Google search, and somehow assumed that pdflatex's pdfcrop and the pdftrim (or whatever) from calibre were my only options.

There isn't much source to talk about anyhow (250 lines); it's a 4h hack :-) But it does exactly what I wanted.

If I get some free time, I might implement detection of margins and transferring of bookmarks (the PDF document outline), but right now I do not have much incentive; the predefined lncs and lncs2 modes work fine for me.

Michal