Here's a little Bash script that converts all PDFs in the current folder to PGMs with pdftoppm, runs this algorithm on the PGMs, adds a small white border around the text (I like to have a small border), crops every image into three overlapping images (it's not exactly what I had in mind in the above post, but it'll work well enough in most cases), and finally converts the result back to a new PDF.
Requirements: pi, pdftoppm (part of Poppler and/or Xpdf), ImageMagick (v6.3.2 or newer is needed for the -extent option to work properly), libtiff
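If you want to check that everything is installed before running the script, something like the following should do. It is only a sketch: the helper name missing_tools is made up for this example, and the command names are simply the ones the script calls (convert for ImageMagick, tiffcp and tiff2pdf for libtiff, and pi under whatever name you installed it).

```shell
#!/bin/bash
# Print the name of every command from the argument list that is not on PATH.
# (missing_tools is a hypothetical helper, not part of any of the tools above.)
missing_tools() {
    local tool
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Command names as used by the script; adjust "pi" if yours is named differently.
missing=$(missing_tools pi pdftoppm convert tiffcp tiff2pdf)
if [ -n "$missing" ]; then
    # $missing is left unquoted on purpose so the names end up on one line.
    echo "Missing tools:" $missing >&2
fi
```

This only tells you whether the commands exist; it does not check the ImageMagick version.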
Code:
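#!/bin/bash
# Crop every PDF in the current folder and split each page into three overlapping strips.
set -e
for pdf in *.pdf; do
    # If nothing matches, the pattern stays literal ("*.pdf") and this test fails.
    if [ -f "$pdf" ]
    then
        echo "Converting file \"$pdf\". Please wait ..."
        PDFName="$(basename "$pdf" .pdf)"
        mkdir "Temp-$PDFName"
        cd "Temp-$PDFName"
        # Render every page as a 180 dpi greyscale PGM.
        pdftoppm -r 180 -gray "../$pdf" "$PDFName"
        # Run the cropping algorithm on each page.
        for page in *.pgm; do
            pi "$page" "New-$page"
            rm "$page"
        done
        # Add a small border around the text and convert to TIFF.
        for page in *.pgm; do
            convert "$page" +compress -gravity Center -extent "106%x101%" -gravity East -extent "104%x100%" "$(basename "$page" .pgm).tif"
            rm "$page"
        done
        # Cut each page into three strips of 34% height (top, middle, bottom) so they overlap slightly.
        for page in *.tif; do
            convert "$page" +compress -gravity North -crop "100%x34%" +repage -depth 8 "$(basename "$page" .tif)-1.tif"
            convert "$page" +compress -gravity Center -crop "100%x34%" +repage -depth 8 "$(basename "$page" .tif)-2.tif"
            convert "$page" +compress -gravity South -crop "100%x34%" +repage -depth 8 "$(basename "$page" .tif)-3.tif"
            rm "$page"
        done
        # Join the strips into one multi-page TIFF, then turn that into a PDF.
        tiffcp *.tif "New-$PDFName.tif"
        # Options come before the input file so that tiff2pdf also parses them without GNU getopt.
        tiff2pdf -z -o "New-$PDFName.pdf" -t "$PDFName" "New-$PDFName.tif"
        rm *.tif
        mv "New-$PDFName.pdf" ../
        cd ..
        rmdir "Temp-$PDFName"
    else
        echo "ERROR: No PDF files found"
        exit 1
    fi
done
echo "Done."
exit 0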
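To see why three 34% crops overlap, here is a rough sketch of the arithmetic in plain Bash. The function name strip_rows is made up for this example, and it uses simple integer division, so ImageMagick's own rounding may differ by a pixel.

```shell
#!/bin/bash
# Print the first and last pixel row (0-based) of the North, Center and South
# strips for a page that is $1 pixels tall, each strip being $2 percent of it.
# (strip_rows is a hypothetical helper for illustration only.)
strip_rows() {
    local height=$1 percent=$2
    local strip=$(( height * percent / 100 ))    # strip height in pixels
    local center_top=$(( (height - strip) / 2 )) # Center strip is centered vertically
    local south_top=$(( height - strip ))        # South strip ends at the last row
    echo "North 0 $(( strip - 1 ))"
    echo "Center $center_top $(( center_top + strip - 1 ))"
    echo "South $south_top $(( height - 1 ))"
}

strip_rows 1000 34
# North 0 339
# Center 330 669
# South 660 999
```

So on a page 1000 pixels tall, neighbouring strips share 10 rows (about 1% of the page), which is why a line of text that falls on a cut usually still appears whole in one of the two strips.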