MobileRead Forums - View Single Post

tuxor · 03-05-2012, 05:21 PM

Okay, since the way I did it seems to work, I will also contribute the small bash script that I wrote to get the PNG-PDF version:

Code:

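#!/bin/bash
# Usage: pass the path to the input PDF as the first argument.
for i in {1..416}   # 416 = number of pages in this particular PDF
do
   j=$(printf '%03d' "$i")                          # zero-pad so the final glob sorts in page order
   pdfimages -j -f "$i" -l "$i" "$1" __tmpfile      # extract the images of page $i only
   rm -f __tmpfile*.ppm                             # discard any colour images
   convert -negate __tmpfile*.pbm "__tmpimg$j.png"  # invert the 1-bit bitmap and save it as PNG
   rm -f __tmpfile*.pbm
   convert "__tmpimg$j.png" "__tmpimg$j.pdf"        # wrap the PNG in a one-page PDF
   rm -f __tmpimg*.png
done
pdftk __tmpimg*.pdf cat output output.pdf           # join all one-page PDFs
rm -f __tmpimg*.pdf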

This script needs the path to the input PDF as its argument and will write to "output.pdf" in the working directory. The final PDF will be approx. 54 MB, and the procedure will take a really long time and use a lot of CPU power. The same script probably won't work with most other PDFs, but there's a good chance it will work with some of the PDFs on archive.org that stem from the same OCR software.
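If you want to try it on another PDF, the hard-coded 416 is the first thing to change. A minimal sketch of how the page count could be read from pdfinfo instead (pages_from_pdfinfo is just a helper name I made up, and this assumes pdfinfo from poppler-utils is installed, which is not something the script above needs):

```shell
#!/bin/bash
# Sketch only: pages_from_pdfinfo is a hypothetical helper, not part of the script above.
# It reads pdfinfo output on stdin and prints the number from the "Pages:" line.
pages_from_pdfinfo() {
    awk '/^Pages:/ { print $2; exit }'
}

# Intended use inside the script (assumes pdfinfo is available):
#   n=$(pdfinfo "$1" | pages_from_pdfinfo)
#   for ((i = 1; i <= n; i++)); do ...; done
```

Note that {1..$n} would not work here, because bash does brace expansion before it substitutes variables, hence the C-style for loop.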

Unfortunately, if you are on Windows, this script won't run as-is, since it needs bash and the command-line tools used above. But I uploaded the whole converted file and will send the link via PM on request.
