05-08-2017, 07:11 PM   #1
jgray
Converting a PDF to B&W

I recently downloaded a scanned book where all the pages were brown from age. Why this particular book was scanned in color, I don't know. It had no images and no color, other than the browned pages.

I did some experimenting with ImageMagick and here are my results. First, you need ImageMagick and Ghostscript installed (both are open source). Open a command prompt and, from the ImageMagick folder, run the "convert.exe" program.

To convert the PDF to a B&W TIFF, use this command:

convert -density 288 book.pdf -threshold 50% -type bilevel -despeckle -resample 96 book.tif

You can play with the "density" value. The "resample" parameter downsamples the final TIFF to 96 DPI, which is good for on-screen viewing. You can omit it, if you like. When resampling, try to keep the density number a whole multiple of the resample number; the command above uses three times (288 vs. 96).
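
If you want to try other values, the relationship between the two numbers can be sketched roughly as below. This is a POSIX shell sketch, not anything built into ImageMagick: bw_tiff_cmd, target_dpi and factor are names made up for illustration, and the function only prints the command so you can check it before pasting it into the command prompt.

```shell
#!/bin/sh
# Sketch: derive the render density from the target DPI and an oversampling factor.
# target_dpi and factor are illustrative names, not ImageMagick options.
bw_tiff_cmd() {
    target_dpi=$1
    factor=$2
    density=$((target_dpi * factor))
    # Dry run: print the command instead of executing it.
    echo "convert -density $density book.pdf -threshold 50% -type bilevel -despeckle -resample $target_dpi book.tif"
}

# Print the command for a 96 DPI result rendered at three times that density.
bw_tiff_cmd 96 3
```

With 96 and 3 this prints the same command as above, with a density of 288.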

To recreate a PDF from the TIFF, using JPEG compression:

convert book.tif -compress jpeg book-bw.pdf

Note that the output PDF name is different from the original, to prevent overwriting.

You can optionally add a "-quality nn" parameter to adjust the JPEG compression.
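
As a rough way to compare settings, something like the following prints one command per quality value (JPEG quality runs from 1 to 100, lower meaning smaller files and more artifacts). The quality_cmds function and the book-bw-qNN.pdf output names are hypothetical, picked only so that no run overwrites another; it is a dry run in a POSIX shell, so nothing is converted until you paste a printed command.

```shell
#!/bin/sh
# Dry run: print one convert command per JPEG quality value so the resulting
# file sizes can be compared afterwards.
# The book-bw-qNN.pdf names are hypothetical, not from the original commands.
quality_cmds() {
    for q in "$@"; do
        echo "convert book.tif -compress jpeg -quality $q book-bw-q$q.pdf"
    done
}

# Three example values; adjust to taste.
quality_cmds 50 70 90
```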

I had pretty decent results with this. One peculiarity, however: the original color PDF is half the size of my final B&W PDF, using the commands above. I don't know what compression was used in the original.
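
One possible explanation for the size difference, offered as a guess and not verified on this book: JPEG is designed for continuous-tone images and tends to handle pure black-and-white pages poorly. ImageMagick also accepts "-compress Group4", a lossless fax-style compression intended for bilevel images, so it may be worth trying. The sketch below assumes a POSIX shell and the book.tif from the earlier step; book-bw-g4.pdf and report_sizes are made-up names.

```shell
#!/bin/sh
# Assumption: book.tif is the bilevel TIFF produced earlier;
# book-bw-g4.pdf is a hypothetical output name.
# Only convert when ImageMagick's convert is on the PATH and the input exists.
if command -v convert >/dev/null 2>&1 && [ -f book.tif ]; then
    convert book.tif -compress Group4 book-bw-g4.pdf
fi

# Print "name size-in-bytes" for each listed file that exists.
report_sizes() {
    for f in "$@"; do
        if [ -f "$f" ]; then
            printf '%s %s\n' "$f" "$(wc -c < "$f" | tr -d ' ')"
        fi
    done
}

# Compare the original, the JPEG version, and the Group4 version.
report_sizes book.pdf book-bw.pdf book-bw-g4.pdf
```

Files that do not exist are skipped, so the last line is safe to run at any stage.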

Note that since you are converting the original PDF to TIFF images, you will lose all OCR'ed text. I OCR'ed it again and all was fine.