Old 09-04-2020, 11:57 PM   #5
MarjaE
A Mac-specific implementation, optimizing for the Kindle Dx. It works in Mojave. I'm not sure if it will work in Catalina due to Apple's ongoing cuts to Automator:

1. Install BenWiggy's PDFsuite, pypy, pyobjc for python 2, ghostscript, k2pdfopt, cpdf, and qpdf.

2. Open Automator and create a new App.

3. Add the Run Shell Script action 7 times, each using Bash and passing input as arguments. Splitting the work into 7 shell scripts helps make sure the Mac finishes each step before starting the next. You'll need to substitute your own locations for the k2pdfopt app, for some of the other tools, and for your Splice folder. I don't think the export code above will be suitable with so many short scripts.

for f in "$@"
do
# Apply the Lightness Increase Quartz filter to the source pdf and save the result as ~/Splice/Light.pdf
/usr/local/bin/python /Users/Marja/Library/Services/quartzfilter.py "$f" "/System/Library/Filters/Lightness Increase.qfilter" "/Users/Marja/Splice/Light.pdf"
done

for f in "$@"
do
# Copy and Rasterize 1st page from source pdf using k2pdfopt
~/Applications/k2pdfopt -ui -mode copy -dev dx -p 1 -x -o "/Users/Marja/Splice/DxCover_dx.pdf" "/Users/Marja/Splice/Light.pdf" "$@"
done

for f in "$@"
do
# Copy images from same source pdf file using Ghostscript, rasterize images using K2pdfopt
# - and -_ indicate standard output and input
# Due to compatibility issues, dumping to ~/Splice/Images.pdf
/usr/local/bin/gs -sDEVICE=pdfimage24 -dFILTERTEXT -dCompatibilityLevel=1.4 \
-g800x1080 -r150 -dPDFFitPage \
-sstdout=%stderr -dNOPAUSE -dQUIET -dBATCH -sOutputFile="/Users/Marja/Splice/Images.pdf" "/Users/Marja/Splice/Light.pdf"
done

for f in "$@"
do
# Copy text from source pdf file using Ghostscript, turn text black using Cpdf
# The color conversion strategy should help with the 2nd stage if I switch to Ghostscript
# Due to compatibility issues, dumping to ~/Splice/Text.pdf
/usr/local/bin/gs -sDEVICE=pdfwrite -dFILTERIMAGE -dFILTERVECTOR -dCompatibilityLevel=1.4 -sColorConversionStrategy=RGB -sstdout=%stderr -dNOPAUSE -dQUIET -dBATCH -sOutputFile="/Users/Marja/Splice/Text.pdf" "/Users/Marja/Splice/Light.pdf"
done

for f in "$@"
do
# Rasterize the image-only pdf from the 3rd script using K2pdfopt
# Reads ~/Splice/Images.pdf, writes ~/Splice/DxImages_dx.pdf
~/Applications/k2pdfopt -ui -mode copy -dev dx -x -o "/Users/Marja/Splice/DxImages_dx.pdf" "/Users/Marja/Splice/Images.pdf" "$@"
done

for f in "$@"
do
# Turn the text from the 4th script black using Cpdf
# Reads ~/Splice/Text.pdf, writes ~/Splice/Blacktext.pdf
/usr/local/bin/cpdf "/Users/Marja/Splice/Text.pdf" -blacktext -o "/Users/Marja/Splice/Blacktext.pdf"
done

for f in "$@"
do
# Splice files using qpdf and date so new runs won't overwrite old ones
/usr/local/bin/qpdf --collate "/Users/Marja/Splice/DxCover_dx.pdf" --pages "/Users/Marja/Splice/DxCover_dx.pdf" "/Users/Marja/Splice/DxImages_dx.pdf" "/Users/Marja/Splice/Blacktext.pdf" -- /Users/Marja/Splice/"SplicedDx$(date "+%Y.%m.%d-%H.%M.%S").pdf"
done
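Since every script depends on tools at fixed paths, a quick check before the first run might save some head-scratching. This is only a sketch: missing_tools is a made-up helper name, and the paths are just the ones used in the scripts above, so substitute your own.

```shell
#!/bin/bash
# Hypothetical helper (not part of any of the tools above):
# print every path in the argument list that is not an executable file.
missing_tools() {
    local tool
    for tool in "$@"
    do
        if [ ! -f "$tool" ] || [ ! -x "$tool" ]
        then
            printf '%s\n' "$tool"
        fi
    done
}

# Paths as used in the scripts above; substitute your own locations.
missing_tools /usr/local/bin/python /usr/local/bin/gs /usr/local/bin/cpdf \
    /usr/local/bin/qpdf ~/Applications/k2pdfopt
```

If this prints nothing, all five were found. Anything it does print is a path to fix before running the Automator app.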

The 3rd shell script can take a long while.

I've experimented with the PDFsuite 150 and 300 dpi filters, but depending on the source pdfs these often crash due to memory pressure. Even this version will occasionally crash.
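Because a crash in one step can leave the next step with nothing to read, a small guard at the top of the later scripts might make failures easier to spot. A rough sketch, with require_pdf as a made-up helper name; it only checks that the file exists and is not empty, so it would not catch a stale file left over from an earlier run.

```shell
#!/bin/bash
# Hypothetical helper: succeed only if the file exists and is not empty.
require_pdf() {
    if [ ! -s "$1" ]
    then
        echo "Missing or empty: $1" >&2
        return 1
    fi
}

# Example use at the top of the 6th script, before calling cpdf:
# require_pdf "/Users/Marja/Splice/Text.pdf" || exit 1
```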

I've not been able to keep the original filename as an element in the final one.
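One possible way to do it, untested inside Automator, is to build the output name from basename. spliced_name is a made-up helper, and this assumes the original pdf path is still what arrives in "$@" at the 7th script. Automator hands each action the previous action's output, so that may not hold without changes to the earlier scripts.

```shell
#!/bin/bash
# Hypothetical helper: build an output name that keeps the source pdf's name.
# $1 = path to the source pdf, $2 = timestamp string
spliced_name() {
    local base
    base=$(basename "$1")   # drop the folder part
    base=${base%.*}         # drop the last extension, if any
    printf '%s-SplicedDx%s.pdf\n' "$base" "$2"
}

# Example with a made-up path and a fixed timestamp:
spliced_name "/Users/Marja/Documents/My Book.pdf" "2020.09.04-23.57.00"
# prints: My Book-SplicedDx2020.09.04-23.57.00.pdf
```

In the 7th script the result would go after /Users/Marja/Splice/ in place of the current "SplicedDx$(date ...).pdf" part, with "$f" as the first argument and the same date command as the second.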