Old 03-13-2018, 11:15 AM   #1
MarjaE
Mac to Kindle 2 and Other Older Readers

Hi,

I'm still working out how to handle PDFs, but I've been through a lot of trial and error and I'd like to share what I've found.

First: Keep your originals. If you don't have enough disk space, I'd suggest storing some on an external drive, and setting Time Machine to back up the external drive as well as the main drive.

Second: Many PDFs encode images as JPEG 2000. It takes less space than JPEG, but some Macs take longer to load pages from these PDFs, and Kindle 2s and other older readers can't load the images at all. It's particularly bad with scanned PDFs, where the older readers can't load anything. To identify files with JPEG 2000 and other JPX images, I use EasyFind and search file contents for "jpxdecode" (the PDF filter name is JPXDecode).
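If you'd rather do the same search from the terminal, here is a minimal sketch using grep. The function names has_jpx and list_jpx are my own, and this is only a heuristic: it looks for the filter name in the raw bytes of the file, the same way the EasyFind search does.

```shell
# has_jpx FILE: succeed if FILE mentions the JPXDecode filter anywhere in its bytes.
has_jpx() {
    # LC_ALL=C makes grep treat the PDF as raw bytes; -a forces text mode;
    # -q prints nothing and only sets the exit status.
    LC_ALL=C grep -a -q 'JPXDecode' "$1"
}

# list_jpx DIR: print every .pdf under DIR that appears to contain JPEG 2000 images.
list_jpx() {
    find "$1" -type f -iname '*.pdf' | while IFS= read -r pdf; do
        if has_jpx "$pdf"; then
            printf '%s\n' "$pdf"
        fi
    done
}
```

For example, list_jpx ~/Documents would print the PDFs in that folder tree that probably need converting.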

Apple changed its Quartz decoder in Sierra, so it's much more reluctant to convert JPEG 2000 images in PDF files to JPEG. You'll need other tools for that conversion.

My suggestions:

-- Willus's k2pdfopt-- http://www.willus.com/k2pdfopt/

-- Homebrew-- https://brew.sh/ unless you use MacPorts instead.

-- Ghostscript-- can be installed through Homebrew

-- rwts-pdfwriter-- https://github.com/rodyager/RWTS-PDFwriter

-- cpdf-- can be installed through Homebrew

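-- ocrmypdf-- can be installed through Homebrew-- you may need to brew uninstall tesseract and brew install --all-languages tesseract. (Newer Homebrew versions dropped the --all-languages option; there the extra languages come from the separate tesseract-lang formula.)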

-- Automator-- comes with your computer and can help avoid typing and retyping terminal commands.
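To see which of the command-line tools are already installed, something like the following sketch should work. The function name missing_tools is my own invention, and it only checks whether a command is on the PATH, not which version it is.

```shell
# missing_tools NAME...: print each NAME that is not found on the PATH, one per line.
missing_tools() {
    for tool in "$@"; do
        # command -v succeeds only if the shell can find the command.
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}
```

Running missing_tools k2pdfopt gs cpdf ocrmypdf tesseract prints nothing if everything is in place.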

My workflow, more or less:

First, do I need to OCR the text? That's often the case with scanned texts, and occasionally with other texts due to text encoding errors.

If I need to OCR the text, then I use either ocrmypdf or Elucidate. For whatever reason, the resulting files don't play well with Ghostscript, so I need to run k2pdfopt on them afterwards.

If I don't need to OCR the text, then is it raster or vector? Is any text pixelated?

If it's raster, and I don't mind more pixelation, losing colors, or having fold-out pages reset to the same size as the other pages, then I can use k2pdfopt with decent compression.

If it's raster, and I do mind, I can use either k2pdfopt without compression or Ghostscript converting to PDF 1.4.

If it's vector, I suggest Ghostscript converting to PDF 1.4.
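The decision steps above can be written down as a small function, mostly as a memory aid. This is just a sketch of the workflow as described in this post; the name choose_tool and its yes/no arguments are made up for illustration.

```shell
# choose_tool NEEDS_OCR KIND LOSS_OK
#   NEEDS_OCR: yes|no   KIND: raster|vector   LOSS_OK: yes|no
# Prints the tool suggested by the workflow in this post.
choose_tool() {
    needs_ocr=$1
    kind=$2
    loss_ok=$3
    if [ "$needs_ocr" = yes ]; then
        echo "ocrmypdf, then k2pdfopt"
    elif [ "$kind" = vector ]; then
        echo "ghostscript to PDF 1.4"
    elif [ "$loss_ok" = yes ]; then
        echo "k2pdfopt with compression"
    else
        echo "k2pdfopt without compression, or ghostscript to PDF 1.4"
    fi
}
```

For example, choose_tool no raster yes picks k2pdfopt with compression.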

My command-line codes:

For ocring text:

ocrmypdf -l lan --force-ocr input.pdf output.pdf

-l lan takes a 3-letter language code (for example eng or deu) in place of lan. If you skip this, it defaults to English.

--force-ocr overwrites existing text layers. If the file has a Google Books intro but no text layer afterwards, or the file has a bad text layer, this is useful.

input.pdf I tend to drag and drop from the Finder into the terminal window.

output.pdf It appears in the terminal's current folder, which is your user folder in a freshly opened Terminal window.
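Before running a long OCR job it can help to confirm that the language code is actually installed. A small sketch, assuming tesseract --list-langs prints one code per line after a header line; the helper name has_lang is my own.

```shell
# has_lang CODE: read a language list on stdin and succeed if CODE is one of the entries.
has_lang() {
    # -x matches whole lines only, -F treats CODE as a plain string, -q stays silent.
    grep -q -x -F -- "$1"
}

# Usage (not run here; older tesseract versions print the list on stderr, hence 2>&1):
#   tesseract --list-langs 2>&1 | has_lang deu && ocrmypdf -l deu --force-ocr input.pdf output.pdf
```

If the code is missing, the ocrmypdf command simply doesn't start, which beats finding out halfway through.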

For k2pdfopt with compression:

k2pdfopt -mode copy -dev dx
input.pdf

-dev dx sets it to reformat everything for the Kindle DX. There are other codes for some other devices.

I hit enter after the codes here, and then drag and drop the input file into the k2 window.

The customization tools here are handy: http://www.willus.com/k2pdfopt/help/mac.shtml

For k2pdfopt without compression:

k2pdfopt -mode copy

As with the compressed version, I hit enter after the codes and then drag and drop the input file into the k2 window, and the same customization tools apply.

For ghostscript to convert:

gs -sDEVICE=pdfwrite -dCompatibilityLevel=1.4 -dPDFSETTINGS=/screen -dNOPAUSE -dQUIET -dBATCH -sOutputFile=output.pdf input.pdf

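Output appears in the terminal's current folder, which is your user folder in a freshly opened Terminal window. Note that -dPDFSETTINGS=/screen also downsamples images to low resolution (72 dpi by default); /ebook and /printer keep more detail at the cost of larger files.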
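To check that a converted file really declares PDF 1.4, you can read the header at the start of the file. This is a rough sketch (the name pdf_version is my own), and it only looks at the first 8 bytes, so treat it as a quick sanity check rather than a full validation.

```shell
# pdf_version FILE: print the version from the "%PDF-x.y" header at the start of FILE.
pdf_version() {
    # A PDF normally begins with a header like "%PDF-1.4", which is exactly 8 bytes.
    LC_ALL=C head -c 8 "$1" | sed -n 's/^%PDF-\([0-9]\.[0-9]\).*/\1/p'
}
```

For example, pdf_version output.pdf should print 1.4 after the Ghostscript command above.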

Modified from instructions here: http://www.spoonylife.org/level-3/co...to-1-5-1-6-etc

For Automator:

I haven't figured out how to use Automator with the other tools yet, but I use it to simplify that Ghostscript script.

I created an app with a single step: run shell script. "shell" is "/bin/bash" and "pass input" is "as arguments"; the actual code is:

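# Convert each dropped PDF to PDF 1.4; output is written to the current folder.
suffix="-converted.pdf"
for f in "$@"
do
    base=$(basename "$f" .pdf)
    outputfile="$base$suffix"
    # /usr/local/bin/gs is where Homebrew puts gs on Intel Macs; adjust the path if yours differs.
    /usr/local/bin/gs -sDEVICE=pdfwrite -dCompatibilityLevel=1.4 -sstdout=%stderr -dPDFSETTINGS=/screen -dNOPAUSE -dQUIET -dBATCH -sOutputFile="$outputfile" "$f"
done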

I can just drag files onto the app icon and Ghostscript converts them to 1.4, converting any jpeg2000 images to jpeg.

Output should appear in your user folder.
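One possible variation, if you'd rather have the converted file land next to the original instead of in your user folder: build the output path from the input path. This is only a sketch; converted_name is a hypothetical helper, and unlike basename "$f" .pdf it also handles an uppercase .PDF extension.

```shell
# converted_name PATH: print PATH with a trailing .pdf/.PDF replaced by -converted.pdf,
# keeping the original folder.
converted_name() {
    dir=$(dirname "$1")
    base=$(basename "$1")
    case "$base" in
        # Strip the dot and three-letter extension, whatever its capitalization.
        *.[Pp][Dd][Ff]) base=${base%.???} ;;
    esac
    printf '%s/%s-converted.pdf\n' "$dir" "$base"
}
```

In the Automator script this would replace the two lines that set base and outputfile with outputfile=$(converted_name "$f").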

Anyway, I hope this helps.

Last edited by MarjaE; 03-13-2018 at 11:17 AM.
Old 03-14-2018, 01:47 PM   #2
MarjaE
Unfortunately, k2pdfopt sometimes drops and/or reorders OCR'd text. I think ocrmypdf -l lan --output-type pdfa-1 --force-ocr input.pdf output.pdf may be a better option, without running the result through k2pdfopt afterwards.

Tags
kindle 2, mac, pdf