08-08-2019, 03:30 AM   #1
fredthefork
Junior Member
Posts: 1
Karma: 10
Join Date: Aug 2019
Device: Kindle Paperwhite
Cropping PDFs for EPUB conversion using BRISS, Ghostscript and/or Calibre

Hello! I'm new to this so please forgive me if this is basic knowledge.

I have an OCRed PDF file that I would like to convert to EPUB. The main problem is that I'd like to crop the PDF so that I don't get duplicate headers or page numbers in the EPUB. I first tried cropping with OS X's Preview, then with Briss, and then ran the result through calibre's EPUB conversion. That didn't work. I then used Ghostscript to extract the text:
Code:
gs -sDEVICE=txtwrite -o extractedText%d.txt input.pdf
But this doesn't work either: I'm still getting all the headers. Although the PDF is clearly cropped, the cropped-out content doesn't seem to have been deleted permanently.
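
A quick way to confirm that the headers survive, without reading every page file, is to count the extracted text files that still contain the header line. This is only a sketch: the function name count_header_pages and the header string in the usage comment are made up for illustration, and it assumes the header text is spelled identically on every page.

```shell
# Hypothetical helper (not part of Ghostscript or calibre): count how many
# extracted page files still contain a given header string.
count_header_pages() {
    header=$1
    shift
    # grep -l prints the name of each file with at least one match,
    # -F treats the header as a literal string, -- ends option parsing.
    # wc -l counts those file names; tr strips the padding some wc versions add.
    grep -lF -- "$header" "$@" 2>/dev/null | wc -l | tr -d ' '
}

# Usage, after running the txtwrite command above
# ("My Book Title" is a placeholder for the real running header):
#   count_header_pages "My Book Title" extractedText*.txt
```

If the crop had really removed the header text, the count should drop to 0; a count close to the number of pages suggests the text is still in the file and only hidden from view.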

Then I read on here that

If you run the Briss PDF output through Ghostscript to generate a new PDF, I believe it will permanently get rid of the cropped-out material so that it won't come back in calibre.

This user suggested this command:
Code:
gs -sDEVICE=pdfwrite -dCompatibilityLevel=1.4 -dNOPAUSE -dQUIET -dBATCH -sOutputFile=output.pdf input.pdf
Although it does produce a PDF, running that PDF through my first Ghostscript command or through the standard calibre conversion is to no avail: I still get the headers and page numbers. I've also tried different PDFs, just to be sure.

What am I missing here? This can't be that difficult, can it?