03-18-2010, 12:42 AM | #1 |
Evangelist
Posts: 475
Karma: 590
Join Date: Aug 2009
Location: Bangkok, Thailand
Device: Kindle Paperwhite
|
Auto crop raster image PDF - any software?
I'm not sure whether this question has been discussed before.
I have quite a number of PDF files that are not text; they were created by scanning paper documents (legal documents). The page size is too big to view comfortably on my Nook. As the page layout is consistent on every page, is it possible to crop each page automatically by specifying the area to keep? Which software can do this easily? I attach a sample document for your view. Last edited by bthoven; 03-18-2010 at 12:47 AM. |
03-18-2010, 10:09 AM | #2 |
Evangelist
Posts: 456
Karma: 1044878
Join Date: Apr 2009
Device: Kindle Paperwhite 4
|
It's not free software, but IIRC Adobe Acrobat Professional (and possibly Standard - certainly not Adobe Reader, though) can do what you want.
|
03-18-2010, 04:13 PM | #3 |
Addict
Posts: 244
Karma: 124
Join Date: Feb 2010
Device: none
|
For scanned PDF file cropping:
Non-free software: Acrobat Professional, Foxit PDF Editor, etc.
Free: try this: http://code.activestate.com/recipes/...le-with-pypdf/ |
03-18-2010, 10:23 PM | #4 | |
Evangelist
Posts: 475
Karma: 590
Join Date: Aug 2009
Location: Bangkok, Thailand
Device: Kindle Paperwhite
|
Quote:
Thanks for the suggestion. I've followed the above link, but I'm not sure about the margin parameters. What is the unit of the margin parameter? For example:
pdf-crop.py -m "120 50 120 100" -i mypdf.pdf
What is the unit of 120 50 120 100? |
|
03-19-2010, 12:17 AM | #5 |
Addict
Posts: 244
Karma: 124
Join Date: Feb 2010
Device: none
|
Actually I don't know the unit. But I always start from 10 and adjust afterwards.
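A likely answer, offered as an assumption rather than something checked against the script: pyPdf works in PDF user-space units, which default to points (1/72 inch), so the margins are probably points. A small sketch for converting inches or millimetres to that unit follows; the helper names are made up for illustration.

```python
# Assumption: the margins are PDF points, 72 per inch (not verified against pdf-crop.py).
POINTS_PER_INCH = 72.0
MM_PER_INCH = 25.4


def inches_to_points(inches):
    """Convert a length in inches to PDF points."""
    return inches * POINTS_PER_INCH


def mm_to_points(mm):
    """Convert a length in millimetres to PDF points."""
    return mm / MM_PER_INCH * POINTS_PER_INCH


if __name__ == "__main__":
    # A half-inch margin and a 10 mm margin, expressed in points.
    print(inches_to_points(0.5))
    print(round(mm_to_points(10), 1))
```

If that assumption holds, the 120 in the example above would be about 1.67 inches.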
|
03-19-2010, 12:30 AM | #6 | ||
Wizard
Posts: 1,213
Karma: 12890
Join Date: Feb 2009
Location: Amherst, Massachusetts, USA
Device: Sony PRS-505
|
I haven't tried pdf-crop.py, but you can do something similar to pdf-crop.py using the pdfmanipulate command line program that comes with calibre. The command is this:
pdfmanipulate crop -o "Myfile-cropped.pdf" -x 72 -y 72 -w 72 -v 72 "Myfile.pdf"
Where the filename following -o is what the file will be saved as, the filename at the end is the input file name, and the -x, -y, -w and -v are the number of pixels you want to crop from the left, bottom, right and top, respectively. (I hope that's right... it might not be, I haven't checked.) Typically, there are 72 pixels per inch.
In Windows you may need to put in the full path to pdfmanipulate, i.e.: 32 bit Windows: Quote:
Quote:
I gave instructions both for manually setting the dimensions to crop, and for using Ghostscript to auto-calculate the amount to crop (though that wouldn't work so well for scanned PDFs unless they're exceptionally clean).
You could also try PaperCrop and PDFLRF, which work well more or less automatically with scanned documents. (For the latter, you could use calibre to convert lrf to epub or whatever afterwards.)
Do not use Acrobat for this. Acrobat does not actually crop files. It just pretends to. I.e., it inserts a command to tell its viewer and Adobe Reader to ignore parts of the margins. But these commands are often ignored by reader software, which may well be true of the Nook. Last edited by frabjous; 03-19-2010 at 12:36 AM. |
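To make the "pretends to crop" point concrete: in PDF terms a page has a MediaBox (the full page) and may also carry a CropBox (the region a viewer is asked to show). Presumably the command Acrobat inserts is a CropBox; setting it leaves the scanned image untouched, so a reader that ignores it shows the whole page. A minimal sketch of the arithmetic in plain Python, with no PDF library involved; the function name and the sample page size are made up for illustration.

```python
def crop_box(media_box, left, bottom, right, top):
    """Shrink a (llx, lly, urx, ury) box by margins given in points."""
    llx, lly, urx, ury = media_box
    new_box = (llx + left, lly + bottom, urx - right, ury - top)
    # Refuse margins that would leave nothing of the page.
    if new_box[0] >= new_box[2] or new_box[1] >= new_box[3]:
        raise ValueError("margins remove the whole page")
    return new_box


# Hypothetical page: US Letter, 612 x 792 points, cropped by one inch per side.
letter = (0, 0, 612, 792)
print(crop_box(letter, 72, 72, 72, 72))  # (72, 72, 540, 720)
```

Only the box changes here; the page content stays the same size, which is why the file does not get smaller and why some readers still show the margins.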
||
03-23-2010, 04:54 AM | #7 | |
Connoisseur
Posts: 94
Karma: 999884
Join Date: Jun 2009
Device: prs700, i-mate JAMin, smartq v7, GeeksPhone Zero, iPad 3rd Gen
|
Quote:
If you are trying to process scanned documents, I think the best way to handle them is to filter them through scan2pdf or scantailor (thanks frabjous for this). See this thread for a discussion on this topic. Another poster in this forum (sorry, I can't remember who) suggested converting them to lrf (Sony's proprietary format; I know you have a Nook) and then back to pdf with calibre. This is because of the auto-cropping and splitting feature of pdflrf: it tries to remove the margins and split the pages where there is no text. Finally, I think that the command line tool suggested by frabjous is perhaps the most useful way of cropping efficiently. Regards |
|
03-23-2010, 05:14 AM | #8 |
Evangelist
Posts: 475
Karma: 590
Join Date: Aug 2009
Location: Bangkok, Thailand
Device: Kindle Paperwhite
|
Thanks a lot frabjous and eksor.
I've tried the pdfmanipulate.exe in Calibre. It works! The correct margin parameters are:
x = left
v = right
w = top
y = bottom
Regarding the unit, if I want to cut the margin by 1 inch, I have to specify 72. Thanks again. Last edited by bthoven; 03-23-2010 at 05:47 AM. Reason: make correction |
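The corrected mapping above can be wrapped in a small helper that takes margins in inches and builds the command line. This is only a sketch: it assumes your calibre install includes pdfmanipulate and that the flags behave as reported in this thread (-x left, -v right, -w top, -y bottom, 72 units per inch). The function name crop_command is made up for illustration.

```python
POINTS_PER_INCH = 72  # per the thread: 1 inch of margin = 72


def crop_command(infile, outfile, left=0.0, right=0.0, top=0.0, bottom=0.0):
    """Build a pdfmanipulate crop argument list; margins are in inches."""
    def pts(inches):
        # Round to whole units, matching the integer values used in the thread.
        return str(int(round(inches * POINTS_PER_INCH)))

    return [
        "pdfmanipulate", "crop",
        "-o", outfile,
        "-x", pts(left),    # left margin
        "-v", pts(right),   # right margin
        "-w", pts(top),     # top margin
        "-y", pts(bottom),  # bottom margin
        infile,
    ]


if __name__ == "__main__":
    cmd = crop_command("Myfile.pdf", "Myfile-cropped.pdf",
                       left=1, right=1, top=0.5, bottom=0.5)
    print(" ".join(cmd))
```

The list can be passed to subprocess.run to run it on each file in a folder; on Windows the first element may need to be the full path to pdfmanipulate.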
03-25-2010, 06:25 AM | #9 |
Addict
Posts: 294
Karma: 1196776
Join Date: Nov 2008
Location: Bulgaria
Device: Kindle 4 NT, Onyx Boox M92
|
If you use Linux you can try pdfshuffler. It is a GUI tool and you can see the portion of the page you crop.
|
03-25-2010, 06:51 AM | #10 |
Evangelist
Posts: 475
Karma: 590
Join Date: Aug 2009
Location: Bangkok, Thailand
Device: Kindle Paperwhite
|
|
03-26-2010, 08:49 PM | #11 | |
Addict
Posts: 294
Karma: 1196776
Join Date: Nov 2008
Location: Bulgaria
Device: Kindle 4 NT, Onyx Boox M92
|
Quote:
I am not sure that it crops image pdfs, however. Last edited by slex; 03-26-2010 at 08:51 PM. |
|
03-27-2010, 06:02 AM | #12 |
Evangelist
Posts: 475
Karma: 590
Join Date: Aug 2009
Location: Bangkok, Thailand
Device: Kindle Paperwhite
|
Hi frabjous
pdfmanipulate crop also only pretends that the document is cropped. When I view the cropped file on my Nook with the Small font setting, it displays the cropped pages; but to my surprise, the Nook displays full pages when the font size is set to medium or big. I'm fine with this because the cropped file is no bigger than the original file. |
03-27-2010, 10:29 AM | #13 | ||
Wizard
Posts: 1,213
Karma: 12890
Join Date: Feb 2009
Location: Amherst, Massachusetts, USA
Device: Sony PRS-505
|
Quote:
Quote:
|
||
03-27-2010, 11:07 AM | #14 |
Evangelist
Posts: 475
Karma: 590
Join Date: Aug 2009
Location: Bangkok, Thailand
Device: Kindle Paperwhite
|
Yes, I'm talking about rasterized PDFs; otherwise, I'd use sopdf instead.
The Nook actually uses Adobe's ADE reader to display PDFs. If the PDF is text, changing the font size to a bigger one starts to reflow the text. I can confirm that my file is a rasterized one, and I was really surprised to see my cropped PDF displayed full page, as if it were not cropped, when I set the Nook font size to smaller, medium, bigger, or biggest. On the Nook, setting the font size to small displays the PDF in its original form, in other words as you intend it to be displayed (in this case, only the cropped area). Setting the font size to anything else, i.e. smaller, medium, bigger, or biggest, gives quite unpredictable results (in this case, the pages as they were before cropping). Last edited by bthoven; 03-27-2010 at 11:10 AM. |
07-24-2010, 04:07 AM | #15 |
Enthusiast
Posts: 46
Karma: 10
Join Date: Mar 2010
Device: none
|
The thread is older, but for cropping rasterized (scanned) PDFs, there is this nice little piece of software: http://sourceforge.net/projects/briss/
|
|