03-18-2010, 12:42 AM   #1
bthoven
Evangelist
Posts: 475
Karma: 590
Join Date: Aug 2009
Location: Bangkok, Thailand
Device: Kindle Paperwhite
Auto crop raster image PDF - any software?

I'm not sure whether this question has been discussed before.

I have quite a number of PDF files that are not text; they were created by scanning real paper documents (legal documents). The page size is too big to view comfortably on my Nook.

As the page format is consistent on every page, is it possible to automatically crop each page by specifying the area we want to keep? Which software can do this easily?

I have attached a sample document for you to view.
Attached thumbnail: crop.jpg

Last edited by bthoven; 03-18-2010 at 12:47 AM.
03-18-2010, 10:09 AM   #2
ATimson
Evangelist
Posts: 456
Karma: 1044878
Join Date: Apr 2009
Device: Kindle Paperwhite 4
It's not free software, but IIRC Adobe Acrobat Professional (and possibly Standard - certainly not Adobe Reader, though) can do what you want.
03-18-2010, 04:13 PM   #3
CoolDragon
Addict
Posts: 244
Karma: 124
Join Date: Feb 2010
Device: none
For scanned PDF file cropping:

Non-free software: Acrobat Professional, Foxit PDF Editor, etc.

Free: Try this http://code.activestate.com/recipes/...le-with-pypdf/
03-18-2010, 10:23 PM   #4
bthoven
Quote:
Originally Posted by CoolDragon
Hi,

Thanks for the suggestion.

I've followed the above link, but I'm not sure about the margin parameters. For example:

pdf-crop.py -m "120 50 120 100" -i mypdf.pdf

What is the unit of 120 50 120 100?
03-19-2010, 12:17 AM   #5
CoolDragon
Actually I don't know the unit. But I always start from 10 and adjust afterwards.
03-19-2010, 12:30 AM   #6
frabjous
Wizard
Posts: 1,213
Karma: 12890
Join Date: Feb 2009
Location: Amherst, Massachusetts, USA
Device: Sony PRS-505
I haven't tried pdf-crop.py, but you can do something similar with the pdfmanipulate command-line program that comes with calibre. The command is:

pdfmanipulate crop -o "Myfile-cropped.pdf" -x 72 -y 72 -w 72 -v 72 "Myfile.pdf"

Where the filename following -o is what the file will be saved as, the filename at the end is the input file name, and the -x, -y, -w and -v are the number of pixels you want to crop from the left, bottom, right and top, respectively. (I hope that's right... it might not be, I haven't checked.)

Typically, there are 72 pixels per inch.
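
The figure of 72 matches the default PDF unit, the point (1/72 inch), so a margin measured in inches or millimetres can be converted before it is passed to the crop command. A minimal Python sketch of that arithmetic follows. The helper names `inches_to_units` and `mm_to_units` are made up for illustration and are not part of calibre, and rounding to whole numbers is an assumption, since the thread only shows integer values.

```python
UNITS_PER_INCH = 72  # default PDF user-space unit: 1/72 inch
MM_PER_INCH = 25.4


def inches_to_units(inches):
    """Convert a margin in inches to crop units, rounded to a whole number."""
    return round(inches * UNITS_PER_INCH)


def mm_to_units(mm):
    """Convert a margin in millimetres to crop units, rounded to a whole number."""
    return round(mm / MM_PER_INCH * UNITS_PER_INCH)


if __name__ == "__main__":
    print(inches_to_units(1))  # a one-inch margin
    print(mm_to_units(10))     # a one-centimetre margin
```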

In Windows you may need to give the full path to pdfmanipulate, e.g.:

32-bit Windows:
"C:\Program Files\Calibre2\pdfmanipulate.exe" crop -o "Myfile-cropped.pdf" -x 72 -y 72 -w 72 -v 72 "Myfile.pdf"
64-bit Windows:
"C:\Program Files (x86)\Calibre2\pdfmanipulate.exe" crop -o "Myfile-cropped.pdf" -x 72 -y 72 -w 72 -v 72 "Myfile.pdf"
I posted more detailed instructions in this thread on using it to crop all the PDFs in a folder at once with a batch file, for both Windows and Linux.

I gave instructions both for manually setting the dimensions to crop, and for using Ghostscript to auto-calculate the amount to crop (though that wouldn't work so well for scanned PDFs unless they're exceptionally clean).
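
For the manual case, cropping a whole folder amounts to running the same command once per file. The sketch below is not the batch file from that thread; it is a rough Python stand-in that assumes `pdfmanipulate` is on the PATH and accepts the options shown in the command above. The function names and the `-cropped` output suffix are hypothetical, and the same margin is applied to all four sides.

```python
import subprocess
from pathlib import Path


def crop_command(pdf_path, margin=72, exe="pdfmanipulate"):
    """Build the argument list that crops `margin` units off every side."""
    pdf_path = Path(pdf_path)
    output = pdf_path.with_name(pdf_path.stem + "-cropped.pdf")
    m = str(margin)
    return [exe, "crop", "-o", str(output),
            "-x", m, "-y", m, "-w", m, "-v", m, str(pdf_path)]


def crop_folder(folder, margin=72, exe="pdfmanipulate"):
    """Run the crop command on each PDF in `folder`, skipping earlier output."""
    for pdf_path in sorted(Path(folder).glob("*.pdf")):
        if pdf_path.stem.endswith("-cropped"):
            continue  # do not crop files produced by a previous run
        subprocess.run(crop_command(pdf_path, margin, exe), check=True)


if __name__ == "__main__":
    # Show the command for one file without running anything.
    print(crop_command("Myfile.pdf"))
```

On Windows, `exe` can be set to the full path of pdfmanipulate.exe instead of relying on the PATH.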

You could also try PaperCrop and PDFLRF, which work well more or less automatically with scanned documents. (For the latter, you could use calibre to convert lrf to epub or whatever afterwards.)

Do not use Acrobat for this. Acrobat does not actually crop files. It just pretends to. I.e., it inserts a command to tell its viewer and Adobe Reader to ignore parts of the margins. But these commands are often ignored by reader software, which may well be true of the Nook.
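
One way to check whether a file was cropped only by instruction is to look for a /CropBox entry next to the /MediaBox in its page dictionaries; presumably this is the kind of viewer hint described above. The sketch below is a crude standard-library check, not a PDF parser. It assumes the page dictionaries are stored uncompressed, so it will miss boxes held inside compressed object streams, and the function name is hypothetical.

```python
import re

# Matches uncompressed entries such as "/CropBox [72 72 540 720]".
BOX_PATTERN = re.compile(rb"/(MediaBox|CropBox)\s*\[([^\]]*)\]")


def find_page_boxes(pdf_bytes):
    """Return (box name, [x0, y0, x1, y1]) pairs found in raw PDF bytes."""
    boxes = []
    for match in BOX_PATTERN.finditer(pdf_bytes):
        name = match.group(1).decode("ascii")
        parts = match.group(2).decode("ascii", "replace").split()
        try:
            values = [float(part) for part in parts]
        except ValueError:
            continue  # indirect references or other non-numeric content
        if len(values) == 4:
            boxes.append((name, values))
    return boxes


if __name__ == "__main__":
    sample = b"<< /Type /Page /MediaBox [0 0 612 792] /CropBox [72 72 540 720] >>"
    for name, values in find_page_boxes(sample):
        print(name, values)
```

If a file reports a CropBox smaller than its MediaBox, the full page content is probably still in the file, and a reader that ignores the CropBox would show the uncropped page.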

Last edited by frabjous; 03-19-2010 at 12:36 AM.
03-23-2010, 04:54 AM   #7
eksor
Connoisseur
Posts: 94
Karma: 999884
Join Date: Jun 2009
Device: prs700, i-mate JAMin, smartq v7, GeeksPhone Zero, iPad 3rd Gen
Quote:
Originally Posted by bthoven
I'm not sure whether this question has been discussed before.

I have quite a number of PDF files that are not text; they were created by scanning real paper documents (legal documents). The page size is too big to view comfortably on my Nook.

As the page format is consistent on every page, is it possible to automatically crop each page by specifying the area we want to keep? Which software can do this easily?

I have attached a sample document for you to view.
Hi:

If you are trying to process scanned documents, I think the best way to handle them is to filter them through scan2pdf or scantailor (thanks to frabjous for this). See this thread for a discussion of this topic.

Another poster in this forum (sorry, I can't remember who) suggested converting them to LRF (Sony's proprietary format; I know you have a Nook) and then back to PDF with calibre. This is because of the auto-cropping and splitting feature of pdflrf, which tries to remove the margins and split the pages where there is no text.

Finally, I think that the command-line tool suggested by frabjous is perhaps the most useful way of cropping efficiently.

Regards
03-23-2010, 05:14 AM   #8
bthoven
Thanks a lot frabjous and eksor.

I've tried the pdfmanipulate.exe in Calibre. It works!

The correct margin parameters are:

x = left
v = right
w = top
y = bottom

Regarding the unit, if I want to cut the margin by 1 inch, I have to specify 72.
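
The original question asked about specifying the area to keep, whereas the crop command takes the amount to remove from each side. A rough Python sketch of that conversion follows, using the flag mapping and the 72-per-inch unit reported above. The function name, the bottom-left origin for the keep area, and the Letter-sized page (612 x 792) in the usage line are assumptions for illustration; a scanned page may well have a different size.

```python
def crop_flags_for_keep_area(page_width, page_height, keep_box):
    """Translate an area to keep into pdfmanipulate crop flags.

    keep_box is (x0, y0, x1, y1) in PDF units, origin at the bottom-left.
    """
    x0, y0, x1, y1 = keep_box
    if not (0 <= x0 < x1 <= page_width and 0 <= y0 < y1 <= page_height):
        raise ValueError("keep_box must lie inside the page")
    margins = {
        "-x": x0,                # left
        "-v": page_width - x1,   # right
        "-w": page_height - y1,  # top
        "-y": y0,                # bottom
    }
    flags = []
    for flag, value in margins.items():
        flags += [flag, str(round(value))]
    return flags


if __name__ == "__main__":
    # Keep a region inset 1 inch on the left, right and top, 0.5 inch at the bottom.
    print(" ".join(crop_flags_for_keep_area(612, 792, (72, 36, 540, 720))))
```

The resulting flags can be placed between `crop -o "Myfile-cropped.pdf"` and the input file name in the command shown earlier in the thread.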

Thanks again.

Last edited by bthoven; 03-23-2010 at 05:47 AM. Reason: make correction
03-25-2010, 06:25 AM   #9
slex
Addict
Posts: 294
Karma: 1196776
Join Date: Nov 2008
Location: Bulgaria
Device: Kindle 4 NT, Onyx Boox M92
If you use Linux you can try pdfshuffler. It is a GUI tool and you can see the portion of the page you crop.
03-25-2010, 06:51 AM   #10
bthoven
Quote:
Originally Posted by slex
If you use Linux you can try pdfshuffler. It is a GUI tool and you can see the portion of the page you crop.
Wow! That's really great, WYSIWYG!

Is there a similar tool for Windows?
03-26-2010, 08:49 PM   #11
slex
Quote:
Originally Posted by bthoven
Wow! That's really great, WYSIWYG!

Is there a similar tool for Windows?
http://www.pdfill.com/pdf_tools_free.html

I am not sure that it crops image pdfs, however.

Last edited by slex; 03-26-2010 at 08:51 PM.
03-27-2010, 06:02 AM   #12
bthoven
Hi frabjous
pdfmanipulate crop also only pretends that the document is cropped. When I view the cropped file on my Nook with the Small font setting, it displays the cropped pages; but to my surprise, it displays the full pages when I set the font size to medium or big. I'm fine with this because the cropped file is no bigger than the original file.
03-27-2010, 10:29 AM   #13
frabjous
Quote:
Originally Posted by bthoven
Hi frabjous
pdfmanipulate crop also only pretends that the document is cropped.
I don't know what's going on with your Nook, but this is not true.

Quote:
When I view the cropped file on my Nook with the Small font setting, it displays the cropped pages; but to my surprise, it displays the full pages when I set the font size to medium or big. I'm fine with this because the cropped file is no bigger than the original file.
Please note that this thread is about *rasterized* PDFs, not text-based PDFs. There are no "fonts" involved whose sizes can be changed. I really don't know what the Nook does when you change the font size with raster PDFs, since I don't have a Nook, but it certainly might need to change the size of the bounding box in order to get the proportions right for the Nook.
03-27-2010, 11:07 AM   #14
bthoven
Yes, I'm talking about a rasterized PDF; otherwise, I'd use sopdf instead.

The Nook actually uses the Adobe ADE reader to display PDFs. If the PDF is text, then changing to a bigger font size starts to reflow the text.

I can confirm that my file is a rasterized one, and I was really surprised to see my cropped PDF displayed as full pages, as if it had not been cropped, when I set the Nook font size to smaller, medium, bigger, or biggest.

On the Nook, setting the font size to small displays the PDF in its original form, in other words, the way you intend the PDF to be displayed (in this case, showing only the cropped area). Setting the font size to anything else, i.e. smaller, medium, bigger, or biggest, gives quite unpredictable results (in this case, showing the pages as they were before cropping).

Last edited by bthoven; 03-27-2010 at 11:10 AM.
07-24-2010, 04:07 AM   #15
mh445
Enthusiast
Posts: 46
Karma: 10
Join Date: Mar 2010
Device: none
The thread is older, but for cropping rasterized (scanned) PDFs, there is this nice little piece of software: http://sourceforge.net/projects/briss/