#1 |
Junior Member
Posts: 2
Karma: 10
Join Date: Aug 2011
Device: Kindle
|
Hi guys,
This is my second post, and I wanted to share something I found recently. I wanted to remove the headers and footers from a PDF but couldn't manage it with Calibre, since the header was intricate and varied a lot from page to page. I then tried cropping the pages with Acrobat, but it seemed that cropping only hid the content rather than actually removing it. I have since found that the hidden content can be removed like this: "Document -> examine document There you can delete the hidden information." Source: http://superuser.com/questions/12756...tly-in-acrobat It was helpful to me, and I'm not going to throw away the PDF files that I couldn't read on the Kindle anymore. Good luck with that! (: |
#2 |
Enthusiast
Posts: 23
Karma: 66956
Join Date: Feb 2010
Location: Conn. USA
Device: Kindle 3, Kindle PW
|
1. Open your PDF with Adobe Acrobat Pro.
2. Click Tools >> Pages >> Crop. Set the margins and crop the document. You can use different page ranges and odd/even page settings.
3. Once you have cropped your file, click Tools >> Protection >> Remove hidden information.
4. You will see "Status: Finding hidden information", then "Results". Once all the hidden information has been found, you can check or uncheck each group of information.
5. Click Remove.
6. Save your document. You have now permanently removed the page numbers, headers, and other information that you cropped out.

Note: Dropping the cropped PDF directly into Calibre may not yield good results, especially if your PDF contains Unicode characters. A better option is to convert it to HTML first.

7. Now let's save the PDF as HTML before converting it into a MOBI or EPUB document. Click File >> Save As >> More Options >> HTML Web Page. If you can't save your file as HTML, make sure "Run OCR if needed" is unchecked; to find it, click "Settings" on the "Save As" screen. You can make some manual fixes before conversion if you like.
8. Drag and drop the HTML file into Calibre, set the TOC and other options, and convert.

https://www.mobileread.com/forums/sho...d.php?t=160755 |
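For anyone who prefers the command line for step 8, here is a minimal sketch. It assumes Calibre's command-line tool `ebook-convert` is installed and on your PATH; the file name `My cropped PDF.html` and the helper `out_name` are made up for illustration. The drag-and-drop route above works just as well.

```shell
# out_name FILE FORMAT: derive the output name, e.g. "book.html" -> "book.mobi".
# (Hypothetical helper, not part of Calibre.)
out_name() {
    printf '%s.%s\n' "${1%.*}" "$2"
}

html="My cropped PDF.html"   # assumed name of the HTML file saved from Acrobat

for fmt in epub mobi; do
    target=$(out_name "$html" "$fmt")
    if command -v ebook-convert >/dev/null 2>&1 && [ -f "$html" ]; then
        # ebook-convert takes the input file first, then the output file.
        ebook-convert "$html" "$target"
    else
        # Tool or input missing: just show what would have been run.
        echo "skipping: would run ebook-convert \"$html\" \"$target\""
    fi
done
```

Setting the TOC and other options is still easier in the Calibre GUI, so treat this as a quick first pass.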
#3 |
Member
Posts: 21
Karma: 15000
Join Date: Feb 2014
Device: iPhone, iPad, Macbook Pro, Mac Pro
|
@sinan:
Brilliant solution. I have been searching and trying many different CLI tools but this worked where others utterly failed. Thanks! |
#4 |
Junior Member
Posts: 3
Karma: 14228
Join Date: Oct 2015
Device: none
|
I know it's an old thread, but I have an issue with large documents that have a lot of cropped hidden information (where I'm extra interested in getting rid of it, so I'd reduce the size); namely, my Acrobat bugs up halfway through and can't complete the process, claiming that the file size is too big. (I don't remember the exact wording now, but I can check it again.) I'm talking about files 100-200 MB in size, so not gigantic, but pretty big. Any idea what I can do there?
|
#5 | |
Fuzzball, the purple cat
Posts: 1,296
Karma: 11087488
Join Date: Jun 2011
Location: California
Device: iPad
|
Quote:
|
|
#6 | |
Member
Posts: 21
Karma: 15000
Join Date: Feb 2014
Device: iPhone, iPad, Macbook Pro, Mac Pro
|
Quote:
A far simpler procedure, one that will result in lossless quality, is to split the document into pieces small enough for Acrobat to handle without hanging. I have had it hang when OCRing large documents. This despite the fact that I use an 8-core Xeon Mac Pro with 16GB of memory. Adobe really doesn't seem to understand resource management. So what I do is simply OCR 100 pages at a time and save incrementally. This works fine for OCR.

Unfortunately, there is no way to choose page ranges when removing hidden information. It's all or nothing. So you have to split the PDF into manageable-size documents. The best tool I have found for this is cpdf. It is free, multiplatform, and works very well. Assuming you have successfully cropped the entire document and saved it, to split it into, say, 100-page documents:

Code:
cpdf -split "My cropped PDF.pdf" 1 -chunk 100 -o "@F %%%.pdf"

Now you can open each of them in Acrobat and remove the hidden information. After saving all the files, you can merge them:

Code:
cpdf -merge "My cropped PDF 001.pdf" "My cropped PDF 002.pdf" "My cropped PDF 003.pdf" -o "My cropped PDF clean.pdf"

You can change the argument to -chunk to whatever number of pages is necessary to allow Acrobat to successfully remove the data and save it. You will have to experiment. |
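With many chunks, typing every file name into the merge command gets tedious. Here is a rough sketch of a helper that picks up the numbered chunks with a glob instead. It assumes the chunk names produced by the split command above ("My cropped PDF 001.pdf" and so on) and that cpdf is on your PATH; the function name `merge_chunks` and the `DRY_RUN` switch are made up for illustration.

```shell
# merge_chunks BASE: merge "BASE 001.pdf", "BASE 002.pdf", ... into "BASE clean.pdf".
# Set DRY_RUN=1 to print the cpdf command instead of running it.
merge_chunks() {
    base=$1
    # Load the chunk names into the positional parameters.
    # The zero-padded numbers make the glob expand in page order.
    set -- "$base "[0-9][0-9][0-9].pdf
    # If nothing matched, the pattern stays unexpanded and is not a real file.
    if [ ! -e "$1" ]; then
        echo "no chunks found for '$base'" >&2
        return 1
    fi
    if [ "${DRY_RUN:-0}" = 1 ]; then
        # Display only: the names are not re-quoted here.
        echo "cpdf -merge $* -o $base clean.pdf"
    else
        cpdf -merge "$@" -o "$base clean.pdf"
    fi
}
```

Run it from the folder that holds the cleaned chunks, for example `DRY_RUN=1 merge_chunks "My cropped PDF"` to check the file order first, then again without `DRY_RUN` to do the actual merge.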
|
#7 |
Junior Member
Posts: 3
Karma: 14228
Join Date: Oct 2015
Device: none
|
Yes, that does sound better, since I have never used GS. However, I have a problem with a small document now. It was 13.5 MB, but there were some cropped pages (it's only 14 pages long, but it's a colour scan), so I thought it could be even smaller, cropped the pages a bit more and told my dear Acrobat (X Pro, if it's relevant) to remove hidden information. Just the cropped stuff, without touching the metadata and links.
When I saved the file... it was... 163 MB. What happened to increase its size more than 10 times?! Do you have any idea what I could do? I still have the original 13.5 MB file, so we can experiment. I've tried removing hidden information again (without the extra cropping, which was just the edges of pages and so on), and it got larger again: only 26.2 MB this time, but the step that should have reduced its size still doubled it. And the strangest thing is that the same Acrobat did manage to reduce the size of some other files by removing hidden info, so I'm completely at a loss here. |
#8 | ||
Fuzzball, the purple cat
Posts: 1,296
Karma: 11087488
Join Date: Jun 2011
Location: California
Device: iPad
|
Quote:
[Edit, 9 Nov 2015: My statement above is only true if the source PDF uses lossless encodings for the internal bitmaps. See rest of thread.] Quote:
Last edited by willus; 11-09-2015 at 08:10 AM. |
||
#9 | ||
Member
Posts: 21
Karma: 15000
Join Date: Feb 2014
Device: iPhone, iPad, Macbook Pro, Mac Pro
|
Quote:
Quote:
|
||
#10 | |
Member
Posts: 21
Karma: 15000
Join Date: Feb 2014
Device: iPhone, iPad, Macbook Pro, Mac Pro
|
Quote:
|
|
#11 | |
Fuzzball, the purple cat
Posts: 1,296
Karma: 11087488
Join Date: Jun 2011
Location: California
Device: iPad
|
Quote:
This program (cpdf) is awesome. It blows away the Java-based equivalents on my Windows PC: 20x faster than pdfsam and over 100x faster than jpdftweak. And it easily handles PDFs with thousands of pages. No more Java-based PDF tools for me. I hope to post a more complete benchmark at some point. |
|
#12 | |
Member
Posts: 21
Karma: 15000
Join Date: Feb 2014
Device: iPhone, iPad, Macbook Pro, Mac Pro
|
Quote:
Last edited by PHC; 11-07-2015 at 09:12 PM. |
|
#13 |
Member
Posts: 21
Karma: 15000
Join Date: Feb 2014
Device: iPhone, iPad, Macbook Pro, Mac Pro
|
It really is. It can do a lot and gives lossless results, preserving the TOC and highlighting. Another very useful free tool is PDFtk - The PDF Toolkit. It excels at extracting pages from a PDF, especially if you want pages that are not necessarily contiguous. Acrobat really sucks at that. You can make a list of the pages and page ranges you want, in any order, and extract them to a new PDF in just one line of code. Of course, cpdf can do that too. Another thing both tools can do is add a TOC (bookmarks outline) to a document. The syntax for cpdf is much simpler so that is the best tool for the job. You can completely index a document by simply navigating to a page, copying the text you want as the bookmark title, typing the indentation level, title, and page of the bookmarks in a text file and adding them to the PDF. In Acrobat you have to go to the page, highlight the text, add the bookmark, and adjust the level using the mouse. It takes much longer.
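To make the bookmark and page-extraction ideas above concrete, here is a small sketch. The file names and bookmark titles are invented for illustration, and it assumes cpdf and pdftk are installed; the tools are only called if they are found and the input file exists.

```shell
# Hypothetical names, for illustration only.
in_pdf="book.pdf"
bookmarks="bookmarks.txt"

# One bookmark per line: indentation level, quoted title, target page.
cat > "$bookmarks" <<'EOF'
0 "Chapter 1" 1
1 "Section 1.1" 3
0 "Chapter 2" 15
EOF

# Add the bookmarks (TOC) to the PDF with cpdf.
if command -v cpdf >/dev/null 2>&1 && [ -f "$in_pdf" ]; then
    cpdf -add-bookmarks "$bookmarks" "$in_pdf" -o "book with TOC.pdf"
fi

# Extract pages and page ranges, in any order, with pdftk.
if command -v pdftk >/dev/null 2>&1 && [ -f "$in_pdf" ]; then
    pdftk "$in_pdf" cat 12-14 3 7-9 output "extract.pdf"
fi
```

The bookmarks file is plain text, so you can build it up in any editor while paging through the document and add all the entries in one go.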
|
#14 | |
Fuzzball, the purple cat
Posts: 1,296
Karma: 11087488
Join Date: Jun 2011
Location: California
Device: iPad
|
Quote:
|
|
#15 | |
Member
Posts: 21
Karma: 15000
Join Date: Feb 2014
Device: iPhone, iPad, Macbook Pro, Mac Pro
|
Quote:
|
|