Old 05-20-2011, 06:30 PM   #5
UpSpin
Enthusiast
Posts: 28
Karma: 60000
Join Date: Apr 2011
Device: Sony PRS-650
Optimize the PDF file

4. Optimize the PDF file

To optimize a PDF file you need a PDF editor. I do everything with Adobe Acrobat or Bluebeam PDF Revu. Both are very powerful PDF editors, but sadly both are also quite expensive. Acrobat is the standard! Besides its PDF editing and creation features, it has a very powerful OCR engine and lots of features for print production.
PDF Revu, on the other hand, has the same editing and creation features, but it is mainly meant for working with PDF files on a tablet PC, on a mobile PC or in the field. If you own a tablet PC, take a look at it: it’s the best PDF tool for annotating with pen and ink and for working with PDF documents.
There are also some free tools available for cropping PDF pages, like Briss:
http://sourceforge.net/projects/briss/

General
Some users have written free tools that are meant to improve PDF pages by splitting and cropping them automatically and by changing the font.
Personally, I really don’t like hard-splitting a PDF page into two or three parts. If you do this, you can no longer view the full page on the eReader, and you have to create a second version of the PDF file for the eReader only. So if you also work with the PDF file on your computer, you end up with a small organizational mess.
So my preferred method is to fit the page to the eReader screen as well as possible.

Crop pages and change page size:
That’s the most important and most effective way to optimize the PDF file: remove the white border and, if necessary, change the page size.
The Adobe Reader on the Sony reader has one ideal width-to-height ratio of 0.775. For any other ratio it adds a grey border around the page. So by changing the page size and resizing the content, you can turn the grey border into a white border and thus gain more space for ink annotations.

If you crop a book page, remember that you are reading on an eReader: a header with the chapter title or a footer with the page number isn’t necessary, so remove them too, but only if it helps. If the page content is very narrow, so that the eReader adds a huge grey margin on the left and right, then removing the header and footer makes the page shorter and reduces this margin. If the content is very wide, so that the eReader adds a margin at the top and bottom, then removing them won’t improve anything.
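
The rule above can be written down as a small check. This is only a sketch: the function names and the tolerance value are assumptions made for illustration, and 0.775 is the ratio measured on the Sony reader.

```python
READER_RATIO = 0.775  # width / height that the reader shows without a grey border


def grey_margin_side(width, height, tolerance=0.005):
    """Return where the reader adds a grey margin for a page of this size."""
    ratio = width / height
    if abs(ratio - READER_RATIO) <= tolerance:
        return "none"
    # A page narrower than the ideal ratio leaves unused space left and right,
    # a wider page leaves unused space at the top and bottom.
    return "left/right" if ratio < READER_RATIO else "top/bottom"


def removing_header_helps(width, height):
    """Cutting header and footer shortens the page, which only helps a too-narrow page."""
    return grey_margin_side(width, height) == "left/right"


# Uncropped A4 page, sizes in mm (any unit works, only the ratio matters)
print(grey_margin_side(210, 297))
print(removing_header_helps(210, 297))
```

For a page that is too wide the second function returns False, which matches the advice to leave header and footer alone in that case.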



In the first two images I’ve opened a PDF file with its default dimensions (A4). There is both a huge white and a huge grey margin, in portrait as well as in landscape mode (wrong aspect ratio).



In the second two images I’ve cropped it so that every white margin is gone. Because the page still has the wrong aspect ratio (too narrow), the reader still adds a grey margin on the left and right.



And finally I’ve cropped the page to the correct aspect ratio. Compared to the second picture the content doesn’t get any larger, but the grey margin is gone. I could also have left a large margin on one side only and none on the other, which gives me some space for annotations without sacrificing anything.

So if you crop the pages, try to achieve the correct aspect ratio (and if you want to annotate, try to make the margin large on one side and small on the other).

If your page is very wide, you'll get the following result. It is displayed properly in landscape mode, but you still have to scroll, so why not extend the page at the bottom to get additional space for notes, again without sacrificing anything?



The correct aspect ratio is Width/Height = 0.775.
If the cropped page is 24 cm tall, you should try to set a width of 0.775 * 24 cm = 18.6 cm.
If the cropped page is 20 cm wide, you should try to set a height of 20 cm / 0.775 = 25.81 cm.
(The same works in inches or any other unit.)
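
If you have many different page sizes, the two calculations above can be combined into one small helper. This is just a sketch; the function name is made up for illustration, and it only ever enlarges the page (adding white margin), it never shrinks it.

```python
READER_RATIO = 0.775  # width / height, measured on the Sony reader


def fit_to_reader(width, height, ratio=READER_RATIO):
    """Return (width, height) enlarged so that width / height matches the ratio."""
    if width / height < ratio:
        # Page is too narrow: keep the height and widen the page.
        return (round(height * ratio, 2), height)
    # Page is too wide (or already fits): keep the width and lengthen the page.
    return (width, round(width / ratio, 2))


print(fit_to_reader(15, 24))  # narrow page, 24 cm tall
print(fit_to_reader(20, 22))  # wide page, 20 cm wide
```

The extra width or height is white space, so you can put it on whichever side you want to use for annotations.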

Sometimes the PDF file has the wrong aspect ratio, but there is no white margin left that could take the place of the grey one and give you room to write. Then you have to change the page size or the size of the content on the page, which is a rather difficult task in Acrobat: you have to print the PDF file to a new page size. So print it on a larger page, then crop this page afterwards.

I use Bluebeam PDF Revu for this, which supports resizing the content on the page directly (I do most of my PDF work with it, but it’s a rather exotic piece of software, so I won’t go into detail here).

1:1 copy
I haven’t used this yet, but maybe others use their reader for more than just reading and want to view the objects in a PDF file at their true size, so that a house drawn 10 cm large in the PDF file is 10 cm large on the eReader display, too.

Then you have to select the following page size:

If you want a 1:1 copy in portrait mode, set:
Width: 89.4 mm
Height: 115.4 mm

If you want a 1:1 copy in landscape mode, set:
Width: 120.4 mm
Height: 155.4 mm
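
Some PDF tools expect the page size in points instead of millimetres (1 point = 1/72 inch, 1 inch = 25.4 mm). A small sketch for converting the values above; the names are chosen for illustration only, and the millimetre values are the measured ones from this post.

```python
MM_PER_INCH = 25.4
POINTS_PER_INCH = 72  # default PDF unit: 1 point = 1/72 inch


def mm_to_points(mm):
    """Convert millimetres to PDF points, rounded to one decimal place."""
    return round(mm / MM_PER_INCH * POINTS_PER_INCH, 1)


PORTRAIT_MM = (89.4, 115.4)    # measured 1:1 page size, portrait mode
LANDSCAPE_MM = (120.4, 155.4)  # measured 1:1 page size, landscape mode

for name, (w, h) in (("portrait", PORTRAIT_MM), ("landscape", LANDSCAPE_MM)):
    print(name, mm_to_points(w), "x", mm_to_points(h), "pt, ratio", round(w / h, 3))
```

Both page sizes come out at a ratio of about 0.775, the same aspect ratio as used for cropping.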

(I got these values, and also the aspect ratio, by measuring. You can’t just take the display dimensions, because the eReader adds a little margin around every PDF file. Maybe there’s a way to read the values stored internally, but I don’t know of one.)

OCR
If you digitize a book, don’t leave the pages as images. That not only wastes space and increases loading time, it also looks fuzzy and strange on the eReader. Convert the pages to text with an OCR tool.
Either do it with specialized OCR software and convert the result to ePub, or, if the book isn’t text only and you therefore need the PDF format, use the so-called ClearScan method in Acrobat.



ClearScan doesn’t convert the text to a given font, but creates a custom font based on the scanned image. This allows Acrobat to replace the pixel-based image with a vector-based, freely scalable font, so the converted page looks like the scan but scales cleanly.



On the left you see the PDF file as an image; on the right it has been converted to a vector font with ClearScan.

In detail the font looks different from the usual fonts you know, though it’s the best method to view a scanned book on the eReader, especially if the pages contain more than just running text.



The upper two images show the PDF file on the eReader when the scanned page was converted to PDF without OCR, so it's still a bitmap.
The lower two images show the same PDF file converted with ClearScan.

Last edited by UpSpin; 05-20-2011 at 08:10 PM.