10-17-2010, 05:27 PM   #1
kidblue
Connoisseur

Posts: 79
Karma: 10
Join Date: Oct 2010
Device: Kindle 3
PDF dimension changing - Squeeze 'em!

I'm looking for an easy way to "squeeze" a text-based PDF to a specific width and height without changing its formatting. This is for screenplays, to fit a Kindle-sized screen. Most screenplays have much of their text center-justified (the dialogue), which would not end up "squeezed", but the descriptive text and scene headings would. That's the goal.

Any ideas for an easy, GUI-based walkthrough?

As per frabjous:

Quote:
 Originally Posted by frabjous
I know how to do that with the pdfpages package for pdflatex; some samples are attached. This might be kind of involved for someone who doesn't already know LaTeX, and quite a lot of software to install for so simple a task if this is all you were using it for. If you know what you're doing, it could be scripted, though.

So, see the samples. Basically, I created stretch.pdf by creating a TeX document with the following code:
Code:
\documentclass{article}
\usepackage{pdfpages}
\begin{document}
\includepdf[pages=-,width=3in,height=6in,fitpaper]{normal.pdf}
\end{document}
Put it in the same folder as normal.pdf and ran pdflatex on the code. Obviously you would need to change the width and height to the dimensions of the Kindle screen, which I don't know offhand. A lot of tweaking would have to be done to ensure that things like metadata were preserved.

It might also be possible to do something like this with Inkscape or Ghostscript, but again, probably not in a user-friendly, straightforward way. If you really wanted to discuss it further, however, let's create a new thread on the idea, and not hijack the BRISS thread. (I wouldn't recommend adding this feature to BRISS...)

 10-17-2010, 08:23 PM #2
Nexutix

The stretched document isn't pleasant to read. Reflow is a good option, but alas, it doesn't retain formatting.
 10-17-2010, 11:50 PM #3
kidblue

I'd like to see the stretched concept while retaining the formatting.
10-18-2010, 08:25 AM   #4
frabjous
Wizard

Posts: 1,213
Karma: 12890
Join Date: Feb 2009
Location: Amherst, Massachusetts, USA
Device: Sony PRS-505
If you're willing to install what you need of a LaTeX distribution, I'll give you instructions as detailed as I can for using it, at least for using it to do this. I could probably even help in writing a script or batch file that would make it relatively easy -- though since I don't use Windows anymore (--yuck!--) I might need some help in testing it. (A script that would work on Linux or Mac would be super-easy.)
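
A minimal sketch of what such a script could look like on Linux or Mac, assuming pdflatex and the pdfpages package are installed. The function name write_stretch_tex and the file name stretch.tex are hypothetical; the \includepdf options are the ones from the sample quoted in the first post, with the width and height turned into parameters.

```shell
# Hypothetical helper: print a pdfpages wrapper document for one input PDF.
# $1 = input PDF, $2 = target width, $3 = target height (TeX lengths such as 3in)
write_stretch_tex() {
    local input="$1" width="$2" height="$3"
    # printf '%s\n' prints each argument literally, so the backslashes survive.
    printf '%s\n' \
        '\documentclass{article}' \
        '\usepackage{pdfpages}' \
        '\begin{document}' \
        "\\includepdf[pages=-,width=${width},height=${height},fitpaper]{${input}}" \
        '\end{document}'
}

# Usage (left commented out because it needs pdflatex and a real input file):
#   write_stretch_tex normal.pdf 3in 6in > stretch.tex && pdflatex stretch.tex
```

For a Kindle, the two lengths would be replaced by the screen dimensions, for example the 9cm and 12cm reported later in this thread. File names containing spaces may need extra care on the LaTeX side.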

Another option might be one of the online LaTeX compilers: then you wouldn't need to install anything (though you wouldn't be able to script either). When I get a chance (--a little busy right now--) I'll look at the available ones.

It is also possible to do this just using Ghostscript, but at least the way I know how to do, it would be trickier, since it would require hand-calculating the adjustment amount. (Well, that could probably be programmed as well, but I don't know enough about Windows batch files myself.)
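
The hand calculation mentioned here comes down to two divisions, one per axis. The sketch below only does that arithmetic and does not call Ghostscript; it assumes page sizes are given in PostScript points (72 per inch), and the function name stretch_factors is hypothetical.

```shell
# Hypothetical helper: print the horizontal and vertical scale factors needed
# to deform a page of size src_w x src_h into tgt_w x tgt_h (all in points).
stretch_factors() {
    awk -v sw="$1" -v sh="$2" -v tw="$3" -v th="$4" \
        'BEGIN { printf "%.4f %.4f\n", tw / sw, th / sh }'
}

# Assumed example: a US Letter page (612 x 792 pt) squeezed into the
# 3in x 6in (216 x 432 pt) size used for the sample.
stretch_factors 612 792 216 432    # prints: 0.3529 0.5455
```

The two factors differ whenever the source and target aspect ratios differ, and that difference is what produces the squeezed look.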

Anyway, here are the attachments from the previous message:
Attached Files
 stretch.pdf (40.6 KB, 189 views) normal.pdf (39.9 KB, 171 views)

 10-18-2010, 01:11 PM #5
kidblue

Sounds like it's good news that I'm a proud Mac user, running OS X (10.6.4, latest Snow Leopard). I'd prefer to avoid installing too much "extra stuff", but I'm open-minded. I'm almost surprised there isn't an easy way to do this in Photoshop, but I guess it's difficult to manipulate the whole "file" when a PDF is made up of "pages". The goal would be any kind of turn-key solution that takes a multi-page, text-based PDF and squeezes it into a set dimension while retaining the formatting of the text. The Kindle screen is 12 cm high by 9 cm wide (4.72" × 3.54"). Thanks in advance!
 10-18-2010, 02:15 PM #6
frabjous

Excellent. Well, not really, since Apple is worse than Microsoft in a lot of ways..., but it makes writing a script very easy, since it can be written in bash. But it would help to know what you already have installed. Do you have ghostscript installed? Or calibre? Calibre's command line tools could help with preserving metadata. (If you don't know whether or not you have ghostscript installed, try opening a terminal and typing in gs -v.) I could either write a script that both auto-cropped the PDF, and then stretched it, or one that just stretched it, meaning you would have to use BRISS (or whatever you preferred) beforehand (but you'd also have more flexibility in, e.g., removing headers). Do you have a preference? Anyway, I'll try to work on one when I get a chance.
 10-18-2010, 02:21 PM #7
kidblue

Thanks a bunch. I hope it's something that more than just the two of us will benefit from. That's a no on Ghostscript but a yes on calibre, which I use every day. Auto-cropping is hard, because margins can change, although they are constant on most screenplays, which is what this whole project is geared towards. Let's go for a simple stretch and see how we feel. Thanks again, it really is awesome to not only get some help, but learn something in the process.
 10-18-2010, 02:27 PM #8
frabjous

I'll try to work on this tonight. Don't have time right now, but one quick thing about the autocropping. The script would auto-detect the margins and remove them, so it doesn't matter whether they change. (In fact, I've already written a script to do that--see here, post #9; the mac instructions would be the same as the linux ones.) However, it would crop each page down to the smallest possible rectangle without losing anything on the page, meaning that it wouldn't cut out headers and footers, even if you wanted them removed.
 10-18-2010, 02:32 PM #9
kidblue

In that case, if an auto-cropping script would just remove white space, then that would be awesome. As long as it doesn't make judgement calls and remove text, I don't see anything wrong with it, right?
10-19-2010, 01:43 AM   #10
frabjous

All right, here's my first attempt. Since I don't have access to a Mac, I'll need you to help me test it and tweak it. I should also mention that I'm not a programmer, just a power-user with delusions of minor scripting skills. So, fingers crossed.

I tried to make it work just using Ghostscript without pdfLaTeX, since Ghostscript is about 1/4th the size of a minimum LaTeX installation (and less than 1/50th the size of a full LaTeX installation). I'd be much more confident of the script if it were written in LaTeX, but let's see if this works first.

Requirements:
• BASH interpreter: this is the default shell language for mac and most linux distros, so if you're using one of those, you should be all set. (I have no clue whether or not this works with Windows w/ Cygwin, etc.)
• calibre -- for its CLI tools
• Ghostscript (for mac installation, see here)
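
A quick way to check these requirements from a terminal is sketched below. It assumes the tools are reachable under the command names used elsewhere in this thread (bash, gs, and calibre's ebook-meta); the function name missing_tools is hypothetical.

```shell
# Hypothetical helper: print the name of every listed command not found on PATH.
missing_tools() {
    local tool
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

missing="$(missing_tools bash gs ebook-meta)"
if [ -n "$missing" ]; then
    # One line per missing command.
    printf '%s\n' "$missing" | sed 's/^/Not found on PATH: /'
else
    echo "All requirements found."
fi
```

A tool that is installed but reported as missing is usually just outside the search PATH, which is a different problem from not having it at all.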

Installing and setting up the script:
• Download the file pdfstretch.tar.gz attached below and extract the file pdfstretch.sh, which it contains.
• Save it somewhere; either save it in your search PATH (if you know what that means), or in the same folder as the PDFs you want to process.
• Open a terminal, and type in:
Code:
cd "/path/to/where/you/saved/it/"
replacing "/path/to/where/you/saved/it/" with the actual location of the folder where you saved the script.
• Type in:
Code:
chmod a+x pdfstretch.sh
this will make the script executable.

(It's probably possible to do the above through your Finder or File Manager, but I don't know how on a mac. That should work, though.)

Using the script
• Again, open a terminal.
• Navigate to the folder containing the PDF files you want to process.
• Type in:
Code:
./pdfstretch.sh "my-file.pdf"
...replacing "my-file.pdf" with the actual name of the file. (Leave off the ./ at the start if you saved the script in your search path instead of the folder you're in.)

This should process the file and create a new file named "my-file-stretched.pdf".
• To process many files at once, you should be able to do:
Code:
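for file in *.pdf ; do pdfstretch.sh "$file" ; done
and have it process all the PDFs in a given folder. (As above, use ./pdfstretch.sh if the script is in the current folder rather than in your search path.)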
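
One thing to watch with a loop over *.pdf is that a second run would also pick up the "-stretched.pdf" files produced by the first. A minimal sketch of a loop that skips them, assuming the output naming described above; the function names needs_stretching and stretch_all are hypothetical and not part of the attached script.

```shell
# Hypothetical helper: succeed only for PDFs that are not already output files.
needs_stretching() {
    case "$1" in
        *-stretched.pdf) return 1 ;;
        *.pdf)           return 0 ;;
        *)               return 1 ;;
    esac
}

# Hypothetical wrapper: run a command on every unprocessed PDF in the current folder.
# $1 = command to run, e.g. ./pdfstretch.sh
stretch_all() {
    local script="$1" file
    for file in *.pdf; do
        [ -e "$file" ] || continue            # no PDFs here: the glob stayed literal
        needs_stretching "$file" || continue  # skip earlier output
        "$script" "$file"
    done
}

# Usage: stretch_all ./pdfstretch.sh
```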

Caveats:
• Right now only the author, title and language metadata are preserved. This could perhaps be expanded upon.
• It does not try to create 9 cm × 12 cm pages, only pages whose aspect ratio is 3:4 or close to it (and even then it isn't always exactly that, though it should be close).
• I assume that Kindles and similar devices scale PDFs to fit on the screen, so that only the aspect ratio matters, not the actual page size, though I don't have a Kindle to test on. They may look very funny if printed, however. (Well, they look funny anyway.)
• Files are auto-cropped to the minimum region possible without deleting anything besides whitespace. Some pages may end up smaller than others; only the biggest pages will be 3:4. Various e-readers may treat those smaller pages differently than others.
• The squeezing/stretching is constant throughout the document. If you wanted shorter lines not to get squeezed, I think you're out of luck; PDFs do not know of "lines".
• The script has not been extensively tested.
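
To make the cropping and 3:4 caveats concrete, here is one way the numbers could be worked out. This is a sketch under stated assumptions, not the logic of pdfstretch.sh, which is not shown in this thread. It relies on Ghostscript's bbox device, which reports one "%%BoundingBox: llx lly urx ury" line per page; the function name largest_box_and_scale and the choice to keep the height and rescale only the width are hypothetical.

```shell
# Hypothetical helper: read "%%BoundingBox: llx lly urx ury" lines on stdin and
# print the largest content width and height found, plus the horizontal scale
# factor that gives that box a 3:4 (width:height) ratio with the height unchanged.
largest_box_and_scale() {
    awk '$1 == "%%BoundingBox:" {
             w = $4 - $2; h = $5 - $3
             if (w > maxw) maxw = w
             if (h > maxh) maxh = h
         }
         END {
             if (maxw > 0 && maxh > 0)
                 printf "%d %d %.4f\n", maxw, maxh, (maxh * 3 / 4) / maxw
         }'
}

# Assumed invocation; the bbox device writes its report to stderr:
#   gs -q -dNOPAUSE -dBATCH -sDEVICE=bbox my-file.pdf 2>&1 | largest_box_and_scale
```

Because a single factor is derived from the largest box, pages with less content end up with a different ratio, which matches the caveat that only the biggest pages will be 3:4.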

One cool thing about the implementation, however, is that the file is NOT rasterized. Text is deformed, but it's still text, so you can copy and paste, use dictionary functions, search, etc.
Attached Files
 pdfstretch.tar.gz (1.4 KB, 116 views)

Last edited by frabjous; 10-19-2010 at 04:23 PM.

 10-19-2010, 01:54 AM #11
kidblue

First off, this is above and beyond. Thanks so much for the hard work and the amazingly detailed description of the inner workings. It's helpful to a student like me. Real fellowship. Whoa. Super-cool.

Installation seemed to go fine, but execution is off. I've gotten the exact same error on two PDFs:

usage: mktemp [-d] [-q] [-t prefix] [-u] template ...
       mktemp [-d] [-q] [-u] -t prefix
mktemp: illegal option -- -
usage: mktemp [-d] [-q] [-t prefix] [-u] template ...
       mktemp [-d] [-q] [-u] -t prefix
Reading metadata...
./pdfstretch.sh: line 12: ebook-meta: command not found
./pdfstretch.sh: line 13: ebook-meta: command not found
./pdfstretch.sh: line 14: ebook-meta: command not found
Author(s) recognized as .
Title recognized as .
Language recognized as .
Analyzing page geometry...
./pdfstretch.sh: line 21: : No such file or directory
sed: -i may not be used with stdin
sed: -i may not be used with stdin
10-19-2010, 02:44 AM   #12
frabjous

Ugh. It looks like mac and linux use different syntax for some of the basic Unix tools. That's very frustrating.

Anyway, I've updated the script in an attempt to solve some of these problems, maybe not all, though.

Quote:
 Originally Posted by kidblue usage: mktemp [-d] [-q] [-t prefix] [-u] template ... mktemp [-d] [-q] [-u] -t prefix mktemp: illegal option -- - usage: mktemp [-d] [-q] [-t prefix] [-u] template ... mktemp [-d] [-q] [-u] -t prefix
I wish I knew where to find the right syntax for the mac version of mktemp! But I've changed the script so it doesn't use the --suffix option any more.
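
For what it's worth, the usage message quoted above shows the Mac mktemp taking only short options and a template, so a long option such as --suffix is rejected. A template given as a full path ending in X's should be accepted by both the GNU and the BSD versions. A minimal sketch of that approach, which is not necessarily what the revised script does; the names make_workdir and cropped.pdf are hypothetical.

```shell
# Hypothetical helper: create a private scratch directory using only the
# template form of mktemp (no long options), and print its path.
make_workdir() {
    mktemp -d "${TMPDIR:-/tmp}/pdfstretch.XXXXXX"
}

workdir="$(make_workdir)" || exit 1
trap 'rm -rf "$workdir"' EXIT      # remove the scratch directory when the script exits
tmp_pdf="$workdir/cropped.pdf"     # fixed names inside it can still end in .pdf
```

Creating a directory and choosing file names inside it sidesteps the need for a suffix on the temporary file itself.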

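Quote:
 ./pdfstretch.sh: line 12: ebook-meta: command not found ./pdfstretch.sh: line 13: ebook-meta: command not found ./pdfstretch.sh: line 14: ebook-meta: command not found
Weird. You did say you had calibre installed, right? Can you find a program called "ebook-meta" anywhere on your computer? Mine is at /opt/calibre/ebook-meta. What version of calibre are you using?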

Anyway, those errors shouldn't be fatal. They'll just mean that you'll lose the metadata from the file. But it would be nice to figure out what goes wrong.
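
A sketch of how a script could look for ebook-meta in more than one place before giving up. The function name find_ebook_meta is hypothetical, and both candidate directories are assumptions: /opt/calibre is the Linux location mentioned above, and /Applications/calibre.app/Contents/MacOS is where a Mac install is commonly said to keep its command line tools, which may differ between calibre versions.

```shell
# Hypothetical helper: print the first usable ebook-meta, checking PATH first
# and then each directory given as an argument.
find_ebook_meta() {
    local dir
    if command -v ebook-meta >/dev/null 2>&1; then
        command -v ebook-meta
        return 0
    fi
    for dir in "$@"; do
        if [ -x "$dir/ebook-meta" ]; then
            printf '%s\n' "$dir/ebook-meta"
            return 0
        fi
    done
    return 1
}

# Candidate directories are assumptions, see the note above.
if meta="$(find_ebook_meta /opt/calibre /Applications/calibre.app/Contents/MacOS)"; then
    echo "Using $meta"
else
    echo "ebook-meta not found; metadata would be lost." >&2
fi
```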

Quote:
 Author(s) recognized as . Title recognized as . Language recognized as .
Those are just byproducts of not finding ebook-meta.

Quote:
 ./pdfstretch.sh: line 21: : No such file or directory
I hope that's a byproduct of the other problems, but what happens if you type gs -v from the command line?

Quote:
 sed: -i may not be used with stdin sed: -i may not be used with stdin
Totally mysterious. I wasn't using stdin there. But I've edited the script to remove the -i flags.
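
Some background that may explain the difference: the BSD sed that OS X ships expects a backup-suffix argument after -i (it may be empty), while GNU sed treats the suffix as optional and attached to the flag. An empty file name, for instance one left over from the failed mktemp call, could also make sed think it was asked to edit stdin, though that is only a guess. One way to avoid -i altogether is to write to a temporary file and copy the result back, sketched below; the function name edit_in_place is hypothetical.

```shell
# Hypothetical helper: apply a sed expression to a file without using -i.
# $1 = sed expression, $2 = file to edit
edit_in_place() {
    local expr="$1" file="$2" tmp
    [ -f "$file" ] || { echo "edit_in_place: no such file: '$file'" >&2; return 1; }
    tmp="$(mktemp "${TMPDIR:-/tmp}/sededit.XXXXXX")" || return 1
    if sed -e "$expr" "$file" > "$tmp"; then
        cat "$tmp" > "$file"    # overwrite the contents, keeping the file's permissions
        rm -f "$tmp"
    else
        rm -f "$tmp"
        return 1
    fi
}

# Usage: edit_in_place 's/old/new/' some-file.txt
```

The explicit file check also turns an empty file name into a clear error message instead of a confusing one from sed.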

The revised script is in the post above. (I swapped out the old attachment.)

Last edited by frabjous; 10-19-2010 at 02:46 AM.

 10-19-2010, 07:46 AM #13
Nexutix

PDF squeezing would be quite useful for reading on the 7" screen of the Sony PRS 950SC, which is a little too tall to fit a regular A4 PDF. Any ideas on whether this could be used for that? And what would the advantage be?
10-19-2010, 08:36 AM   #14
frabjous

Quote:
 Originally Posted by patilsaurabhr PDF squeezing would be quite useful for reading on the 7" screen of the Sony PRS 950SC, which is a little too tall to fit a regular A4 PDF. Any ideas on whether this could be used for that? And what would the advantage be?
The process should be the same. You'd just need to adjust for the different aspect ratio, which I don't know off-hand. Are you using Windows, however?

10-19-2010, 09:49 AM   #15
Nexutix
with TBR of 500+ !!!

Posts: 578
Karma: 8250074
Join Date: Oct 2010
Device: Infibeam Pi, iPod Touch 4G
Quote:
 Originally Posted by frabjous The process should be the same. You'd just need to adjust for the different aspect ratio, which I don't know off-hand. Are you using Windows, however?
Yes, Windows 7.

The PRS 950SC manual has an aspect ratio that fits the screen. Here is the manual.
They have added Contents and Index links in the bottom fifth of each page. It's nice to be able to tap these links by hand and browse the way you would on the web.

So, I thought it would be great for the 950SC if we could add an index of the items we like in the bottom part of the page. Take a textbook as an example: in a chapter, the links at the bottom would point to the topics in that chapter. Perhaps we could come up with a program/script that converts normally bookmarked documents into this form by attaching an extension to each page, with the links in the bottom fifth? That would be nice for those users (users with a long 7" screen).

Last edited by Nexutix; 10-19-2010 at 09:53 AM. Reason: grammatical

