PDF Sucks and see here why...


TadW
07-12-2003, 08:33 PM
Two interesting links:

http://www.teleread.org/blog/2003_07_01_archive.html
and
http://www.yarinareth.net/caveatlector/archive/week_2002_10_20.html#e001024

I agree with the writers of these articles. For you, does it also take forever to start Adobe Reader? Are you also tired of having difficulties converting PDF to another format?

macumazahn
05-25-2004, 11:14 PM
I cannot get PDFs to convert to txt files properly. I have some old books in PDF that I would like to eventually convert to iSilo, but Adobe makes a mess of the file when converting to txt.

Chaos
01-05-2005, 02:17 AM
For you, does it also take forever to start Adobe Reader?

Ooh, I have some suggestions. ;)

Speed up Acrobat 7:
http://blogs.msdn.com/jonathanh/archive/2004/12/22/330288.aspx

Some discussion on speeding up Acrobat 6:
http://blogs.msdn.com/tims/archive/2004/11/24/269567.aspx

A utility to speed up Acrobat (should be top of the list there - or near it):
http://www.tnk-bootblock.co.uk/prods/misc/

Xpdf (another PDF reader - there are more out there):
http://www.foolabs.com/xpdf/



Yes, PDF isn't that great. But there's not much else to fill the space it currently holds (documents, like manuals, with some graphics - not just text, but mainly text). Microsoft Word (.doc)? More proprietary than PDF. HTML? Not easy to grab for offline viewing. OpenOffice.org documents? Not supported in "commonly used programs" (i.e. MS Word, though many open source word processors do support them). There are similar problems with a lot of other possibilities.

For conversion of PDFs (and PostScript files), have a look at Ghostscript.
http://www.cs.wisc.edu/~ghost/

And be sure to look at the manual - a PDF file, downloadable from:
http://www.cs.wisc.edu/~ghost/doc/merz.htm
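
For anyone who wants to script the conversion, here is a minimal sketch in Python. It assumes a reasonably recent Ghostscript whose binary is on the PATH as `gs` (on Windows it is usually `gswin32c` or `gswin64c`) and which includes the `txtwrite` output device; older releases may not have that device. The function names and file names are made up for illustration.

```python
import shutil
import subprocess


def build_gs_command(pdf_path, txt_path, gs_binary="gs"):
    """Return the argument list for a Ghostscript PDF-to-text run."""
    return [
        gs_binary,
        "-q",                        # suppress the startup banner
        "-dNOPAUSE",                 # do not wait for a keypress between pages
        "-dBATCH",                   # exit once the input has been processed
        "-sDEVICE=txtwrite",         # text-extraction output device
        "-sOutputFile=" + txt_path,  # where the extracted text goes
        pdf_path,
    ]


def pdf_to_text(pdf_path, txt_path, gs_binary="gs"):
    """Run Ghostscript; raises if the binary is missing or the run fails."""
    if shutil.which(gs_binary) is None:
        raise FileNotFoundError("Ghostscript binary not found: " + gs_binary)
    subprocess.run(build_gs_command(pdf_path, txt_path, gs_binary), check=True)


if __name__ == "__main__":
    # Hypothetical file names; only the command is printed here.
    print(" ".join(build_gs_command("book.pdf", "book.txt")))
```

Expect the output to need manual cleanup: the text comes out in whatever order and grouping the PDF stores it, which is not necessarily the order you would read it in.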

brahamt
01-05-2005, 09:38 AM
What I would like to do is somehow remove the option of having Acrobat open in the web client (i.e. download rather than open).

Does anyone know how to change it to that behavior?

Alexander Turcic
01-05-2005, 09:50 AM
This will stop you from loading a PDF in your web client (it doesn't stop Adobe from opening it in its separate process though):
1. Close your browser
2. Open Acrobat Reader
3. Select the "Edit" menu
4. Select "Preferences"
5. In the "Options" part of this dialog box, make sure "Display PDF in Browser" is not selected.

dwig
01-05-2005, 12:14 PM
PDF is both very good and horrid, depending on what you want to do with it. Adobe has _never_ wanted it to be anything other than a final display format; they have never wanted it to be any form of interchange format. They have, over the years since the beta test days (I was personally involved for a short time during the later beta cycle of the original release), gradually and grudgingly yielded to users' desires to use it as a data interchange format despite the fact that PDF's core PostScript-based architecture makes this _extremely_ difficult.

If you are looking for a final output format to display complex, graphics-intensive and layout-controlled documents, then PDF is a very good choice and often the best choice. You can easily create an electronic document that looks exactly like the printed document. This is what PDF was designed for.

On the other hand, if you want to port a document to some other format, using a PDF as either source material or as an intermediary format is unwise. It should be used in such a workflow only if there is no other choice and, even then, it should be done with the foreknowledge that it will be a difficult and bumpy road. The basic structural design of PDF makes automated conversion tools virtually impossible to design. The degree of artificial intelligence they must possess is beyond practical implementation.

I've worked as one of the principal software designers on several projects at Macromedia to import and export PDFs in FreeHand. FreeHand's spatial layout orientation made reading and converting PDFs much easier than it would be for a linear, beginning-to-end oriented word processing or ebook document. Still, we had great difficulty assembling the various data chunks in the PDF into FreeHand-type text blocks and graphic entities. More often than not, it was impossible to create code that could "think" about the page layout and decide which pieces should be assembled in what groupings and in what order.

The newer "tagged" PDF attributes help such importers to keep text in flowable text blocks, but some source documents don't lend themselves to good automated construction of tagged PDFs during export, and many PDF exporters don't have the option to generate such tagged PDFs. As a result, most PDFs found "in the wild" can't be reliably converted to any linear flowing format without extensive human interaction. Also, Adobe designed this tagging only as a tool for its Reader to use when it needs to reflow a document for display on devices with limited display real estate. The tags were not designed to ease conversion to other formats and their design is, as a result, not optimized for such use.
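
To make the ordering problem concrete, here is a toy sketch in Python. It does not parse a real PDF: the `Fragment` type, the coordinates and the sample page are all invented for illustration, and `y` is measured down from the top of the page for simplicity. It only shows why positioned text chunks cannot simply be sorted top-to-bottom and read off.

```python
from typing import List, NamedTuple


class Fragment(NamedTuple):
    x: float   # left edge of the chunk (hypothetical units)
    y: float   # distance from the top of the page
    text: str


def naive_reading_order(fragments: List[Fragment]) -> List[str]:
    """Sort top-to-bottom, then left-to-right, ignoring the layout."""
    return [f.text for f in sorted(fragments, key=lambda f: (f.y, f.x))]


def column_reading_order(fragments: List[Fragment], column_split: float) -> List[str]:
    """Read the left column fully, then the right one.

    column_split is supplied by hand here; working it out automatically
    for arbitrary pages is the hard part.
    """
    left = [f for f in fragments if f.x < column_split]
    right = [f for f in fragments if f.x >= column_split]
    return naive_reading_order(left) + naive_reading_order(right)


# A made-up two-column page, stored in no particular order.
PAGE = [
    Fragment(320, 120, "Right 2"),
    Fragment(50, 100, "Left 1"),
    Fragment(320, 100, "Right 1"),
    Fragment(50, 120, "Left 2"),
]

if __name__ == "__main__":
    print(naive_reading_order(PAGE))        # columns get interleaved
    print(column_reading_order(PAGE, 300))  # columns kept intact
```

The naive sort interleaves the two columns line by line. The column-aware version only works because the split position was given to it; sidebars, captions, footnotes and tables each need their own special handling, which is where the need for human interaction comes from.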

brahamt
01-05-2005, 04:05 PM
Thanks Alex. I tried this and the behaviour is better, but it is still opening the file, although in a separate Acrobat window. What I really want is for it to open a dialog and offer me open or save.

Alexander Turcic
01-05-2005, 04:26 PM
Thierry, then you must reassign the file associations for 'PDF'. If you are using Windows, try following these steps:
1. From the Control Panel, click the Folder Options icon.
2. When the Folder Options dialog box appears, select the File Types tab.
3. Choose the file type you want to remove from the Registered File Types list, and then click the Remove button.
4. Click OK.

brahamt
01-06-2005, 09:00 AM
Thanks Alex.

Gatton
01-06-2005, 09:45 AM
PDF is both very good and horrid, depending on what you want to do with it. Adobe has _never_ wanted it to be anything other than a final display format; they have never wanted it to be any form of interchange format.

Thanks for the detailed info dwig. I've always liked PDF when I needed to read something onscreen or print it out. I love that I can download things like forms (especially the myriad government forms) in PDF and print them out without having to get them from the post office or request them.

But conversion to another format such as txt or html has been a nightmare. I think I've tried all the inexpensive and free solutions. I hear the ones that cost an arm and a leg do a better job (Acrobat full version and that Gemini one) but I don't foresee being able to afford those anytime soon.

Maybe it's my fault for trying to make PDF do something that it's not supposed to do.

zanzibarfiction
07-04-2006, 06:25 AM
Dear All,

PDF is dying; why give it the kiss of life?

How would you like:

- to publish to the web fast without any technical knowledge.
- to protect your document so no one can print it or copy the text to clipboard.
- to publish your document so anyone with just a web browser can view it.
- to publish your document so it looks exactly the same as you see in your Windows application.

The answer is Adobe FlashPaper printer driver software. We sell FlashPaper interfaces which allow you to achieve all of the above. We are the only provider of the FlashPaper Plugin. Use our FlashPaper plugin and enforce your copyright and document format on the internet...fast.

Zanzibar Fiction (TM) - Enforce Your Copyright
http://www.zanzibarfiction.com

SerialAeon
07-04-2006, 01:29 PM
PDF is dying; why give it the kiss of life?


Well, I'm a scientist and believe me, PDF is a very healthy format in this field ;-) Almost all scientific publications are distributed in this format, as are all the archives. True, we still lack (1) really usable software to manage these PDFs in an iTunes-like database (but real ID3-like tags within PDF files are needed for this, as it's more an art than a science to automatically extract even the title of a paper from its PDF file), and (2) a device to read them (I'd give anything for an A4-sized e-ink based eBook device!). But anyway, PDFs are more than healthy for scientists.

Aurelien

zanzibarfiction
07-05-2006, 02:19 PM
FlashPaper produces files a lot smaller than PDF.

Steven Lyle Jordan
10-12-2006, 04:43 PM
Dear All,

How would you like:

- to publish to the web fast without any technical knowledge.
- to protect your document so no one can print it or copy the text to clipboard.
- to publish your document so anyone with just a web browser can view it.
- to publish your document so it looks exactly the same as you see in your Windows application.


Um...

You can do all that with PDFs right now.

Trip
01-17-2007, 07:53 AM
Speaking of PDFs and FlashPaper, you should check out the following site:

http://www.scribd.com/

It's a startup that provides free use of Macromedia FlashPaper. It's basically "YouTube for documents" and lets you do all sorts of stuff with PDF, FlashPaper, and other formats.

Please let me know what you think! Thanks.

Azayzel
01-25-2007, 05:24 AM
I think that one of the main problems that people have when converting from PDF to any other format happens when a) the fonts aren't embedded into the PDF, or b) the PDF was generated as a raster image.

a) If the fonts were embedded within the PDF, it would be a simple (albeit expensive) matter of opening the document within Acrobat Pro and copy/pasting the text into something more easily formatted into an eReader-type format. There are even a few free websites available that will convert the PDF to Word or RTF formats for you with a simple upload.

b) The second option for getting the data out of a PDF is a bit more tricky and time-consuming, but it is a necessity if the pages were rasterized. It involves running OCR on the document; I think Adobe even has a rudimentary OCR utility built into version 7.0. Of course, the better your OCR software, the better the result you'll get. You can then copy/paste or output the result to a format more friendly to your needs. One problem you may encounter with this method is a watermark that was placed in the PDF before it was rasterized, which can and will cause problems during the OCR process (if the PDF hasn't been rasterized, you can simply remove the watermark within Acrobat Pro). Fortunately, if the watermark is a different color than the rest of the document (e.g., red, green, or blue), you can load the rasterized image into Photoshop or some other graphics utility that lets you separate the colors used in the image, delete the channel with the watermark, and re-flatten the channels back to B/W. If the watermark spans multiple channels but does not exist on one, you're set; otherwise you'll have to live with it or seek some other type of program that can convert the image to the format you require.
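
Here is a rough sketch of the channel trick in plain Python, just to show the idea. The tiny "image" is invented (pure white paper, pure black text, pure red watermark), and the function names are hypothetical. It uses the channel in which the watermark is as light as the paper, so the watermark is invisible there, and thresholds that channel to B/W. Real scans have faded and anti-aliased colors, so the threshold would need tuning.

```python
WHITE = (255, 255, 255)  # paper
BLACK = (0, 0, 0)        # text
RED = (255, 0, 0)        # watermark (assumed pure red for this sketch)


def clean_channels(watermark_rgb, threshold=128):
    """Indexes (0=R, 1=G, 2=B) of channels where the watermark looks like paper."""
    return [i for i, value in enumerate(watermark_rgb) if value >= threshold]


def keep_channel(pixels, channel):
    """Reduce an RGB image (rows of tuples) to one channel's gray values."""
    return [[px[channel] for px in row] for row in pixels]


def to_black_and_white(gray, threshold=128):
    """Flatten gray values to pure black (0) or pure white (255)."""
    return [[0 if value < threshold else 255 for value in row] for row in gray]


# Made-up 2x3 scan: the watermark overlaps the paper, not the text.
IMAGE = [
    [WHITE, RED, BLACK],
    [RED, BLACK, WHITE],
]

if __name__ == "__main__":
    usable = clean_channels(RED)  # only the red channel hides a red watermark
    if usable:
        print(to_black_and_white(keep_channel(IMAGE, usable[0])))
    else:
        print("watermark is dark in every channel; this trick will not work")
```

If `clean_channels` comes back empty (a gray or black watermark, for example), that is the "live with it" case.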

IMO I do not see PDFs leaving the scene for a very long time to come; the format is just too universal/cross-platform to disappear and has been essentially adopted by the government (among others) as the de facto standard for storing documents. Sure, it may have some large file sizes, but you get what you see, and there are apps other than Adobe's that output PS and PDF files; Adobe just caught on quicker and marketed the product quite well.

Good luck!