Quote:
Originally Posted by Xenophon
Many documents are nearly that simple. But PostScript and its descendant, PDF, are really fully general page description languages. That means, among other things, that any glyph can be placed at any location on the page, in any size, rotation, thickness, etc. And there's a very rich language for computing those locations. In PostScript, it's Turing-complete (that is, you can use it to compute anything you can compute in any ordinary programming language). I'm not sure whether PDF is quite that complex. In any case, the simplest PDF files might not be difficult to de-tag and reflow, but add in even a little of that complexity and it gets a whole lot more difficult.
PDF really wasn't originally designed for reflow. It's really a page description language. Period.
Xenophon
P.S. I invite more knowledgeable geeky types to correct any mistakes I've made in this explanation.
Thanks for the additional info.
I went on the ADE forum to ask the question there, and here is the answer I got:
It's not an issue with the 505, this is expected behavior. When you zoom in on a PDF, the contents are "reflowed" - essentially stripping a lot of the formatting so we can enlarge the font sizes. Because of technical limitations with PDF files (and the current implementation of the reflow algorithms), we do not reflow across pages, so you will get gaps between pages.
Also, because of limitations with the PDF file structure itself (it is not easily reflowable content), line breaks will appear in odd places.
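To picture what "we do not reflow across pages" means, here is a rough Python sketch. It is only my guess at the general idea, not how ADE or the PRS firmware actually does it: the function name `reflow_per_page` and the sample text are made up for illustration. Each page is re-wrapped to a narrower width on its own, so the last line of a page ends short instead of pulling text up from the next page.

```python
import textwrap

def reflow_per_page(pages, width):
    """Re-wrap each page's text on its own; text never moves across a page boundary.

    Hypothetical helper for illustration only, not taken from any real reader.
    """
    out = []
    for text in pages:
        # Collapse the original line breaks, then wrap to the new, narrower width.
        words = " ".join(text.split())
        out.append(textwrap.wrap(words, width=width))
    return out

# Two "pages" whose sentence runs across the page boundary.
pages = [
    "The quick brown fox jumps over the lazy",
    "dog and then runs away into the forest.",
]

for lines in reflow_per_page(pages, width=16):
    for line in lines:
        print(line)
    print("--- page break ---")
```

The first page ends on the short line "the lazy" and "dog" only shows up after the page break, which is the kind of gap described in the answer.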
The bits I find interesting:
- They talk of the "current implementation" as being a cause, so there might be hope for a brighter future.
- They say "we", but what does it mean? I assume it's the PRS software that reflows text, not ADE?
I completely understand that PDF wasn't originally designed for reflow, but I still find it amazing how difficult, sometimes impossible, it is to get a proper HTML document from a PDF. Even Acrobat Pro can't get it right 90% of the time...
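For what it's worth, here is a small Python sketch of why recovering text from a PDF is harder than it looks. It assumes some extractor has already handed us text fragments as `(x, y, text)` tuples. That input format, the function name `fragments_to_lines`, and the `y_tolerance` value are all my own assumptions for illustration, not any real tool's API. Since a PDF only says where each piece of text goes on the page, the reading order has to be guessed from the coordinates.

```python
def fragments_to_lines(fragments, y_tolerance=2.0):
    """Group (x, y, text) fragments into lines of text.

    PDF y coordinates grow upward, so a larger y means higher on the page.
    Fragments whose y values are within y_tolerance of the line's first
    fragment are treated as sitting on the same line (a heuristic).
    """
    lines = []  # each entry: [line_y, [(x, text), ...]]
    # Visit fragments from the top of the page to the bottom.
    for x, y, text in sorted(fragments, key=lambda f: -f[1]):
        if lines and abs(lines[-1][0] - y) <= y_tolerance:
            lines[-1][1].append((x, text))
        else:
            lines.append([y, [(x, text)]])
    # Within each line, read left to right.
    return [" ".join(t for _, t in sorted(parts)) for _, parts in lines]

# Fragments can appear in the file in any order, with slightly uneven baselines.
fragments = [
    (150.0, 700.0, "world"),
    (72.0, 700.5, "Hello"),
    (72.0, 686.0, "second"),
    (120.0, 686.0, "line"),
]

print(fragments_to_lines(fragments))
```

This works for a single plain column, but it falls apart as soon as the page has two columns, rotated text, or footnotes, which I suppose is why even the commercial converters struggle.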