Quote:
Originally Posted by HarryT
If you look around, estral, you'll find innumerable discussions of this subject.
Basically, a PDF file is not a "book"; it's just a series of page images. It does not contain "text" - it knows nothing about paragraphs, lines, or even words. All that is in a PDF file is a series of instructions of the form "draw this shape at these coordinates".
The result of this is that a PDF file can rarely be converted into any other format very satisfactorily. All the information about the original "document" was "thrown away" during the process of creating the PDF.
This is just not true. All of that information (paragraphs, lines, and words) is there to see. Even your brain can identify it, and good software can identify it too.
Example: BOOKDESIGNER.
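
To make the point concrete, here is a rough Python sketch of how lines and paragraphs can be rebuilt from "draw this text at these coordinates" instructions. It assumes the positioned text fragments have already been pulled out of the PDF by some extraction tool. The `Fragment` class, the function names, and the thresholds are all made up for illustration, and this is not a description of how BOOKDESIGNER works internally.

```python
from dataclasses import dataclass


@dataclass
class Fragment:
    """One piece of positioned text (hypothetical structure)."""
    text: str
    x: float  # left edge of the fragment
    y: float  # baseline, measured downward from the top of the page (assumption)


def group_into_lines(fragments, y_tolerance=2.0):
    """Cluster fragments with nearly equal baselines into text lines.

    Returns a list of (y, text) tuples ordered top to bottom.
    """
    lines = []  # each entry: (baseline of first fragment, [fragments])
    for frag in sorted(fragments, key=lambda f: (f.y, f.x)):
        if lines and abs(frag.y - lines[-1][0]) <= y_tolerance:
            lines[-1][1].append(frag)
        else:
            lines.append((frag.y, [frag]))

    result = []
    for y, frags in lines:
        frags.sort(key=lambda f: f.x)  # restore left-to-right reading order
        result.append((y, " ".join(f.text for f in frags)))
    return result


def group_into_paragraphs(lines, normal_gap=14.0, slack=1.5):
    """Merge consecutive lines; a larger-than-usual vertical gap starts a new paragraph."""
    paragraphs = []
    prev_y = None
    for y, text in lines:
        if prev_y is not None and (y - prev_y) <= normal_gap * slack:
            paragraphs[-1] += " " + text
        else:
            paragraphs.append(text)
        prev_y = y
    return paragraphs
```

Feed `group_into_lines` a list of fragments, then pass its result to `group_into_paragraphs`. Real files need more care (multiple columns, hyphenation, headers and footers, varying font sizes), so treat the fixed thresholds here as placeholders rather than values that work everywhere.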