The PDF format typically doesn't preserve enough information about the structure of a document (paragraphs, headings, tables, reading order) to allow reliable conversion to other formats. I like to think of it as a "lossy" text format, of sorts.
Trying to extract a properly formatted document from a PDF is akin to hoping to recover a full-sized image by "enhancing" a small thumbnail.