BlackVoid, the problem is that the formatting in the .pdf is encoded as positional information (place this character in this font at these x,y coordinates), so one needs to analyse that to determine where paragraphs begin / end, &c.
Marcel Weiher wrote a utility, TextLightning, for Mac OS X (ob. discl.: it's shareware and I was a beta tester), and there are a few other tools which do this, e.g., SolidPDF for Windows.
All of this applies unless it's a ``tagged'' .pdf, where such structural information is embedded in the file structure.
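To give a rough idea of the kind of analysis involved, here is a minimal sketch in Python. It assumes the text runs have already been pulled out of the .pdf with their coordinates; the `Run` class, the function names, and the two thresholds (`tol`, `gap_factor`) are all made up for illustration and are not taken from TextLightning, SolidPDF, or any other tool. It groups runs that share a baseline into lines, then starts a new paragraph wherever the vertical gap between lines is noticeably larger than the font size.

```python
from dataclasses import dataclass


@dataclass
class Run:
    """A piece of text at a position (hypothetical structure for this sketch)."""
    text: str
    x: float     # left edge, in points
    y: float     # baseline, in points; PDF origin is bottom-left, y grows upward
    size: float  # font size, in points


def group_lines(runs, tol=2.0):
    """Group runs whose baselines lie within `tol` points into lines,
    ordered top to bottom, each line ordered left to right."""
    lines = []
    for run in sorted(runs, key=lambda r: -r.y):
        if lines and abs(lines[-1][0].y - run.y) <= tol:
            lines[-1].append(run)
        else:
            lines.append([run])
    return [sorted(line, key=lambda r: r.x) for line in lines]


def group_paragraphs(lines, gap_factor=1.5):
    """Join consecutive lines into one paragraph unless the baseline gap
    exceeds `gap_factor` times the font size (an assumed heuristic)."""
    paragraphs = []
    prev_y = None
    for line in lines:
        y, size = line[0].y, line[0].size
        text = " ".join(run.text for run in line)
        if prev_y is not None and (prev_y - y) <= gap_factor * size:
            paragraphs[-1] += " " + text
        else:
            paragraphs.append(text)
        prev_y = y
    return paragraphs
```

Calling `group_paragraphs(group_lines(runs))` on a list of `Run` objects returns one string per guessed paragraph. Real documents need a good deal more than this (multiple columns, indentation, hyphenation, headers and footers), which is why the dedicated tools exist.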
William