@user_none - if I could ask a question here. I've never looked at what a PDF looks like internally, so I have no real appreciation of the difficulties it causes. However, one thing I have noticed that the conversion *always* gets wrong is when a new sentence in an indented paragraph starts at the leftmost column.
Code:
    Some first line.
My second line.
This will always become two paragraphs when converted.
Out of technical curiosity (and ignorance), what is the issue with detecting this? And does the new PDF engine (which I know is on hold) address this?
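To make the case concrete, here is a small Python sketch of what I imagine might be going on. This is purely my guess and not how the converter actually works; both function names are hypothetical. The first function ends a paragraph whenever a line finishes with sentence punctuation, which would produce the split I see. The second only starts a new paragraph on an indented line, which is the result I would expect.

```python
def split_on_sentence_end(lines):
    """Guess at the failing behaviour: a line that ends in
    sentence punctuation is treated as the end of a paragraph."""
    paragraphs, current = [], []
    for line in lines:
        current.append(line.strip())
        if line.rstrip().endswith((".", "!", "?")):
            paragraphs.append(" ".join(current))
            current = []
    if current:
        paragraphs.append(" ".join(current))
    return paragraphs


def split_on_indent(lines):
    """What I would expect: only an indented line starts a new paragraph."""
    paragraphs = []
    for line in lines:
        if line[:1].isspace() or not paragraphs:
            paragraphs.append(line.strip())
        else:
            paragraphs[-1] += " " + line.strip()
    return paragraphs


# First line indented, second line starting at the leftmost column.
sample = ["    Some first line.", "My second line."]

print(split_on_sentence_end(sample))  # ['Some first line.', 'My second line.']
print(split_on_indent(sample))        # ['Some first line. My second line.']
```

I realise a real PDF probably does not hand over tidy lines with leading spaces like this, which may well be the answer to my question.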