So you end up with an approximation of the structure of the original document. Whether your reliability is 80%, 90%, or 60%, the simple fact remains that paragraph structure has been lost from a source document that originally had it. So you are now required to do a line-by-line A/B comparison of the Word document and the resulting HTML to make sure that structure is restored, which is no different from any other PDF conversion tool out there. Sure, your algorithms might be better than others, but unless they retain paragraph structure 100% of the time, there is always that additional review and editing of every page to be done.
I can appreciate that there are users out there for whom "close enough is good enough", but I am not among them, I'm afraid.