I've found that virtually every PDF converter program on the net uses the same open source program, pdftohtml:
http://pdftohtml.sourceforge.net/
It uses Ghostscript to extract images, and it operates in one of two modes:
1) Extract all images and dump them inline into the file, without preserving tables.
- Text comes out in paragraphs with random line breaks and looks very ugly.
- Tables are not preserved.
2) Extract each page background as a whole image, and create each page as a table.
- All formatting is preserved.
- The HTML document looks almost identical to the PDF.
Method 2 looks good, but it won't work for ebooks because the background image fixes the page size (no reflow).
Method 1 is used instead, even though no tables are preserved.
This is the same method that Acrobat 9 uses to export HTML 3.0.
Now, if your document has only a few tables, and you have a simple PDF with a few columns you want to reflow or change, you can use method 2.
Use pdftohtml (http://pdftohtml.sourceforge.net/) without images enabled.
Then convert the HTML to EPUB with tables enabled. You should get yourself a very respectable document, with intact, flowing/reflowing paragraphs that span multiple pages.
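
For anyone who wants to script the two steps above, here is a rough Python sketch. It only builds and runs the command lines. The flags are my reading of pdftohtml's options (-i to ignore images, -c for the complex per-page layout of method 2, -noframes for a single HTML file instead of a frameset), so check them against your installed version. The EPUB step assumes Calibre's ebook-convert, which is my choice and not something named above; I have not included any table-related option for it, so set that in whatever converter you use. The function names are made up for illustration.

```python
import subprocess


def pdftohtml_command(pdf_path, out_base, complex_layout=False, ignore_images=True):
    """Build a pdftohtml command line as a list of arguments (nothing is run here)."""
    cmd = ["pdftohtml", "-noframes"]  # -noframes: one HTML file, no frameset
    if complex_layout:
        cmd.append("-c")              # -c: complex output (method 2 style layout)
    if ignore_images:
        cmd.append("-i")              # -i: ignore images
    cmd += [pdf_path, out_base]
    return cmd


def epub_command(html_path, epub_path):
    """Build the HTML-to-EPUB step.

    Assumption: Calibre's ebook-convert is installed; any other
    HTML-to-EPUB converter could be swapped in here.
    """
    return ["ebook-convert", html_path, epub_path]


def run(cmd):
    """Run one step; raises CalledProcessError if the tool exits non-zero."""
    subprocess.run(cmd, check=True)
```

To try it, call run(pdftohtml_command("book.pdf", "book")), check which HTML file pdftohtml actually wrote (the name can differ between versions and modes), then call run(epub_command(...)) on that file. Passing complex_layout=True adds -c if you want to compare the two modes on the same PDF.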