I didn't realize the C++ portion had been updated and is now buggy. Assuming you get past that part, you will get perfect xhtml. Basically, what poppler does (when it works) is create a nice xhtml file that renders exactly like the original pdf. But aside from rendering perfectly and providing lots of info about the layout, it's still no good, because it has all the limitations of pdf: the text is a set of sentence fragments, not paragraphs. The reflow code then tries to turn all those sentence fragments into paragraphs. So the debug output will actually have the original xhtml from Poppler along with the reflowed version.
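To give a rough idea of what "turning sentence fragments into paragraphs" means, here is a minimal sketch. It is not the actual reflow code: the `Fragment` class, the `reflow` function and the `gap_factor` threshold are all made up for illustration, and it assumes each fragment is one line of text with a known vertical position and height. It simply starts a new paragraph whenever the vertical gap between two lines is larger than normal line spacing.

```python
from dataclasses import dataclass


@dataclass
class Fragment:
    """One positioned line of text (hypothetical structure, for illustration)."""
    top: float     # distance from the top of the page
    height: float  # line height
    text: str


def reflow(fragments, gap_factor=1.5):
    """Group line fragments into paragraphs based on vertical spacing.

    A new paragraph starts when the distance between the tops of two
    consecutive lines exceeds gap_factor times the previous line's height.
    """
    paragraphs = []
    current = []
    prev = None
    for frag in sorted(fragments, key=lambda f: f.top):
        if prev is not None and frag.top - prev.top > gap_factor * prev.height:
            # Gap is larger than normal line spacing: close the paragraph.
            paragraphs.append(" ".join(current))
            current = []
        current.append(frag.text.strip())
        prev = frag
    if current:
        paragraphs.append(" ".join(current))
    return paragraphs


if __name__ == "__main__":
    page = [
        Fragment(100, 10, "Basically what poppler does is"),
        Fragment(112, 10, "create a nice xhtml file."),
        Fragment(150, 10, "The reflow code then tries to"),
        Fragment(162, 10, "build paragraphs."),
    ]
    for para in reflow(page):
        print(para)
```

A real reflow pass has to deal with a lot more than this (columns, indents, headers and footers, hyphenated words), so treat it only as a picture of the basic idea.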
Last edited by ldolse; 07-04-2012 at 08:15 AM.