
One of the problems e-book fans encounter on a daily basis is the conversion of messy e-book formats (such as MS Word or Adobe PDF) into clean formats (such as HTML or XML).
Docvert 2.0 builds upon desktop word processors such as OpenOffice and Abiword to deal with the vagaries of Word documents.
The resulting OpenDocument XML is then optionally converted to HTML or any other XML format. This is done with XML Pipelines, an approach that supports XSLT, splitting content at headings or sections, and saving the results to multiple files (e.g., chapter1.html, chapter2.html…).
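To give a feel for the "split at headings, save to multiple files" step, here is a minimal sketch in Python. It is not Docvert's own code or pipeline syntax; the function names `split_at_headings` and `save_pages` are made up for illustration, and it assumes a simple, namespace-free XHTML document whose chapters each start with an `h1` element.

```python
import xml.etree.ElementTree as ET
from pathlib import Path


def split_at_headings(xhtml: str, heading_tag: str = "h1") -> list[str]:
    """Split the children of <body> into pages, starting a new page at each heading.

    Hypothetical helper: assumes well-formed XHTML without namespaces.
    """
    root = ET.fromstring(xhtml)
    body = root.find("body")
    if body is None:
        raise ValueError("document has no <body> element")

    # Group consecutive body children; each heading opens a new group.
    chunks: list[list[ET.Element]] = []
    for child in body:
        if child.tag == heading_tag or not chunks:
            chunks.append([])
        chunks[-1].append(child)

    # Wrap each group in its own minimal html/body skeleton.
    pages = []
    for chunk in chunks:
        page = ET.Element("html")
        page_body = ET.SubElement(page, "body")
        page_body.extend(chunk)
        pages.append(ET.tostring(page, encoding="unicode"))
    return pages


def save_pages(pages: list[str], out_dir: str) -> list[Path]:
    """Write each page to chapter1.html, chapter2.html, ... inside out_dir."""
    target = Path(out_dir)
    target.mkdir(parents=True, exist_ok=True)
    paths = []
    for number, page in enumerate(pages, start=1):
        path = target / f"chapter{number}.html"
        path.write_text(page, encoding="utf-8")
        paths.append(path)
    return paths
```

Calling `save_pages(split_at_headings(document), "out")` on a document with two `h1` headings would leave two files, `out/chapter1.html` and `out/chapter2.html`. A real pipeline would typically run an XSLT stage before this to turn the OpenDocument XML into HTML in the first place.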
Docvert is a server-side application that requires PHP 5.0 and either OpenOffice or Abiword to be installed. Being server-side, it offers a simple REST-style interface that can be used to integrate Docvert into, for instance, a public Web service.
[via Lifehacker]