03-16-2006, 05:50 AM
Docvert 2.0 converts MS Word files to clean HTML
One of the problems e-book fans encounter on a daily basis is converting messy formats (such as MS Word or Adobe PDF) into clean formats (such as HTML or XML). Docvert 2.0 builds on desktop word processors such as OpenOffice and Abiword to deal with the vagaries of Word documents.
The resulting OpenDocument XML can then optionally be converted to HTML or any other XML format. This is done with XML Pipelines, an approach that supports XSLT transformations, splitting content at headings or sections, and saving the results to multiple files (e.g., chapter1.html, chapter2.html…).
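To give a rough idea of what "splitting content at headings and saving to multiple files" means, here is a minimal Python sketch. It is not Docvert's code or its pipeline format: the function names `split_on_headings` and `write_chapters` are made up for illustration, and it assumes well-formed HTML with no XML namespace, where every chapter starts with an `<h1>` directly inside `<body>`.

```python
import xml.etree.ElementTree as ET
from pathlib import Path


def split_on_headings(xhtml, heading_tag="h1"):
    """Group the child elements of <body> into chunks, one per heading.

    Hypothetical helper: assumes namespace-free, well-formed markup.
    Loose text placed directly inside <body> is ignored.
    """
    body = ET.fromstring(xhtml).find("body")
    if body is None:
        raise ValueError("expected a <body> element")
    chunks = []
    for child in body:
        # Start a new chunk at each heading (or at the very first element).
        if child.tag == heading_tag or not chunks:
            chunks.append([])
        chunks[-1].append(child)
    # Serialize each chunk back to a markup string.
    return [
        "".join(ET.tostring(el, encoding="unicode") for el in chunk)
        for chunk in chunks
    ]


def write_chapters(chapters, out_dir):
    """Write each fragment to chapter1.html, chapter2.html, and so on."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    paths = []
    for number, fragment in enumerate(chapters, start=1):
        path = out / f"chapter{number}.html"
        path.write_text(f"<html><body>{fragment}</body></html>", encoding="utf-8")
        paths.append(path)
    return paths
```

Calling `write_chapters(split_on_headings(doc), "out")` on a document with two `<h1>` headings would produce `out/chapter1.html` and `out/chapter2.html`. A real pipeline would more likely do the split in XSLT; this only illustrates the idea.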
Docvert is a server-side application that requires PHP 5.0 and either OpenOffice or Abiword to be installed. It offers a simple REST-style interface that can be used to integrate Docvert into, for instance, a public Web service.