#1 |
Wizard
Posts: 4,520
Karma: 121692313
Join Date: Oct 2009
Location: Heemskerk, NL
Device: PRS-T1, Kobo Touch, Kobo Aura
Cleaner Word HTML
Has anyone seen this article? It describes using XML stylesheets to make Word export much cleaner XHTML code. I tried it on a few files and it does seem to work rather nicely.
Basically, you save the document as an XML file, but apply an XML stylesheet while saving. The result has the extension .xml, but is in fact a much cleaner HTML file.
#2 |
Enquiring Mind
Posts: 562
Karma: 42350
Join Date: Aug 2010
Location: London, UK
Device: Kindle 3 (WiFi)
Ohhh... that looks like excellent info! Thanks for posting that link, Toxaris. I need to brush up on my very rusty XSL and play around with this. I have had vague thoughts that, since the native .docx format used by MS Word 2007 and later is actually a compilation of XML data, it might be possible to create an XSL transformation that translates it into clean XHTML. So far I have never quite got around to investigating that, though. This provides an excellent starting point!
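To make the .docx idea a bit more concrete, here is a minimal sketch in Python. It is not an XSL transformation: the Python standard library has no XSLT processor, so this walks the XML tree directly instead. It assumes the usual .docx layout (a ZIP archive with the main text in `word/document.xml`) and only handles paragraphs and bold runs. The function name `docx_to_xhtml` is made up for illustration.

```python
import zipfile
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# WordprocessingML namespace used by the main document part of a .docx file
W = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def is_bold(run):
    """True if the run has a <w:b> property that is not switched off."""
    props = run.find(W + "rPr")
    if props is None:
        return False
    bold = props.find(W + "b")
    if bold is None:
        return False
    return bold.get(W + "val", "true") not in ("0", "false", "off")


def docx_to_xhtml(docx_file):
    """Return an XHTML fragment with one <p> per non-empty Word paragraph.

    Simplification: styles, tables, lists, images and everything except
    bold are ignored; paragraphs inside table cells come out as plain <p>.
    """
    with zipfile.ZipFile(docx_file) as zf:
        root = ET.fromstring(zf.read("word/document.xml"))

    lines = []
    for para in root.iter(W + "p"):
        parts = []
        for run in para.iter(W + "r"):
            # A run can hold several <w:t> text nodes; join and escape them
            text = escape("".join(t.text or "" for t in run.iter(W + "t")))
            if not text:
                continue
            parts.append(f"<strong>{text}</strong>" if is_bold(run) else text)
        if parts:
            lines.append("<p>" + "".join(parts) + "</p>")
    return "\n".join(lines)
```

Calling `print(docx_to_xhtml("book.docx"))` (the file name is just a placeholder) would print the stripped-down paragraphs. A real conversion would also need to map Word styles to headings and handle italics, lists and so on, which is where a proper stylesheet approach like the one in the article would probably be more comfortable.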
#3 |
Wizard
Posts: 4,520
Karma: 121692313
Join Date: Oct 2009
Location: Heemskerk, NL
Device: PRS-T1, Kobo Touch, Kobo Aura
Best part is, some example stylesheets are given! So you can try right away...