05-13-2014, 04:11 AM   #39
roger64
Wizard
Quote:
Originally Posted by skreutzer
.../... However, I would still like to see the work put into writer2xhtml be of further use, but it might require a huge commitment to investigate the current code and change it in a way that makes it less dependent and more flexible. Anyway, writer2xhtml could be integrated into such automated processing workflows; I even experimented with the writer2xhtml output initially. But writer2xhtml in itself doesn't change much in terms of the original problem, which is the lack of semantic markup in the source document, so it would still translate to “garbage in, garbage out”, and all direct formatting would still need to be removed from the writer2xhtml output so that only the raw text and the semantic, structural information remain.
If you can do this work, I think it will definitely be worth it. I had the opportunity to discuss this with Henrik Just, the writer2xhtml author. He was planning to design a new GUI when all work suddenly stopped. He was also aware of the “garbage in, garbage out” problem. But among the many options writer2xhtml provides, there is one that excludes "hard formatting" (see screenshot), which is what you call "direct formatting". I think it could still be of further use, because what is left for processing looks akin to your "semantic markup". This is the only one I still use today.
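
For illustration only, here is a rough Python sketch of what removing direct formatting from XHTML output could look like. It is not how writer2xhtml implements its option. The list of presentational attributes, the function names, and the rule that a span carrying a class (presumably coming from a named style) is kept are all my own assumptions.

```python
import xml.etree.ElementTree as ET

# Assumption: these attributes and inline wrappers count as "direct formatting".
PRESENTATIONAL_ATTRS = {"style", "align", "bgcolor"}
INLINE_WRAPPERS = {"span", "font"}


def local_name(tag):
    """Return the tag name without any '{namespace}' prefix."""
    return tag.rsplit("}", 1)[-1]


def _append_text(parent, index, text):
    """Attach text just before the child at `index` (or at the end of parent)."""
    if not text:
        return
    if index == 0:
        parent.text = (parent.text or "") + text
    else:
        prev = parent[index - 1]
        prev.tail = (prev.tail or "") + text


def _unwrap(parent, index):
    """Replace the child at `index` with its own text and children."""
    child = parent[index]
    grandchildren = list(child)
    parent.remove(child)
    _append_text(parent, index, child.text)
    for offset, grandchild in enumerate(grandchildren):
        parent.insert(index + offset, grandchild)
    _append_text(parent, index + len(grandchildren), child.tail)


def strip_direct_formatting(elem):
    """Drop presentational attributes, then unwrap spans left without attributes."""
    for attr in list(elem.attrib):
        if attr in PRESENTATIONAL_ATTRS:
            del elem.attrib[attr]
    for child in list(elem):
        strip_direct_formatting(child)
    i = 0
    while i < len(elem):
        child = elem[i]
        if local_name(child.tag) in INLINE_WRAPPERS and not child.attrib:
            _unwrap(elem, i)  # spliced-in children were already cleaned
        else:
            i += 1


def clean_fragment(markup):
    """Parse a well-formed fragment, clean it, and serialize it again."""
    root = ET.fromstring(markup)
    strip_direct_formatting(root)
    return ET.tostring(root, encoding="unicode")


sample = ('<p class="quote" style="margin-left:1cm">A'
          '<span style="font-weight:bold">B<em>C</em>D</span>E '
          '<span class="emphasis">F</span></p>')
print(clean_fragment(sample))
```

In this sketch the paragraph keeps its class, the bold span disappears while its text stays in place, and the span with a class survives. A real document would need a longer attribute list and probably some care with namespaces when serializing.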

I also have a messy, incomplete, but (for me) useful .ott file that I use consistently. I also export a custom stylesheet which contains the font-face declarations and some other style definitions. Either I use them when I fine-tune my EPUB, or I just discard them.
Attached Thumbnails
Name: formatting.png (52.8 KB)

Last edited by roger64; 05-13-2014 at 04:30 AM.