Quote:
Originally Posted by skreutzer
.../... However, I still would like to see that the work put into writer2xhtml is of further use, but it might require a huge commitment to investigate the current code and change it in a way that makes it less dependent and more flexible. Anyway, writer2xhtml could be integrated into such automated processing workflows, I even experimented with the writer2xhtml output initially, but writer2xhtml in itself doesn't change much in terms of the original problem, which is the lack of semantic markup in the source document, so it would still translate to “garbage in, garbage out”, while all direct formatting would still need to be removed from the writer2xhtml output so that only the raw text and semantic, structural information remains.
If you can do this work, I think it will definitely be worth it. I had the opportunity to discuss this with Henrik Just, the writer2xhtml author. He was planning to design a new GUI when all work suddenly stopped. He was also aware of the “garbage in, garbage out” possibility. But, among the many options writer2xhtml provided, there was one that excluded "hard formatting" (see screenshot), which is what you call "direct formatting". I think that option could still be of further use, because what is left for processing looks akin to your "semantic markup". It is the only option I still use today.
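For anyone who wants to try the same kind of cleanup on XHTML that has already been exported, here is a minimal sketch in Python. It is not how writer2xhtml does it. It assumes that direct formatting shows up as inline style attributes, and that class attributes carry the named styles you want to keep. Both are assumptions about the input, and the function names are made up for the illustration.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Return the tag name without any '{namespace}' prefix."""
    return tag.rsplit("}", 1)[-1]


def append_text(parent, index, text):
    """Attach text just before position `index` among parent's children."""
    if not text:
        return
    if index == 0:
        parent.text = (parent.text or "") + text
    else:
        previous = parent[index - 1]
        previous.tail = (previous.tail or "") + text


def unwrap(parent, child):
    """Replace `child` by its own text and children inside `parent`."""
    index = list(parent).index(child)
    grandchildren = list(child)
    append_text(parent, index, child.text)
    if grandchildren:
        # Text that followed the wrapper now follows its last child.
        last = grandchildren[-1]
        last.tail = (last.tail or "") + (child.tail or "")
    else:
        append_text(parent, index, child.tail)
    parent.remove(child)
    for offset, grandchild in enumerate(grandchildren):
        parent.insert(index + offset, grandchild)


def strip_direct_formatting(xhtml):
    """Drop inline styles, then unwrap spans left without attributes."""
    root = ET.fromstring(xhtml)
    # Assumption: direct formatting lives in 'style' attributes.
    for element in root.iter():
        element.attrib.pop("style", None)
    changed = True
    while changed:
        changed = False
        for parent in root.iter():
            for child in list(parent):
                if local_name(child.tag) == "span" and not child.attrib:
                    unwrap(parent, child)
                    changed = True
                    break
            if changed:
                break  # the tree changed, so restart the traversal
    return ET.tostring(root, encoding="unicode")


sample = (
    '<p class="Heading1" style="color:red">Hello '
    '<span style="font-weight:bold">bold</span> world</p>'
)
cleaned = strip_direct_formatting(sample)
print(cleaned)
```

Spans that keep a class attribute are left alone on purpose, since they presumably map to a named character style. For a real XHTML file with a namespace, you would probably also want to call ET.register_namespace("", "http://www.w3.org/1999/xhtml") before serializing, so the output does not get generated prefixes.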
I also have a messy, incomplete, but for me useful, .ott file that I use consistently. I also export a custom stylesheet which contains the font-face declarations and some other style definitions. I either use them when I fine-tune my EPUB or just discard them.