#46
Software Developer
Posts: 190
Karma: 89000
Join Date: Jan 2014
Location: Germany
Device: PocketBook Touch Lux 3
Quote:
Indeed, because it is always legally possible to provide binaries for any system, free or non-free. As I said, the question is whether it makes sense: first, whether it is technically possible to provide binaries, and second, whether ReactOS is considered a target platform that should be supported (and even if the answer is “no”, other people are entitled to provide binaries for it).
Quote:
As for the description of your workflow, it looks like you're doing semantic markup by replacing direct formatting with CSS classes. Once the input is polished, my main interest is in automating step 10, and in extending it so that a PDF is created automatically from the very same polished XHTML input. This isn't difficult; it just needs a little time to implement.

For steps 1-9 (starting at any point between 1 and 9), my original question in this thread was whether Sigil could make these steps easier by applying semantic markup to an input file. If a word processor or writing software provided semantic markup from the start, some of steps 1-9 could be eliminated in the first place, provided the author uses that software (if he doesn't, he has to pay you, I suppose).

You yourself describe the issue of "MSONormal center bold" and "MSONormal 18pt Bold", which I don't like and for which I'm looking for a solution. It is the fault of the word processor that it allows such direct formatting, which can't be processed automatically and therefore makes the output less usable as semantic markup. If the exported output is incomplete, that's pretty bad for the software that created it, especially if the information is present when proprietary formats are used.
Quote:
No, it's because of ethics, law and technological usability.
Quote:
Thanks for the description of your workflow! Since you work for a small-press “traditional” publisher, it seems you already have an advantage: cleaner manuscripts, long-term relationships with the authors (publishing more than one book, I assume, since the first book usually won't make money for the publisher), and so on. So I cited the “Not. Going. To. Happen.” because it might indeed be an issue that can never be solved.

On the other hand, if you get manuscripts in Microsoft Word which already contain direct formatting, I wonder whether, for the second and following books, it would be possible to introduce the “collaborative, web-based process” or a semantic writing software to your customers (giving them a lower price or some other benefit for the time they saved you). They would just click on a style like they now click on direct formatting (styles should still be displayed as true or approximate WYSIWYG). I can't imagine writing texts directly into an online form, but it could be used to paste text into and apply formatting there, instead of sending plain text with just some hints for the formatter. And as no advanced Word features are used, a different software could be used to write in (maybe a modified version of LibreOffice).

Is it that users think they won't be able to access their texts in the future if they aren't saved in Word format? Is there a mental barrier, a belief that texts not saved in Word can't look good on a screen or won't print nicely on the printer at home? Such considerations would indeed be an unsolvable problem, and I guess that's the real reason why we can't have semantic markup and automated processing workflows based on clean input. If that's accurate, nobody but the word processor developers could solve the issue, and since they're not going to do so, automated processing workflows and the front ends to feed them will remain a benefit for professional users only, even if they're easy to set up and commonly available to everybody.
Quote:
For print, I assume you use the InDesign PDF export after you've imported (copied?) the text from the Word source, right? My workflow would be: do the clean-up in LibreOffice, export to semantic XHTML, and generate EPUB and PDF from the XHTML automatically. Maybe the savings in time and manual labour are too small to justify an automated processing workflow, since the InDesign file is already used to generate both PDF and EPUB (at the expense of needing InDesign).
Quote:
Your last posts helped me a lot to understand the users' point of view more precisely. I just came to realize that writers might simply be “used to Microsoft Word”, which seems to do what they want (which is a bad trick of the word processors), and since writers have outsourced the e-book and print preparation to you guys, they never get to see the implications and consequences of their decisions (and might only wonder about the fee for something that looks fairly simple). Not only is Sigil a tool for the back end (nobody writes in it directly, and applying template styles for semantic markup is unlikely to happen in Sigil), there's also no way to create an alternative, because the user can't see the benefits of using it (unless you offer him a lower price or additional results or services if clean input is provided).

Developing a completely integrated, automated processing workflow from the writing front end to output in various formats as a single application is a very complex task, so it's probably a better idea to work on specialized processing workflows. I might get in touch with the LibreOffice people to ask whether a mode without direct formatting could be introduced, but as with margins, this could be considered contra-WYSIWYG and therefore not relevant. In any case, there are still lots of uses for automated processing workflows; there might just be no common solution for writers of ordinary texts, since this opportunity is blocked by technological decisions made a long time ago and supported up to the present day.

Thanks for your patience, hints and insights. At the very least, they saved me from wasting a lot of time on developing Sigil features which are urgently needed in other pieces of software.
#47
Bookmaker & Cat Slave
Posts: 11,503
Karma: 158448243
Join Date: Apr 2010
Location: Phoenix, AZ
Device: K2, iPad, KFire, PPW, Voyage, NookColor. 2 Droid, Oasis, Boox Note2
Very quickly:
I think, to speak to just one thing, you missed the funny in my telling of the lady's reply about the broken paragraphs. When she said, "I've just looked and took the first few pages, the Prologue, and justified the margins. This corrects most of the broken paragraphs in one move. I'll do it chapter by chapter as I go," she wasn't actually setting margins as the goal; she thought she was FIXING broken paragraphs, because the ragged line endings she could "see" with her naked eyes magically moved to the right margin (when she chose "justified"). I'd been trying to get her to fix the pilcrows (paragraph codes) appearing mid-sentence and mid-paragraph that are the inevitable output of ABBYY FineReader. THAT was the funny. Not the typographic aspects. That (broken paras) would have to be fixed whether in markup, Markdown, Word, OO, LaTeX or Bob's Big ePUB-Baker. ;-)

Hitch
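To illustrate what "fixing broken paragraphs" means in practice, here is a minimal Python sketch of one possible heuristic: a paragraph that does not end in terminal punctuation is assumed to have been split mid-sentence and is joined to the next one. The function name and the punctuation rule are assumptions made for this example, not anything taken from FineReader or any tool mentioned in the thread, and headings without a full stop would be merged by mistake, so the result still needs a human check.

```python
import re

# Assumption for this sketch: a "complete" paragraph ends with terminal
# punctuation, optionally followed by closing quotes or brackets.
TERMINAL = re.compile(r'[.!?:;\u2026]["\')\]\u201d\u2019]*$')


def merge_broken_paragraphs(paragraphs):
    """Join paragraphs that appear to have been split mid-sentence."""
    merged = []
    for paragraph in paragraphs:
        text = paragraph.strip()
        if not text:
            continue  # drop empty paragraphs left over from OCR
        if merged and not TERMINAL.search(merged[-1]):
            # The previous paragraph stops mid-sentence: glue this one on.
            merged[-1] = merged[-1] + " " + text
        else:
            merged.append(text)
    return merged


if __name__ == "__main__":
    sample = ["It was a dark", "and stormy night.", "", "Then it rained."]
    for line in merge_broken_paragraphs(sample):
        print(line)
```

Justifying the text, as in the anecdote, changes none of this: the stray paragraph breaks remain in the file regardless of how the line endings look on screen.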
#48
Wizard
Posts: 4,520
Karma: 121692313
Join Date: Oct 2009
Location: Heemskerk, NL
Device: PRS-T1, Kobo Touch, Kobo Aura
I will only respond to two parts of your dialogue, since I just don't agree with you. All noble initiatives aside, this is a lost cause in my opinion. To each his own, I guess...
Quote:
The old format was a closed proprietary format. The new one isn't, and can be used by anyone who wants to, regardless of the OS. The reason I created this is that I want another type of output, one not suited to a word processor. Actually, a lot of information that is very relevant to a word processor is thrown out the window, since it has no use in an e-book. So that is not good output for a word processor.

Without a doubt, someone can come up with a document that my add-in chokes on. I am always looking for documents that can actually do that, since it helps me improve the process. Still, the add-in was initially created to help my own process and that of some others. It grew to support more, but in the end I still make the decisions. I can say that if I no longer support the add-in, it will become open source. Until then, I keep it closed source for various reasons.
#49
Grand Sorcerer
Posts: 5,735
Karma: 24031401
Join Date: Dec 2010
Device: Kindle PW2
Quote:
Sigil is one of the exceptions, because, IMHO, there's no commercial product with a similar feature set for advanced users.
Quote:
Programmers who claim to provide free software which only 10% of end users can actually use are effectively discriminating against the other 90% of users. Saying that the other 90% are free to build the software themselves is akin to saying that millionaires are free to sleep under bridges. It would be more accurate to define such software as "free software for Linux/Unix users and experienced programmers only."

If you want to provide a useful software product that helps both authors and ebook producers, you'll have to get off your high horse and drop your very unproductive holier-than-thou attitude, because it'll get you nowhere.
#50
Software Developer
Posts: 190
Karma: 89000
Join Date: Jan 2014
Location: Germany
Device: PocketBook Touch Lux 3
Quote:
I don't mind if you think of me like this, but I wonder how you can think that the concept of free software could be "unproductive" and get me "nowhere". In general, there can't be any doubt that such an impression is false. For this specific case, I've already proven that I can implement the back end, and at the moment I'm working on generalizing it, so the missing front end is the topic of this thread. Even if I do nothing more on the topic, the LibreOffice XHTML output is quite sufficient for myself, but I'm looking for a solution which other people could use too: something I could give them so that they either write in that application from the start or later copy plain text into it and do the semantic markup there.
Last edited by skreutzer; 01-25-2014 at 08:30 AM.
#51
Grand Sorcerer
Posts: 28,635
Karma: 204624552
Join Date: Jan 2010
Device: Nexus 7, Kindle Fire HD
Quote:
In short: less talkey more codey.
#52
Wizard
Posts: 2,625
Karma: 3120635
Join Date: Jan 2009
Device: Kindle PW3 (wifi)
Hi
@skreutzer I am a Linux user, and I use mostly free software to produce EPUB (but not PDF...). From LibreOffice I use a converter from odt to EPUB which produces very usable xhtml files (and a very correct EPUB2). I use Sigil to add any other features I wish.

The converter is writer2xhtml. It works well with OpenOffice, but you need a patched version for LibreOffice 4 and beyond because of a minor Java problem. This software has not been developed for nearly 20 months and, hard as I have tried, I've had no news from its developer, Henrik Just, who is also the author of writer2latex. He just seems to have disappeared into thin air.

I know nothing about semantics, but maybe you could have a look at his efficient code (LGPL licence) to see what could be salvaged for further use.
Last edited by roger64; 01-26-2014 at 10:07 AM.
#53
Software Developer
Posts: 190
Karma: 89000
Join Date: Jan 2014
Location: Germany
Device: PocketBook Touch Lux 3
Thanks for your hint! Unfortunately, I think writer2epub and writer2latex aren't automatable, and for my purposes I would also like to be able to intervene in intermediate processing results to make custom adjustments. Since I've already implemented XML to EPUB2/EPUB3/LaTeX/XSL-FO converters, those transformations aren't rocket science to me. However, do you know of any special features of writer2epub and/or writer2latex which you think I could adopt from those tools? Do you want to produce PDF via LaTeX while writer2latex isn't working yet? Do you need any bug fixes for those tools, or do they just work fine?

As for writer2epub, its EPUB output could also be used as input for an automated processing workflow, so in general it could make sense to improve and support writer2epub. The only disadvantage of writer2latex is that LaTeX isn't XML, so I would consider it a target output format, not an intermediate format for automated processing workflows. Apart from that, I still appreciate such a tool, but I guess the OpenOffice/LibreOffice PDF output is already fairly nice, isn't it?

I'm also confident that the code of those tools handles much more than my simple, basic first version of html2epub, because those tools have been around for quite some time and I developed mine in less than a month of spare time, so there are probably lots of things for which I could find solutions in their code. However, I just implemented what I needed for my own purposes, while those tools probably aim for the most complete support of ODT features. I'm not sure I should have the same goal right from the start, as there are many other things which need work, too. But over time, more features may be added, maybe based upon writer2epub and writer2latex solutions.

Edit: Actually, I guess writer2latex is strongly affected by the direct formatting issue: it has to implement a LaTeX replacement for whatever may occur in ODT, and still might not be able to produce an identical result (but maybe it is). Semantic markup would simplify the case, so that OpenOffice/LibreOffice would be just the tool to write in and to apply style templates, and print typesetting by LaTeX would be an entirely separate step, resulting in a visual appearance different from the OpenOffice/LibreOffice WYSIWYG.
Last edited by skreutzer; 01-26-2014 at 12:01 PM.
#54
Wizard
Posts: 2,625
Karma: 3120635
Join Date: Jan 2009
Device: Kindle PW3 (wifi)
Please pay attention to this: I do not use writer2epub but writer2xhtml, written by Henrik Just. writer2xhtml is an extension for OpenOffice/LibreOffice. Unfortunately, I do not know writer2latex.

I can give writer2xhtml any odt file to convert, and it manages to export a clean EPUB2 that I can later perfect if need be. Among the nice features I often use are its ability to append an external CSS stylesheet and embedded fonts, and its convenient style mapping. The preference panel offers a broad choice of options, among them applying manual formatting (bad) or only style formatting (good). You can also fine-tune the splitting of your document and many other things. I have been using it for the last three years.

For PDF export, LibreOffice and OpenOffice are quite good, but they export odt files. What interests me is exporting EPUB to PDF, so that I can also benefit from the enhancements made using Sigil. Calibre can do it, but I prefer using Prince, which I find more precise for this task (Calibre is of course much more versatile and can be used for a gazillion other tasks).
Last edited by roger64; 01-27-2014 at 09:28 PM. Reason: PDF
#55
Guru
Posts: 698
Karma: 150000
Join Date: Feb 2010
Device: none
Quote:
Using it stand-alone is very convenient, as it can be configured/customized to use our house styles, etc. It seems much faster when run from the command line than the add-in used to be.

Hmmm, now that I think about it, I wonder if it could be configured to convert the "direct formatting" to our house character styles? I'll have to look into that. Completely automating that process would be playing with fire, though. Some of the conversions need the input from the "Mark-1 eyeball."

Albert
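The idea of converting direct formatting to house character styles while keeping a human in the loop could be sketched roughly as follows in Python. This is only an illustration: the style names, the run representation (text plus a set of formatting properties) and the function name are all hypothetical, and nothing here reflects how the stand-alone converter is actually configured.

```python
# Hypothetical mapping from direct-formatting combinations to house
# character styles; the names are made up for this sketch.
HOUSE_STYLES = {
    frozenset({"italic"}): "Emphasis",
    frozenset({"bold"}): "Strong",
}


def classify_runs(runs, house_styles):
    """Split text runs into safely converted ones and ones needing review.

    Each run is a (text, formatting) pair, where formatting is a set of
    direct-formatting properties such as {"bold"} or {"bold", "18pt"}.
    """
    converted, needs_review = [], []
    for text, formatting in runs:
        key = frozenset(formatting)
        if not key:
            converted.append((text, None))  # no direct formatting at all
        elif key in house_styles:
            converted.append((text, house_styles[key]))
        else:
            # Unknown combination: leave it for the "Mark-1 eyeball".
            needs_review.append((text, sorted(key)))
    return converted, needs_review


if __name__ == "__main__":
    runs = [("plain", set()), ("stress", {"italic"}), ("odd", {"bold", "18pt"})]
    done, review = classify_runs(runs, HOUSE_STYLES)
    print(done)
    print(review)
```

Only combinations with an unambiguous house style are converted; everything else lands in a review list instead of being guessed at.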
#56
Software Developer
Posts: 190
Karma: 89000
Join Date: Jan 2014
Location: Germany
Device: PocketBook Touch Lux 3
@roger64: Oh, sorry, it is my fault! I misread writer2xhtml as writer2epub, since I had read about the latter before, and writer2xhtml didn't come to mind as a separate tool because OpenOffice/LibreOffice already has an XHTML export integrated in the GUI. I looked a little into the description of those tools, and fortunately they're written in Java (portable, widely installed), LGPLed and usable as command line tools, so they would be automatable. Going by your description of the features, those tools could be the missing link between LibreOffice as the front end for semantic markup and an automated processing workflow as the back end, with the opportunity for highly customizable intervention at each processing step.

As you probably know, EPUB isn't much more than zipped XHTML files, so by unzipping an EPUB, you could use the files directly with an xhtml2latex or xhtml2fo tool I'm going to develop in the future. Still, the XHTML in the EPUB needs to be semantic (where Sigil may come back into the picture...).

@st_albert: Full automation could only be achieved if LibreOffice enforced semantic markup and prevented direct formatting (probably as a special mode). All you would have to do in such a case would be to match your house styles to the styles found in the ODT. If possible, you could export your house styles to the writer and he would import them, so there wouldn't even be a need to match styles. Still, LibreOffice seems not to be capable of such template exporting/importing, and it would only work for authors or content providers whom you can convince to use the templates with LibreOffice in the first place. If not, you or a formatter may use LibreOffice to apply your house styles to text written in another writing tool, word processor or text editor.

Now I'll go and try the standalone tools for myself, so that they can perhaps be integrated into a fully automated processing workflow. Without question, such an implementation would need quite some time, but once it is solved (and over time, I'm confident it will be), it can be used by everyone since it is free software, and also extended and improved for all kinds of special purposes and customizations.
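As a small illustration of the "EPUB is mostly zipped XHTML" point, the following Python sketch lists and reads the XHTML documents inside an EPUB using only the standard library. The function names and the demo archive are assumptions made for this example; the xhtml2latex and xhtml2fo tools mentioned above do not exist yet and are not shown.

```python
import io
import zipfile


def list_xhtml_members(epub_file):
    """Return the names of the XHTML content documents inside an EPUB."""
    with zipfile.ZipFile(epub_file) as archive:
        return sorted(
            name for name in archive.namelist()
            if name.lower().endswith((".xhtml", ".html", ".htm"))
        )


def read_member(epub_file, name):
    """Return one file from the EPUB container as text (UTF-8 assumed)."""
    with zipfile.ZipFile(epub_file) as archive:
        return archive.read(name).decode("utf-8")


def make_demo_epub():
    """Build a tiny in-memory archive so the sketch runs without a real book."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("mimetype", "application/epub+zip")
        archive.writestr("OEBPS/chapter1.xhtml", "<html><body><p>Hi</p></body></html>")
        archive.writestr("OEBPS/style.css", "p { margin: 0 }")
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    for member in list_xhtml_members(make_demo_epub()):
        print(member)
```

The extracted XHTML could then be handed to any later conversion step, provided its markup is semantic enough to be worth converting.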
#57
Software Developer
Posts: 190
Karma: 89000
Join Date: Jan 2014
Location: Germany
Device: PocketBook Touch Lux 3
I just successfully built the writer2latex package from source (latest version from the public repository) on gNewSense 3.0 with OpenJDK 1.6 for OpenOffice 3.2.1 and did a first conversion from ODT to valid XHTML. The result looks slightly cleaner than the adjustments I did by hand in my demo video (I used the same input ODT file). So writer2latex didn't save me much there, but I guess I may now benefit from all of the command line options, so that the conversion results can feed directly into an automated processing workflow, where I will now also try ODT to EPUB and LaTeX conversion. For the described operating system environment, I can probably provide some kind of support, including small fixes, if needed.

However, it would be most interesting to find out how dependent writer2latex is on OpenOffice libraries and APIs. Hopefully not much, so that it could even be modified to take input files other than ODT, like XHTML or custom XML. For the output, the writer2latex package tries to reproduce the visual appearance of the input ODT file as closely as possible, which isn't quite the best concept for automated processing, since there may be several output formats in which there's no sensible way to represent the OpenOffice WYSIWYG appearance even approximately. Additionally, in order to be flexible, it must be possible to replace portions of the data with custom content or styles. It might be too early to estimate how useful the package is and if/how it could be changed, but it already works out of the box, and I might use it to build a fully automated workflow in order to have a first brief demonstration of a real-world application.

In case such a solution is interesting for some of you, I guess it would be better to fork this conversation for writer2latex-specific issues and updates on my experiments with it. Please note that I'll refer to the standalone tools of the writer2latex package instead of the OpenOffice extension, since manually clicking on things isn't a solution for bulk processing, and the need to start OpenOffice is already eliminated if an ODT file is provided.

Update: Evidently, the writer2latex package makes extensive use of configuration files, which is ideal for automated processing. The user manual describes almost all the features I would want for the task, so my own solution would have looked quite similar (but probably without the goal of preserving the OpenOffice WYSIWYG as closely as possible), and by integrating writer2latex, the development time from 2002-2012 might be saved. There's still the issue of the dependency on OpenOffice API libraries, which might be OK or could be solved by replacing the dependencies with ordinary ODT XML reading. The documentation doesn't mention EPUB output, but the XHTML conversion allows the removal of direct formatting, style name matching and the insertion of custom stylesheet references. Even if the EPUB conversion doesn't support such features, the XHTML output itself would be sufficient to be integrated into a good EPUB, or the EPUB converter might be extended to provide the same features.

If the dependency on OpenOffice could be removed from the input side of the standalone tool, and (for instance) XHTML or plain text added as input formats (or an ODT generated from XHTML, plain text or RTF as the first step of the processing workflow), the package would be an invaluable part of automated XML document processing. Missing output formats and customization features could be added, predesigned configuration files provided, and GUI tools developed to enable ordinary authors to edit the configuration files.
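To make "removal of direct formatting" and "style name matching" concrete, here is a generic Python sketch that strips inline `style` attributes from an XHTML document and renames class names according to a mapping. It is not writer2latex's configuration format or code; the mapping entries and the function name are hypothetical, and the parser used here only accepts well-formed XML (named entities such as `&nbsp;` would need extra handling).

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"
# Keep XHTML as the default namespace when serializing again.
ET.register_namespace("", XHTML_NS)

# Hypothetical mapping from exported style names to house style names.
STYLE_MAP = {"MsoNormal": "body-text", "Heading_20_1": "chapter-title"}


def clean_xhtml(xhtml, style_map):
    """Drop inline styles and rename classes according to style_map."""
    root = ET.fromstring(xhtml)
    for element in root.iter():
        # Direct formatting lives in the style attribute: remove it.
        element.attrib.pop("style", None)
        classes = element.attrib.get("class")
        if classes is not None:
            renamed = [style_map.get(name, name) for name in classes.split()]
            element.set("class", " ".join(renamed))
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    source = (
        '<html xmlns="http://www.w3.org/1999/xhtml"><body>'
        '<p class="MsoNormal" style="text-align:center">Title</p>'
        "</body></html>"
    )
    print(clean_xhtml(source, STYLE_MAP))
```

A separate stylesheet reference can then carry all of the visual appearance, which is what makes the same XHTML usable for several output formats.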
Future efforts might go into limiting the things the writer2latex package tries to do itself, and combining it with specialized transformations which take care of complex processing steps in a more readable and customizable way. I'll continue to investigate.
Last edited by skreutzer; 01-27-2014 at 06:32 PM.
#58
Wizard
Posts: 2,625
Karma: 3120635
Join Date: Jan 2009
Device: Kindle PW3 (wifi)
Quote:
Anyway, I had published it on MR too, so you will find the links and the explanations for the patched version here: https://www.mobileread.com/forums/sho...&postcount=224
Hope it works for you.
Last edited by roger64; 01-27-2014 at 09:00 PM.
#59
Wizard
Posts: 2,625
Karma: 3120635
Join Date: Jan 2009
Device: Kindle PW3 (wifi)
Quote:
Maybe some MR expert could advise you about where to create such a specific topic on MR?
#60
Software Developer
Posts: 190
Karma: 89000
Join Date: Jan 2014
Location: Germany
Device: PocketBook Touch Lux 3
It seems writer2latex is pretty well programmed in terms of architecture. Unfortunately, the basic concept of the entire package seems to be to perform ODT to XYZ conversion as an inseparable step, which is configurable but relies heavily on OpenOffice. I started trying to detach the EPUB generator from the package to some extent in order to get a standalone XHTML to EPUB converter, which is difficult because the EPUB packer only takes XHTML that results from an ODT conversion. So without further modifications, writer2latex's EPUB generator can't take an arbitrary XHTML export from any writing software, unless the writing software is able to export to ODT in the first place or an XHTML to ODT conversion is performed as the first step. The benefit of writer2latex's concept is that configuration files probably need to be interpreted and applied only once instead of several times, but the configuration is probably different for each conversion anyway.

I don't know if it is worth investing the time in a standalone XHTML to EPUB converter which would just replicate what my html2epub tool already does, so I may abandon this first attempt, move on to LaTeX conversion, and use writer2latex's ODT to XHTML conversion as part of an automated processing workflow, whose output could then be the input for my html2epub tool.
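For readers wondering what a standalone XHTML to EPUB step involves, here is a minimal Python sketch of the container layout: the `mimetype` entry stored first and uncompressed, `META-INF/container.xml` pointing at the package file, and an OPF with manifest and spine. It is neither html2epub nor writer2latex code; all function names are made up for this example, and it omits the NCX/navigation document and other metadata, so the result would not pass validation as it stands.

```python
import io
import zipfile
from xml.sax.saxutils import escape

CONTAINER_XML = """<?xml version="1.0" encoding="UTF-8"?>
<container version="1.0" xmlns="urn:oasis:names:tc:opendocument:xmlns:container">
  <rootfiles>
    <rootfile full-path="OEBPS/content.opf" media-type="application/oebps-package+xml"/>
  </rootfiles>
</container>
"""


def build_opf(title, identifier, chapter_names):
    """Return a bare-bones OPF package document (no NCX, minimal metadata)."""
    manifest = "\n".join(
        f'    <item id="ch{i}" href="{escape(name)}" media-type="application/xhtml+xml"/>'
        for i, name in enumerate(chapter_names)
    )
    spine = "\n".join(
        f'    <itemref idref="ch{i}"/>' for i in range(len(chapter_names))
    )
    return f"""<?xml version="1.0" encoding="UTF-8"?>
<package xmlns="http://www.idpf.org/2007/opf" version="2.0" unique-identifier="bookid">
  <metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
    <dc:title>{escape(title)}</dc:title>
    <dc:identifier id="bookid">{escape(identifier)}</dc:identifier>
    <dc:language>en</dc:language>
  </metadata>
  <manifest>
{manifest}
  </manifest>
  <spine>
{spine}
  </spine>
</package>
"""


def pack_epub(target, title, identifier, chapters):
    """Write chapters (a dict of file name -> XHTML string) into an EPUB zip."""
    with zipfile.ZipFile(target, "w", zipfile.ZIP_DEFLATED) as archive:
        # The mimetype entry must come first and must not be compressed.
        archive.writestr("mimetype", "application/epub+zip",
                         compress_type=zipfile.ZIP_STORED)
        archive.writestr("META-INF/container.xml", CONTAINER_XML)
        archive.writestr("OEBPS/content.opf",
                         build_opf(title, identifier, list(chapters)))
        for name, xhtml in chapters.items():
            archive.writestr("OEBPS/" + name, xhtml)


if __name__ == "__main__":
    buffer = io.BytesIO()
    pack_epub(buffer, "Demo", "urn:uuid:hypothetical-id", {"ch1.xhtml": "<html/>"})
    print(len(buffer.getvalue()), "bytes written")
```

Because the packer only sees file names and strings, it does not care which writing software produced the XHTML, which is the separation the post is asking for.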
Tags: sigil, wysiwym, xml