It looks like reordering and/or splitting the text parts is the biggest remaining hurdle to overcome.
1. Splitting
Splitting has to be done at some point. It can be done:
- within the original odt file,
- by your extension, provided the file has a proper structure,
- in Sigil afterwards, if the user forgets or neglects to do it earlier.
I had to use Sigil, for example, because my original odt text had no structure. That was my mistake, not your extension's fault, but the splitting had to be done somewhere.
There is one OpenOffice extension that can split a text nicely, if it can be of any help: it's called writer2xhtml (latest version v 1.0.2). If you select this option, it mechanically splits a text into workable chunks. Maybe you could have a look at how it's done, because if your goal is to process a file without Sigil's help, you will have to give your extension this capability.
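To make the idea of mechanical splitting concrete, here is a minimal sketch in Python (for illustration only, not how writer2xhtml or your extension actually does it). It assumes the body of the document is already available as one XHTML string and that every chapter starts with a heading tag; the function name `split_at_headings` and the default tag `h1` are my own choices.

```python
import re


def split_at_headings(body_html, tag="h1"):
    """Cut an XHTML body string into chunks that each start at a <tag> heading.

    Assumption: chapters are marked with a heading tag. Any text before the
    first heading is kept as its own leading chunk.
    """
    # Positions where a heading tag opens, e.g. "<h1" or "<h1 class=...".
    heading = re.compile(r"<%s\b" % re.escape(tag), re.IGNORECASE)
    starts = [m.start() for m in heading.finditer(body_html)]
    if not starts or starts[0] != 0:
        starts.insert(0, 0)  # keep whatever precedes the first heading
    bounds = starts + [len(body_html)]
    chunks = [body_html[a:b].strip() for a, b in zip(bounds, bounds[1:])]
    return [c for c in chunks if c]  # drop empty chunks


if __name__ == "__main__":
    sample = "<p>Preface</p><h1>One</h1><p>a</p><h1>Two</h1><p>b</p>"
    for number, chunk in enumerate(split_at_headings(sample), start=1):
        print(number, chunk)
```

A text with no headings at all comes back as a single chunk, which is exactly the case where the splitting still has to be done by hand.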
2. Reordering
This is the most puzzling behaviour on my Linux box. I have no clue so far about the cause; all I know is that the text chunks do get mixed up. I would suggest adding an extra check to your macro, with some kind of reordering mechanism. I do not know whether that is doable, though.
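As a rough idea of what such a check could look like, here is a small Python sketch (only an illustration, the macro itself would need its own version). It assumes the chunks are files whose names carry a sequence number, such as the hypothetical `chunk_2.xhtml`, and it compares the order in which they are listed with the numeric order. I do not know whether this has anything to do with the actual cause of the mix-up.

```python
import re


def natural_key(name):
    """Sort key that compares embedded numbers as numbers: 2 before 10."""
    parts = re.split(r"(\d+)", name)
    return [int(p) if p.isdigit() else p.lower() for p in parts]


def check_order(names):
    """Return (is_ordered, expected_order) for a list of chunk file names.

    Assumption: each name contains its sequence number (hypothetical scheme).
    """
    expected = sorted(names, key=natural_key)
    return expected == list(names), expected


if __name__ == "__main__":
    listed = ["chunk_1.xhtml", "chunk_10.xhtml", "chunk_2.xhtml"]
    ordered, expected = check_order(listed)
    if not ordered:
        print("Chunks are out of order, expected:", expected)
```

If the check fails, the `expected` list can be used directly as the corrected order.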
3. Metadata
There is a small problem with OpenOffice. The author's name is already specified, even if you are processing a Dickens novel. The language is also frequently missing or wrong.
I don't think it's a big problem: the output just needs to be checked in Sigil.
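If you ever wanted the extension to warn about this itself, a check could be sketched like this in Python (illustration only). The assumptions are mine: the metadata is available as a dictionary with `creator` and `language` keys, and the pre-filled author is simply the OpenOffice user name, passed in as `office_user`.

```python
import re

# Rough shape of a language tag such as "en" or "fr-FR" (not a full validator).
LANG_TAG = re.compile(r"^[A-Za-z]{2,3}(-[A-Za-z0-9]{2,8})*$")


def check_metadata(meta, office_user):
    """Return a list of warnings about suspicious author/language metadata."""
    problems = []
    author = (meta.get("creator") or "").strip()
    if not author:
        problems.append("author is empty")
    elif author.lower() == office_user.strip().lower():
        # Assumption: the pre-filled author is the OpenOffice user name.
        problems.append("author equals the OpenOffice user name")
    language = (meta.get("language") or "").strip()
    if not language:
        problems.append("language is missing")
    elif not LANG_TAG.match(language):
        problems.append("language does not look like a tag such as 'en' or 'fr-FR'")
    return problems


if __name__ == "__main__":
    print(check_metadata({"creator": "roger64", "language": ""}, "roger64"))
```

It cannot tell that a well-formed language tag is the wrong one for the book, so a look at the output in Sigil is still needed.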
Last edited by roger64; 06-28-2010 at 05:59 AM.