I think the problem is the style of the original and how faithfully one wishes to follow it.
I found the unwrap factor could handle about 50% of the problem, and recourse to regular expressions probably got it up to 80-90%. However, if you need to be really faithful to the original then unfortunately the result has to be checked by hand, and this is where most of the time goes.
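To give an idea of the kind of regular-expression pass I mean, here is a minimal sketch in Python. It is not what calibre does internally; the function name `unwrap_lines` and the two rules are my own assumptions, and a real book would need more patterns than this.

```python
import re

# Assumed rule: a line break is "soft" (a wrap, not a paragraph end) when the
# character before it is not sentence-ending punctuation or another newline,
# and the next line starts with a lowercase letter.
SOFT_BREAK = re.compile(r"(?<![.!?:\n])\n(?=[a-z])")

# Assumed rule: a hyphen directly before a line break is a word split by wrapping.
SPLIT_WORD = re.compile(r"(\w)-\n(\w)")


def unwrap_lines(text: str) -> str:
    """Join hard-wrapped lines back into paragraphs (hypothetical helper)."""
    text = SPLIT_WORD.sub(r"\1\2", text)  # rejoin words split by a hyphen
    return SOFT_BREAK.sub(" ", text)      # turn soft breaks into spaces
```

Rules like these are exactly why the hand check is still needed: a genuine compound such as "well-known" that happens to break at the hyphen comes out as "wellknown", and a paragraph that starts with a lowercase letter gets glued to the previous one.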
The advantage of ecub is that it is very flexible and produces good, simple xhtml files which can be edited with ease. The problems surrounding the TOC also disappear, and mobi, epub and voice can all be produced at the same time.
I think that calibre does a very good job but is limited in how accurately its output can reproduce the original.