Hi,
WRT the need to avoid nesting spans (as discussed here), I have taken a look at the code that adds the KoboSpans. IIUIC, it loops through the nodes of each (X)HTML file in the EPUB and, when it finds a text node (i.e. a node made only of text, without tags), it splits the text on sentence boundaries and replaces the text-only node with a subtree of (possibly nested) spans.
If this is the case, wouldn't it be easier to add the spans as text? What I mean is that, when you find a text-only node, you could split it into sentences and then put it back together, enclosing each sentence in strings like '<span class="..." id="...">' and '</span>'. The resulting string would then be passed up as a text-only node.
This, if supported by the etree library, would be a lot easier to implement and would need very few special cases and exceptions...
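To make the idea a bit more concrete, here is a rough sketch using Python's standard xml.etree.ElementTree (not the plugin's actual code, which I have not reproduced). As far as I can tell, etree escapes any markup assigned to a node's text, so the wrapped string would have to be parsed again rather than passed up as plain text. The names wrap_sentences and replace_text_with_spans, the class "sentence-span", the "s1", "s2" id scheme and the naive sentence regex are all placeholders I made up for illustration; they are not the real KoboSpan values.

```python
import re
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Naive sentence splitter (an assumption for this sketch):
# break on whitespace that follows '.', '!' or '?'.
SENTENCE_RE = re.compile(r'(?<=[.!?])\s+')


def wrap_sentences(text, css_class="sentence-span", start_id=1):
    """Return markup with each sentence of `text` inside its own span.

    The class name and the id scheme are hypothetical placeholders.
    """
    parts = []
    for offset, sentence in enumerate(SENTENCE_RE.split(text.strip())):
        if not sentence:
            continue
        parts.append('<span class="%s" id="s%d">%s</span>'
                     % (css_class, start_id + offset, escape(sentence)))
    # Whitespace between sentences is normalized to a single space.
    return " ".join(parts)


def replace_text_with_spans(element):
    """Swap element.text for flat span children by re-parsing the string."""
    if not element.text or not element.text.strip():
        return
    wrapper = ET.fromstring("<wrapper>%s</wrapper>"
                            % wrap_sentences(element.text))
    element.text = None
    # Put the new spans in front of any children the element already has.
    for index, span in enumerate(wrapper):
        element.insert(index, span)


p = ET.fromstring("<p>One. Two &amp; three! <b>bold</b></p>")
replace_text_with_spans(p)
print(ET.tostring(p, encoding="unicode"))
```

The spans produced this way are never nested, because each sentence is wrapped independently. The sketch only handles a node's leading text; in etree the text that follows a child element lives in that child's tail attribute and would need the same treatment.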
Please bear with me if I said something stupid; I'm not comfortable with Python and Calibre plugin development (that's why I am not trying this suggestion myself).