@Kovid: Thanks for the clarification (and, of course, for the nice piece of software!)
@Manichean: The <p> tags are not unique. Clearly, I can match the chapter numbers - the very first regexp in this thread does that reliably.
I think I can do better than that, though, by processing the XHTML with a simple Python script that first matches the chapter numbers with the regexp and then merges each matching tag with the tag that follows it.
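A rough sketch of what such a script could look like. The pattern here is only a stand-in, not the regexp from the start of this thread: it assumes a chapter number is a <p> containing nothing but digits, immediately followed by a <p> holding the chapter title. Turning the merged pair into an <h2 class="chapter"> is likewise my own assumption; any single element would do.

```python
import re

# Assumption: a chapter number is a <p> whose only content is digits,
# directly followed by another <p> with the chapter title.
# Swap in the real chapter-number regexp from the thread as needed.
CHAPTER_PAIR = re.compile(
    r'<p(?:\s[^>]*)?>\s*(?P<num>\d+)\s*</p>\s*'
    r'<p(?:\s[^>]*)?>(?P<title>.*?)</p>',
    re.DOTALL,
)


def merge_chapter_headings(xhtml):
    """Merge each chapter-number paragraph with the paragraph after it."""
    def _merge(match):
        # The <h2 class="chapter"> wrapper is a hypothetical choice of target tag.
        return '<h2 class="chapter">{} {}</h2>'.format(
            match.group('num'), match.group('title').strip()
        )
    return CHAPTER_PAIR.sub(_merge, xhtml)


sample = '<p>1</p>\n<p>The Beginning</p>\n<p>Some body text.</p>'
merged = merge_chapter_headings(sample)
print(merged)
```

Paragraphs that merely start with a number are left alone, since the first <p> has to contain digits only. A regexp is fragile on nested markup, so this is only reasonable if the XHTML is as regular as it looks.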
A better possibility would be to construct a hidden TOC, but according to this thread
https://www.mobileread.com/forums/sho...d.php?t=105019 that seems to be broken at the moment for MOBI output.
Unfortunately, as far as I can see, no nice generic solution seems to be available.