There's an XPath tutorial in the usual place.
Did you try using \d+ instead of \d? Looking at what you copied as a chapter heading
Code:
<p class="MsoPlainText"><span>CHAPTER 2</span></p>
I suspect that something like
Code:
//h:p/h:span[re:test(.,'CHAPTER \d+','')]
ought to work. If that still generates four entries per chapter, I'd have a look at the HTML source of the book if I were you. In that case, I'd suspect there really are four chapter headings hidden somewhere (think inline TOC if there is one...).
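
If you want to check for hidden duplicates outside the conversion tool, here is a rough sketch in Python using only the standard library. It is not what the converter does internally. It walks every h:p/h:span and tests the text with Python's re.search, which only approximates the re:test() call in the expression above (the regex flavours can differ in details). The sample markup, including the linked TOC-style line, is made up for illustration, and find_chapter_spans is a hypothetical helper name.

```python
import re
import xml.etree.ElementTree as ET

# XHTML namespace bound to the "h" prefix, as in the XPath expression.
XHTML_NS = {"h": "http://www.w3.org/1999/xhtml"}

# Same pattern as in the XPath expression: "CHAPTER " followed by one or more digits.
CHAPTER_RE = re.compile(r"CHAPTER \d+")


def find_chapter_spans(xhtml_text):
    """Return the text of every h:p/h:span whose text matches CHAPTER_RE."""
    root = ET.fromstring(xhtml_text)
    hits = []
    for span in root.findall(".//h:p/h:span", XHTML_NS):
        # Join nested text too, so a heading wrapped in a link is still seen.
        text = "".join(span.itertext())
        if CHAPTER_RE.search(text):
            hits.append(text)
    return hits


# Made-up sample: one TOC-style entry plus the real heading.
SAMPLE = """<html xmlns="http://www.w3.org/1999/xhtml"><body>
<p class="MsoNormal"><span><a href="#ch2">CHAPTER 2</a></span></p>
<p class="MsoPlainText"><span>CHAPTER 2</span></p>
<p class="MsoPlainText"><span>Some ordinary text.</span></p>
</body></html>"""

if __name__ == "__main__":
    for hit in find_chapter_spans(SAMPLE):
        print(hit)
```

If the same heading text shows up more often than there are real chapters, the extra hits are probably the TOC or some other repeated block. This only works on well-formed XHTML; ET.fromstring will raise an error on sloppy HTML.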