Old 11-02-2010, 11:50 AM   #14
Manichean
Quote:
Originally Posted by janvanmaar View Post
I think my suspicion was correct. The XHTML looks like this:
Code:
<p class="P-kapit">2.</p>
<p class="P-P32">Name of chapter</p>
So it seems that I cannot grep over multiple lines directly as they are not passed together to the TOC creation engine.
I've seen multiline matching work before, so I'm guessing that the creation engine sees the whole source at once. The problem here is rather that the regex (or XPath, which is what you'd have to use for TOC creation) doesn't match the source, because the chapter number and the chapter name sit in two separate tags. I don't know XPath as well as I do regexes, but I'm guessing that
Code:
//h:p[re:test(., "[0-9]+\.")]/following-sibling::h:p[1][re:test(., "[a-z ]+", "i")]
should do the trick.

I don't know, however, if the matching works at all for multiple tags. Might be worth a try, though.
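
If you want to check the pairing idea outside the conversion, here is a rough Python sketch using only the standard library. It is not what the TOC engine runs, just a stand-in for the same logic: find a paragraph holding only a chapter number and take the paragraph right after it as the title. The sample markup, the function name chapter_titles and the anchored patterns (^...$) are my own assumptions for illustration.

```python
import re
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"

# Hypothetical sample modelled on the two paragraphs quoted above.
SAMPLE = """<html xmlns="http://www.w3.org/1999/xhtml"><body>
<p class="P-kapit">2.</p>
<p class="P-P32">Name of chapter</p>
<p class="P-body">Some body text, which should not become a TOC entry.</p>
</body></html>"""

# Anchored so that only a bare number such as "2." counts as a chapter marker.
NUMBER_RE = re.compile(r"^[0-9]+\.$")
# Letters and spaces only, case-insensitive (assumption: titles have no digits or punctuation).
TITLE_RE = re.compile(r"^[a-z ]+$", re.IGNORECASE)


def chapter_titles(xhtml):
    """Return (number, title) pairs for each number paragraph directly followed by a title paragraph."""
    root = ET.fromstring(xhtml)
    p_tag = "{%s}p" % XHTML_NS
    titles = []
    for parent in root.iter():
        children = list(parent)
        # Walk adjacent sibling pairs under the same parent element.
        for current, following in zip(children, children[1:]):
            if current.tag != p_tag or following.tag != p_tag:
                continue
            number = "".join(current.itertext()).strip()
            title = "".join(following.itertext()).strip()
            if NUMBER_RE.match(number) and TITLE_RE.match(title):
                titles.append((number, title))
    return titles


print(chapter_titles(SAMPLE))
```

If this finds the chapters you expect in your own file, the structure is regular enough that an XPath expression pairing the two paragraphs has a fair chance of working; if not, the markup probably varies more than the two lines above suggest.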