01-11-2011, 04:55 PM   #26
cybmole
Quote:
Originally Posted by ldolse View Post
Preprocess will go through your document and look for common chapter headings using a heuristic method. It should be able to mark up simple numeric headings like the ones you're listing in your code. If your doc already uses h1, h2, h3 tags, etc., then the heuristic processor disables itself, and you just need to look at your code and write the correct xpath.

If the preprocess stage finds a chapter header during its search, it wraps the heading in <h2> tags. It wraps subtitles, if they exist, in <h3> tags.

....
Yes, that often works, but not always. When it fails, it's either because the book has a chapters-within-parts/sections structure, or because the book is littered with span tags on the same HTML line that contains the chapter numbers. I am not yet sure which.

Reading your explanation again, maybe the logic engine sees SOME h2 tags, say on the section headers, and disables itself before the chapter numbers are processed?
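
To make the two guesses concrete, here is a toy Python sketch. It is NOT calibre's actual preprocess code: the function name `toy_preprocess`, both regexes, and the sample snippets are made up purely for illustration. It assumes the heuristic switches off completely as soon as it sees any h1-h3 tag, and that it only recognises a chapter number sitting bare inside a <p>.

```python
import re

# Hypothetical stand-in for the heuristic pass; NOT calibre's real code.
# Assumption: any existing <h1>, <h2> or <h3> switches the whole pass off.
EXISTING_HEADING = re.compile(r"<h[1-3][\s>]", re.IGNORECASE)
# Assumption: a chapter heading is a paragraph holding only a bare number.
NUMERIC_HEADING = re.compile(r"<p>\s*(\d+)\s*</p>")


def toy_preprocess(html):
    """Wrap bare numeric headings in <h2>, unless headings already exist."""
    if EXISTING_HEADING.search(html):
        # Guess 1: headings on the part/section titles disable everything.
        return html
    return NUMERIC_HEADING.sub(r"<h2>\1</h2>", html)


# Made-up sample snippets, one per scenario.
plain = "<p>1</p><p>Some text.</p><p>2</p><p>More text.</p>"
with_parts = "<h2>Part One</h2><p>1</p><p>Some text.</p>"
with_spans = "<p><span class='c1'>1</span></p><p>Some text.</p>"

for name, doc in [("plain", plain),
                  ("with_parts", with_parts),
                  ("with_spans", with_spans)]:
    print(name, "->", toy_preprocess(doc))
```

In this sketch only the plain sample gets its chapter numbers wrapped. The sample with a part heading comes back untouched because the pass bails out early, and the sample with span tags comes back untouched because the number is no longer bare inside the <p>. Either cause gives the same symptom, which would explain why I can't tell them apart just by looking at the output.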

PS: thanks for explaining how the preprocess and xpath steps interact.

Last edited by cybmole; 01-11-2011 at 04:58 PM.