MobileRead Forums - View Single Post - structure detection

cybmole · 01-11-2011, 04:55 PM

Quote:

Originally Posted by ldolse

Preprocess will go through your document and look for common chapter headings using a heuristic method - it should be able to mark up simple numeric headings like the ones you're listing in your code. If your doc already uses h1, h2, h3 tags, etc., then the heuristic processor disables itself - you just need to look at your code and write the correct xpath.

If the preprocess stage finds a chapter heading during its search, it wraps the heading in <h2> tags. It wraps subtitles, if they exist, in <h3> tags.

....

Yes, that often works, but not always. It's either because a book has a chapters-within-parts/sections structure, or because the book is littered with span tags in the same HTML line that contains the chapter numbers. I am not yet sure which.
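To show what I mean by the span problem, here is a rough Python sketch. The pattern is my own toy example, not the regex the preprocess stage actually uses, and the function and variable names are made up for illustration. It only shows how a pattern that expects a bare chapter number can miss a line where the number is wrapped in span tags, unless those tags are stripped first.

```python
import re

# Toy pattern (an assumption, not the real preprocess regex): a paragraph
# whose entire content is a chapter number such as "12" or "Chapter 12".
NAIVE = re.compile(r"<p[^>]*>\s*(?:chapter\s+)?\d+\s*</p>", re.IGNORECASE)

# Matches opening and closing span tags so they can be removed.
SPAN = re.compile(r"</?span[^>]*>", re.IGNORECASE)

def is_chapter_line(line, strip_spans=False):
    """Return True if the line looks like a bare chapter-number paragraph."""
    if strip_spans:
        line = SPAN.sub("", line)
    return NAIVE.search(line) is not None

# Hypothetical sample lines, not taken from any real book.
plain = "<p>12</p>"
littered = '<p class="c1"><span class="s3">12</span></p>'

print(is_chapter_line(plain))
print(is_chapter_line(littered))
print(is_chapter_line(littered, strip_spans=True))
```

In this toy case the plain line matches, the span-littered line does not, and it matches again once the spans are stripped.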

Reading your explanation again, maybe the logic engine sees SOME h2 tags, say on the section headers, and disables itself before the chapter numbers are processed?
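If that guess is right, the behaviour would look roughly like this toy Python model. It is only a sketch of my reading of the explanation, not the actual preprocess code: the patterns, the function name `toy_preprocess`, and the sample strings are all hypothetical.

```python
import re

# In this toy model, any existing h1-h3 tag counts as "already marked up".
EXISTING_HEADING = re.compile(r"<h[1-3][\s>]", re.IGNORECASE)

# Toy chapter pattern (an assumption): a paragraph holding only a number.
CHAPTER_PARA = re.compile(r"<p[^>]*>\s*(\d+)\s*</p>", re.IGNORECASE)

def toy_preprocess(html):
    """Wrap bare chapter numbers in <h2>, unless headings already exist."""
    if EXISTING_HEADING.search(html):
        return html  # heuristic disables itself; document is left untouched
    return CHAPTER_PARA.sub(r"<h2>\1</h2>", html)

# Hypothetical inputs: one with no headings, one with a section-level <h2>.
no_headings = "<p>1</p><p>Some text.</p><p>2</p>"
with_sections = "<h2>Part One</h2><p>1</p><p>Some text.</p><p>2</p>"

print(toy_preprocess(no_headings))
print(toy_preprocess(with_sections))
```

In the second case the single section-level h2 is enough to switch the heuristic off, so the chapter numbers stay as plain paragraphs and would need their own xpath.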

PS: thanks for explaining how the preprocess and xpath steps interact.