02-21-2011, 03:26 AM   #4
ldolse
Wizard
Maybe the tutorial didn't make handling that kind of situation clear. If your chapters use words describing numbers, like your example or theducks' example, then you can use '.*' or "[A-Z' ]+" as the regex in the chapter detection xpath.
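
To make the difference between those two patterns concrete, here is a rough Python sketch. The heading strings are made up for illustration, and the sketch uses fullmatch so the contrast is visible; it doesn't claim anything about how the chapter detection box anchors the pattern.

```python
import re

# Hypothetical heading texts, invented for this illustration.
headings = ["SEVENTEEN", "TWENTY ONE", "THE DUCK'S TALE", "Seventeen", "Chapter 3"]

any_text = re.compile(r".*")          # accepts any heading content
caps_words = re.compile(r"[A-Z' ]+")  # only capitals, apostrophes and spaces


def matches_whole(pattern, text):
    """Return True when the pattern covers the entire heading text."""
    return pattern.fullmatch(text) is not None


caps_hits = [h for h in headings if matches_whole(caps_words, h)]
any_hits = [h for h in headings if matches_whole(any_text, h)]

print(caps_hits)
print(any_hits)
```

The all-caps pattern skips mixed-case headings and anything containing digits, while '.*' accepts whatever is inside the tag.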

Alternatively, if all chapters use the exact same heading tag, and that heading isn't used elsewhere, you can just configure Calibre to build the TOC based 'only' on the heading tag itself, regardless of the contents of the tag.

I'll try to make those cases a bit more obvious.

If the chapter headings are all numbers written out as lower case words in plain <p> tags, you're sort of out of luck, e.g.:
<p>Seventeen</p>

There isn't any good pattern there except for the fact that it's a short word/phrase without punctuation. When all else fails, Heuristics will actually look for points like that and add page breaks, but it won't wrap them in <h2> tags because of a higher chance of false positives.
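
As a rough illustration of why a "short phrase without punctuation" test is risky, here is a minimal Python sketch. This is not the actual Heuristics code; the function name looks_like_heading and the max_words cutoff are assumptions made up for the example.

```python
import re


def looks_like_heading(text, max_words=4):
    """Guess whether a paragraph is a bare chapter heading.

    Hypothetical rule: a few words, letters only, no sentence punctuation.
    """
    stripped = text.strip()
    if not stripped:
        return False
    if len(stripped.split()) > max_words:
        return False
    # Allow letters, spaces, apostrophes and hyphens; reject , . ! ? digits etc.
    return re.fullmatch(r"[A-Za-z][A-Za-z' -]*", stripped) is not None


print(looks_like_heading("Seventeen"))          # a real heading
print(looks_like_heading("He left, quietly."))  # ordinary sentence
print(looks_like_heading("Yes"))                # one-word paragraph of dialogue
```

A one-word paragraph such as "Yes" passes the same test as "Seventeen", which is the kind of false positive that makes wrapping these in heading tags unsafe.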

However if it's like this:
<h3>Seventeen</h3>

Then you can use '.*' as your regex, or bypass regex altogether and just use '//h:h3' in the xpath box.
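
If you want to see what a tag-only expression like that selects, here is a small sketch using Python's standard library rather than calibre itself. The sample markup is invented, and the mapping of the h prefix to the XHTML namespace is an assumption of this sketch. Note that ElementTree needs a leading "." on the path, so the sketch uses './/h:h3' where the xpath box would take '//h:h3'.

```python
import xml.etree.ElementTree as ET

# Assumed prefix mapping for this sketch: h -> XHTML namespace.
XHTML_NS = {"h": "http://www.w3.org/1999/xhtml"}

# Invented sample document, for illustration only.
sample = """<html xmlns="http://www.w3.org/1999/xhtml">
  <body>
    <h3>Seventeen</h3>
    <p>Some text.</p>
    <h3>Eighteen</h3>
    <p>Nineteen</p>
  </body>
</html>"""

root = ET.fromstring(sample)

# Select every h3 regardless of its contents; <p> elements are ignored.
titles = [h3.text for h3 in root.findall(".//h:h3", XHTML_NS)]
print(titles)
```

Every <h3> is picked up whatever its text is, and the <p>Nineteen</p> line is left alone, which is why this only works when the heading tag isn't used for anything else.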

Last edited by ldolse; 02-21-2011 at 03:33 AM.