Quote:
Originally Posted by Humble
I have read quite a few threads before posting but they do not help me. I am trying to create a table of contents for my books. Can someone explain how to do this in layman's terms? I went to the XPath tutorial and I don't understand all that stuff. Can anyone clarify the simplest way to get a table of contents in my books?
The default (Structure Detection) is:
Code:
//*[((name()='h1' or name()='h2') and re:test(., 'chapter|book|section|part\s+', 'i')) or @class = 'chapter']
What it means is that calibre will assume chapters start at either <h1> or <h2> tags that contain any of the words chapter, book, section or part (in any mixture of upper and lower case), or at any tag that has the class="chapter" attribute.
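To get a feel for what the default test accepts, here is a minimal Python sketch that applies the same regular expression to some made-up headings. It assumes that Python's re.search with IGNORECASE is a fair stand-in for re:test with the 'i' flag; the function name and the sample headings are invented for illustration.

```python
import re

# The regex from the default expression. Python's `re` module is used here
# as an approximation of re:test(., '...', 'i'); that equivalence is an assumption.
DEFAULT_PATTERN = re.compile(r"chapter|book|section|part\s+", re.IGNORECASE)


def looks_like_chapter(heading_text):
    """Return True if the heading text passes the default regex test."""
    # search() matches anywhere in the text, not only at the start.
    return DEFAULT_PATTERN.search(heading_text) is not None


# Hypothetical heading texts, as they might appear inside <h1> or <h2> tags.
headings = ["Chapter 1", "PART TWO", "Part", "Prologue", "My Notebook"]
for text in headings:
    print(f"{text!r}: {looks_like_chapter(text)}")
```

In this sketch "Chapter 1", "PART TWO" and "My Notebook" pass, while "Part" and "Prologue" do not. The \s+ belongs only to the last alternative, so "part" must be followed by whitespace, and the other words match anywhere, even inside a longer word such as "Notebook".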
If you are editing the ebooks yourself, just put the chapter headings in h1 or h2 tags with Chapter (say) in the heading, and/or give them the class 'chapter'. Or see below for other XPath expressions you might use.
When generating a TOC for purchased ebooks, I have found that different ebooks need different XPath expressions.
Versions that select all <h1> and <h2> tags (the second also selects <h3>):
Code:
//*[name()='h1' or name()='h2']
//*[name()='h1' or name()='h2' or name()='h3']
A version like the default that in addition looks for numbers in the tag contents:
Code:
//*[((name()='h1' or name()='h2') and re:test(., 'chapter|book|section|part\s+|0|1|2|3|4|5|6|7|8|9', 'i')) or @class = 'chapter']
A version that looks for tags whose contents are all capitals (no lowercase letters):
Code:
//*[((name()='h1' or name()='h2') and re:test(., '^[^a-z]+$')) or @class = 'chapter']
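The all-capitals test is easy to misread, so here is a small Python sketch of what the pattern accepts. It again assumes Python's re module behaves like re:test for this pattern; the function name and sample strings are made up.

```python
import re

# Same pattern as in the expression above. There is no 'i' flag here,
# so lowercase letters really are rejected.
NO_LOWERCASE = re.compile(r"^[^a-z]+$")


def has_no_lowercase(heading_text):
    """Return True if the text is non-empty and contains no a-z letters."""
    return NO_LOWERCASE.search(heading_text) is not None


# Hypothetical heading texts.
for text in ["THE LONG ROAD", "The Long Road", "1984", ""]:
    print(f"{text!r}: {has_no_lowercase(text)}")
```

Note that the pattern only rules out lowercase letters, so a heading made of digits or punctuation alone (such as "1984") also passes, while an empty heading does not.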
Any element (first line), or just <p> tags (second line), whose text starts with "Chapter ":
Code:
//*[re:test(., '^chapter ', 'i')]
//h:p[re:test(., '^chapter ', 'i')]
Sometimes I first run the book once through calibre (with --pretty-print), and if this does not produce a good TOC I run it through again, keying on one of calibre's classes. Often calibre1 is what is needed, either alone or combined with a test like those used above, but unzip the EPUB and look inside to see what is needed in your case:
Code:
//*[@class = 'calibre1']
//*[@class = 'calibre1' and re:test(., 'chapter|book|section|part\s+|0|1|2|3|4|5|6|7|8|9', 'i')]
//*[@class = 'calibre1' and re:test(., '^[^a-z]+$')]
With any of these, I sometimes need --use-auto-toc so that calibre uses the generated TOC. However, --use-auto-toc isn't always a good idea, because it replaces any existing TOC, and the existing one might be fine.
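If unzipping the EPUB and reading the HTML by hand is tedious, a short Python script can list which class values occur in the book. This is only a rough sketch using the standard library: the function name count_class_values is made up, it assumes the HTML files are UTF-8 and use double-quoted class attributes, and it does not tell you which class marks the headings, only which ones exist.

```python
import re
import zipfile
from collections import Counter

# Matches class="..." attributes; assumes double quotes around the value.
CLASS_ATTR = re.compile(r'class="([^"]+)"')


def count_class_values(epub_path):
    """Count how often each full class attribute value appears in the book."""
    counts = Counter()
    # An EPUB is a zip archive, so zipfile can read it directly.
    with zipfile.ZipFile(epub_path) as book:
        for name in book.namelist():
            if name.lower().endswith((".html", ".xhtml", ".htm")):
                text = book.read(name).decode("utf-8", errors="replace")
                # Count the whole attribute value, because the XPath test
                # @class = 'calibre1' compares against the whole value too.
                counts.update(CLASS_ATTR.findall(text))
    return counts
```

For example, count_class_values("book.epub").most_common(10) (with "book.epub" standing in for your own file) shows the ten most frequent values. A class that appears roughly once per chapter is a good candidate to try in the expressions above. An element with class="calibre1 calibre2" would not be matched by @class = 'calibre1', which is why the sketch counts whole attribute values.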