Need help with XPath

04-21-2010, 08:28 PM

I don't know if this is possible or not, so I hope you can help me.

Right now I am working on an ebook in HTML.

Each chapter starts with a Roman numeral and the chapter name. The Roman numeral is in an h3, the chapter name in an h4.

So it looks like

III
A new Beginning.

Now, I want to convert it to epub with Calibre. Under chapter recognition, I used //*[re:match(name(), 'h4')]. In the chapter list, I see it as "A new Beginning". Until now, that was exactly what I wanted.

But now I thought to myself: it would be better if the Roman numeral were included.
In the chapter list, it should look like III - A new Beginning
Is that possible? I thought something like //*[re:match(name(),'h3' - 'h4')] should work, but so far the preview in Calibre doesn't show the Roman numerals.

Can anyone tell me what the correct expression should be?

Thanks in advance

04-22-2010, 01:31 AM
You could try the following: in the HTML files instead of

<h4>A new beginning</h4>

you use

<h4 title="III. A new beginning">A new beginning</h4>

and see if it works.
I didn't try it in Calibre, but in Sigil this is how to do it.

04-22-2010, 05:35 AM
I tried it, but either Calibre doesn't accept it (which I doubt) or my XPath is still wrong.

What Xpath expression do you use in Sigil?

04-22-2010, 05:36 AM
You can't use XPath to insert content.
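
A minimal sketch of what that means in practice, using Python's standard xml.etree.ElementTree (which supports only a subset of XPath and not the re:match extension used in Calibre). The sample markup is an assumption modelled on the headings described above. Each expression only returns elements that already exist in the file, so no match can contain "III - A new Beginning" unless that text is present inside a single element.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample markup, modelled on the chapter headings described above.
SAMPLE = """<body>
<h3>III</h3>
<h4>A new Beginning</h4>
<p>Chapter text.</p>
</body>"""

root = ET.fromstring(SAMPLE)

# A path expression selects existing nodes; it cannot build new text
# by gluing two different elements together.
h3_titles = [h.text for h in root.findall(".//h3")]
h4_titles = [h.text for h in root.findall(".//h4")]

print(h3_titles)  # text of every h3, exactly as written in the file
print(h4_titles)  # text of every h4, exactly as written in the file
```

Getting both parts into one chapter title therefore means changing the HTML itself so that one element holds the full text.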

04-22-2010, 05:37 AM
O.k. thanks Kovid.

And thanks Paulpeer

04-22-2010, 06:06 AM
What Xpath expression do you use in Sigil?

You don't need XPath in Sigil. All h-headers (h1, h2 etc.) automatically go into the table of contents, unless you explicitly tell Sigil not to include them.

04-22-2010, 11:16 AM
I tried it, but either Calibre doesn't accept it (what I doubt) or my xpath is still wrong.

What I do is put both in a single h2 statement and separate them with a <br /> tag. They end up on one line in the toc and two lines in the text.

Something like <h2>III<br />Chapter</h2>


04-23-2010, 12:59 PM
There's another way that might work with calibre. Define a selector that hides the text:

.hidethis {
visibility: hidden;
}

Construct your heading as you'd like it to appear in the ToC, then hide the part you don't want to see in the flow of text.


<h3>III<span class="hidethis"> - A New Beginning</span></h3>
<h4>A New Beginning</h4>

Then use the h3 tag as the source of your ToC name.
This would work if calibre simply strips out the css from text that's used for ToC names. I think it's worth a try.
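
For a book with many chapters, the markup above could be generated rather than typed by hand. The following is only a sketch using Python's standard xml.etree.ElementTree; the function name add_hidden_titles and the sample input are made up for illustration, and it assumes well-formed XHTML in which each h3 is immediately followed by its h4 and the h4 contains plain text only.

```python
import xml.etree.ElementTree as ET


def add_hidden_titles(body):
    """Append each h4's text to the preceding h3 inside a hidden span.

    Hypothetical helper: only looks at direct children of `body` and
    assumes the h4 holds plain text (no nested tags).
    """
    children = list(body)
    for current, following in zip(children, children[1:]):
        if current.tag == "h3" and following.tag == "h4":
            span = ET.SubElement(current, "span", {"class": "hidethis"})
            span.text = " - " + (following.text or "")
    return body


# Hypothetical sample input.
SAMPLE = "<body><h3>III</h3><h4>A New Beginning</h4><p>Text.</p></body>"

body = add_hidden_titles(ET.fromstring(SAMPLE))
h3 = body.find("h3")

# The markup that ends up in the file.
h3_markup = ET.tostring(h3, encoding="unicode")
# The text a tool would see if it ignores the CSS and reads all text in h3.
toc_text = "".join(h3.itertext())

print(h3_markup)
print(toc_text)
```

The h4 is left untouched, so the visible text stays the same as long as the .hidethis rule is in the stylesheet.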

04-24-2010, 03:35 AM
Thank you Charleski

It works - but it gave me a new problem.

I tried it like you said, and the TOC really looks great, exactly like I wanted it to ...

... but in the flow text, the Roman numeral is not centered anymore. Of course, that is only logical, because the hidden part is also inside the h3 tag.

So I tried to center just the Roman numeral with a div tag, and this works, too. Now I am happy.

04-24-2010, 03:20 PM
Instead of

.hidethis {
visibility: hidden;
}

use

.hidethis {
display: none;
}

04-25-2010, 04:16 AM
Thanks frabjous, this seems to work even better.

The Roman numeral is still centered, and in the TOC, everything shows up.

This is much easier than what I had planned. I wanted to center the Roman numerals with div tags (since span tags don't center), but div tags within heading tags are not valid XHTML. I thought about getting rid of the heading tags altogether and just using div tags, but your way is much easier.

04-25-2010, 04:41 AM
When <div> tags are not allowed, you can use <span> with "display: block" ;)

04-25-2010, 05:20 AM
When <div> tags are not allowed, you can use <span> with "display: block" ;)

Yeah, I also found that out through Google, but luckily I don't need the div tags right now thanks to the display: none code.

But thanks for pointing it out Jellby.

02-20-2015, 06:04 AM
I write poems too, Kovid. I have an e-book of one hundred and two of them called Picturesque, available on many sites (Scribd being one of them) in the form of a PDF. In the other formats, whether epub, mobi, etc., the book consists entirely of JPEG pages. My problem is that to get it into the major book stores I need a working Table of Contents. When I try to enter an XPath expression for Calibre to find the images and use their alt="Name" text for the pages in the Table of Contents, I cannot come up with a simple line of code that works. Also, after I get the pages into the Table of Contents, I would like the names of the pages to become links. Can anyone who knows XPath better than I do help?
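
As a starting point only, here is a sketch of selecting images that carry an alt attribute, using Python's standard xml.etree.ElementTree. The file names and alt texts are invented for illustration. In full XPath the equivalent selection is //img[@alt]; how Calibre turns the matched images into entry names and links is not covered by this sketch.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: one image per page, not all of them with alt text.
SAMPLE = """<body>
<div><img src="page001.jpg" alt="Sunrise"/></div>
<div><img src="page002.jpg"/></div>
<div><img src="page003.jpg" alt="Harbour"/></div>
</body>"""

root = ET.fromstring(SAMPLE)

# The [@alt] predicate keeps only img elements that have an alt attribute.
images_with_alt = root.findall(".//img[@alt]")

# Pair each candidate entry name with the image it belongs to.
toc_entries = [(img.get("alt"), img.get("src")) for img in images_with_alt]

for name, src in toc_entries:
    print(name, src)
```

Images without alt text are skipped, so every page that should appear in the Table of Contents needs its own alt value.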