Old 09-16-2009, 10:53 AM   #28
Jellby
I think it should be pretty obvious: the XML parsing is done by XMLStarlet, which uses XPath expressions (I had no knowledge of XPath until yesterday). This is what is needed:

Open the META-INF/container.xml file. There should be a <rootfile> element with a full-path attribute. The value of this attribute is the path to the main OPF file.
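As a rough illustration of this first step, here is a minimal sketch in Python using the standard xml.etree.ElementTree module rather than XMLStarlet (so this is not the script discussed in the thread). The sample container content and the path "OEBPS/content.opf" are made up for the example; the namespace URI is the one used by container.xml files.

```python
import xml.etree.ElementTree as ET

# Namespace of the elements in META-INF/container.xml.
CONTAINER_NS = {"c": "urn:oasis:names:tc:opendocument:xmlns:container"}

# Hypothetical sample; a real script would read META-INF/container.xml instead.
SAMPLE_CONTAINER = """<container version="1.0"
    xmlns="urn:oasis:names:tc:opendocument:xmlns:container">
  <rootfiles>
    <rootfile full-path="OEBPS/content.opf"
              media-type="application/oebps-package+xml"/>
  </rootfiles>
</container>"""


def opf_path_from_container(container_xml):
    """Return the full-path attribute of the first <rootfile> element."""
    root = ET.fromstring(container_xml)
    rootfile = root.find(".//c:rootfile", CONTAINER_NS)
    if rootfile is None:
        raise ValueError("no <rootfile> element found")
    return rootfile.attrib["full-path"]


print(opf_path_from_container(SAMPLE_CONTAINER))
```

The returned path is relative to the root of the ePUB archive, not to META-INF.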

Open the main OPF file. There should be a <spine> element there. The <spine> contains a list of <itemref> elements, each of them with an idref attribute. Get the values of these attributes in the order they are defined.

In the OPF file there should be a <manifest> element too. For each idref obtained in the previous step, there should be an <item> element inside the <manifest> with an id attribute identical to the idref. The href attribute of each <item> has the file path and name (relative to the directory where the OPF file is located).

Now you have the ordered list of all the files in the ePUB (assuming, at least, that there are no fallback items).
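The spine and manifest lookup could be sketched like this, again in Python with xml.etree.ElementTree instead of XMLStarlet. The sample OPF, its ids and its file names are invented for the example, and the sketch assumes the hrefs are plain relative paths (no percent-encoding) and ignores fallback items.

```python
import posixpath
import xml.etree.ElementTree as ET

# Namespace of the OPF package document.
OPF_NS = {"opf": "http://www.idpf.org/2007/opf"}

# Hypothetical sample; the manifest order deliberately differs from the spine.
SAMPLE_OPF = """<package xmlns="http://www.idpf.org/2007/opf" version="2.0">
  <manifest>
    <item id="ch2" href="text/chapter2.html" media-type="application/xhtml+xml"/>
    <item id="ch1" href="text/chapter1.html" media-type="application/xhtml+xml"/>
  </manifest>
  <spine>
    <itemref idref="ch1"/>
    <itemref idref="ch2"/>
  </spine>
</package>"""


def spine_files(opf_xml, opf_path):
    """Return the spine files in reading order, as paths inside the ePUB."""
    root = ET.fromstring(opf_xml)
    # Map every manifest id to its href.
    hrefs = {item.attrib["id"]: item.attrib["href"]
             for item in root.findall("opf:manifest/opf:item", OPF_NS)}
    # hrefs are relative to the directory that holds the OPF file.
    opf_dir = posixpath.dirname(opf_path)
    files = []
    for itemref in root.findall("opf:spine/opf:itemref", OPF_NS):
        href = hrefs[itemref.attrib["idref"]]
        files.append(posixpath.normpath(posixpath.join(opf_dir, href)))
    return files


print(spine_files(SAMPLE_OPF, "OEBPS/content.opf"))
```

The order comes from the <spine>, not from the <manifest>, which is why the manifest is only used as a lookup table here.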

To get the "bookstyle.css": Find, in the OPF file, the <metadata> element, and inside it a <meta> element whose name attribute has the value "prince-style". The content attribute of this element is the id that you have to look for in the <manifest>, as done above for the items in the <spine>.
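The stylesheet lookup follows the same pattern. A possible sketch in Python (xml.etree.ElementTree, not XMLStarlet) is below; the sample OPF and the id "css" are hypothetical, and the sketch assumes the <meta> element lives in the OPF namespace like the rest of the package document.

```python
import xml.etree.ElementTree as ET

# Namespace of the OPF package document.
OPF_NS = {"opf": "http://www.idpf.org/2007/opf"}

# Hypothetical sample with a "prince-style" meta entry pointing at id "css".
SAMPLE_OPF = """<package xmlns="http://www.idpf.org/2007/opf" version="2.0">
  <metadata>
    <meta name="prince-style" content="css"/>
  </metadata>
  <manifest>
    <item id="ch1" href="text/chapter1.html" media-type="application/xhtml+xml"/>
    <item id="css" href="styles/bookstyle.css" media-type="text/css"/>
  </manifest>
</package>"""


def prince_style_href(opf_xml):
    """Return the href of the manifest item named by the prince-style meta."""
    root = ET.fromstring(opf_xml)
    meta = root.find("opf:metadata/opf:meta[@name='prince-style']", OPF_NS)
    if meta is None:
        return None  # the book does not declare a prince-style entry
    style_id = meta.attrib["content"]
    item = root.find(f"opf:manifest/opf:item[@id='{style_id}']", OPF_NS)
    return None if item is None else item.attrib["href"]


print(prince_style_href(SAMPLE_OPF))
```

As with the spine items, the href is relative to the directory of the OPF file.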

"default.css" and "output.pdf" are command-line or configuration arguments; they are not read from XML.