Old 07-01-2009, 11:42 PM   #6
ldolse
Wizard
Posts: 1,337
Karma: 123455
Join Date: Apr 2009
Location: Malaysia
Device: PRS-650, iPhone
In my experience with Lit files the chapters are rarely surrounded by hx tags. I have seen the 'chapter' class, but it usually increments the class for each chapter - 'chapter1', 'chapter2', etc. Will the default xpath still match when the class name is changing like that?
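As far as plain XPath goes, `@class = 'chapter'` is an exact string comparison, so it would not be expected to match `chapter1` or `chapter2`; a prefix test such as `starts-with(@class, 'chapter')` would. The sketch below imitates both tests in Python using only the standard library (ElementTree's XPath subset has no `starts-with`, so the prefix check is done in Python). The sample markup and function names are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Made-up sample markup, imitating a file with numbered chapter classes.
SAMPLE = """
<body>
  <p class="title">Some Book</p>
  <p class="chapter1">One</p>
  <p>text</p>
  <p class="chapter2">Two</p>
  <p class="chapter">Plain</p>
</body>
"""

def exact_matches(root, name="chapter"):
    # Same idea as the XPath test: @class = 'chapter'
    return [el.text for el in root.iter() if el.get("class") == name]

def prefix_matches(root, prefix="chapter"):
    # Same idea as the XPath test: starts-with(@class, 'chapter')
    return [el.text for el in root.iter()
            if (el.get("class") or "").startswith(prefix)]

root = ET.fromstring(SAMPLE)
print(exact_matches(root))   # only the element whose class is exactly 'chapter'
print(prefix_matches(root))  # also picks up 'chapter1' and 'chapter2'
```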

Of course, the above is probably more true of bootleg LIT files. Professionally produced LIT files do seem to follow stricter guidelines, but they don't seem that common.

Usually when I edit the document, the first thing I'll check is whether the chapter delimiters match the default chapter regex. Then I'll make sure those are surrounded by h1 or h2 tags. A lot of books don't use 'Chapter xx', though; they just use a word or a phrase. It's usually pretty simple to write a regex to find all of those breaks and then surround them with hx tags. Then I'll just change the xpath to something like this:
--level1-toc=//*[((name()='h1' or name()='h2') and re:test(., '.*', 'i')) or @class = 'chapter']
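The "find the breaks with a regex, then wrap them in hx tags" step could be sketched roughly like this. It is only an illustration: the pattern assumes each chapter title sits alone in its own `<p>` and reads "Chapter" plus one word or number, and `CHAPTER_RE` and `wrap_chapters` are hypothetical names. Real files will usually need a different pattern.

```python
import re

# Hypothetical pattern: a paragraph whose whole text is "Chapter <number or word>".
CHAPTER_RE = re.compile(r"<p\b[^>]*>\s*(Chapter\s+\w+)\s*</p>", re.IGNORECASE)

def wrap_chapters(html, tag="h2"):
    """Replace each matching paragraph with a heading holding the same text."""
    return CHAPTER_RE.sub(lambda m: f"<{tag}>{m.group(1)}</{tag}>", html)

sample = '<p class="c1">Chapter 1</p><p>It was a dark night.</p><p>CHAPTER Two</p>'
print(wrap_chapters(sample))
```

For books that use a bare word or phrase as the chapter title, the same function works with a different `CHAPTER_RE`; the hard part is finding a pattern that hits only the chapter breaks.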

The key to using that simpler regex is to make sure you're not overlapping with other uses of h1 or h2 tags, which are very common on the title page. So I'll go with h3 or h4 instead. It's probably easier just to specify chapter classes, now that I think about it, though....
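A variant along those lines (h3/h4 plus a class test) could be assembled like this. It is a sketch under assumptions: `level1_toc` is a hypothetical helper, the `re:test(., '.*', 'i')` clause is dropped because `.*` matches any text anyway, and `starts-with` is standard XPath 1.0 but I haven't checked it against Calibre itself.

```python
def level1_toc(tags=("h3", "h4"), class_prefix="chapter"):
    """Build an XPath matching the given heading tags or any class
    that begins with class_prefix (e.g. 'chapter1', 'chapter2')."""
    names = " or ".join(f"name()='{t}'" for t in tags)
    return f"//*[({names}) or starts-with(@class, '{class_prefix}')]"

# Prints the option in the same form as the --level1-toc line above.
print("--level1-toc=" + level1_toc())
```

On a command line the expression would need quoting so the shell doesn't interpret the parentheses and quotes.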

That whole process I described is why I'd love to be able to use the GUI to convert to uncompressed OEB to better facilitate edits like this. I know Calibre doesn't support the idea of a folder as a type of book, but one option would be to just not add the OEB format to the library, just dump it to the filesystem. Think of it as an 'export to OEB' option instead of a 'convert to OEB' option.

Last edited by ldolse; 07-01-2009 at 11:47 PM.