#16 |
creator of calibre
Posts: 45,359
Karma: 27182818
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
As far as I know, an XPath expression can only match individual tags. If you have an expression that matches multiple tags, calibre will treat them as separate entries in the TOC.
|
#17 |
Wizard
Posts: 3,130
Karma: 91256
Join Date: Feb 2008
Location: Germany
Device: Cybook Gen3
|
And there goes that idea. If the <p> tags in question all share a class name that is unique across the entire document, you might want to match on that instead. If not, use the chapter numbers...
|
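For illustration, matching on a class name could look like the sketch below. It uses Python's standard `xml.etree.ElementTree` rather than calibre's own code, and the function name `toc_entries_by_class` and the class names in the usage note are hypothetical. It also assumes a well-formed fragment without the XHTML namespace. Each matched tag yields its own entry, in line with what was said above about expressions that match multiple tags.

```python
import xml.etree.ElementTree as ET


def toc_entries_by_class(xhtml, class_name):
    """Return the text of every <p> whose class attribute equals class_name.

    Hypothetical helper: one list item per matched tag, mimicking
    "one TOC entry per match". The comparison is an exact match on the
    whole attribute value, so class="a b" does not match "a".
    """
    root = ET.fromstring(xhtml)
    matches = root.findall(f".//p[@class='{class_name}']")
    return ["".join(p.itertext()).strip() for p in matches]
```

With a document where every chapter title carries, say, `class="chaptitle"` and nothing else does, `toc_entries_by_class(doc, "chaptitle")` returns one string per chapter. If the class is also used on ordinary paragraphs, those paragraphs end up in the list as well, which is why the class has to be unique.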
#18 |
Addict
Posts: 219
Karma: 404
Join Date: Nov 2010
Device: Kindle 3G, Samsung SIII
|
@Kovid: Thanks for the clarification (and, of course, for the nice piece of software!)
@Manichean: The <p> tags are not unique. Clearly, I can use the chapter numbers; the very first regexp in this thread matches them reliably. I think I can do better than that, though, by processing the XHTML with a simple Python script that first matches the chapter numbers by regexp and then merges each matching tag with the tag that follows it. A better option would be to construct a hidden TOC, but from this thread https://www.mobileread.com/forums/sho...d.php?t=105019 it seems that this is currently broken for MOBI output. Unfortunately, as far as I can see, no nice generic solution seems to be available. |
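A minimal sketch of the kind of script described in the post above, not the poster's actual code. It assumes that each chapter number sits alone in a `<p>` (optionally followed by a dot), that the chapter title is in the very next `<p>`, and that the input is a well-formed fragment without the XHTML namespace. The function name `merge_chapter_headings` and the choice of `<h2>` for the merged heading are hypothetical.

```python
import re
import xml.etree.ElementTree as ET

# A paragraph holding only a chapter number, e.g. "12" or "12."
CHAPTER_NUMBER = re.compile(r"^\s*(\d+)\.?\s*$")


def merge_chapter_headings(body):
    """Merge each chapter-number <p> with the <p> after it into one <h2>."""
    children = list(body)  # snapshot, so removals do not disturb iteration
    i = 0
    while i < len(children) - 1:
        current, following = children[i], children[i + 1]
        match = CHAPTER_NUMBER.match("".join(current.itertext()))
        if match and current.tag == "p" and following.tag == "p":
            title = "".join(following.itertext()).strip()
            # Turn the number paragraph into the combined heading.
            current.tag = "h2"
            current.attrib.clear()
            for child in list(current):
                current.remove(child)
            current.text = f"{match.group(1)}. {title}"
            current.tail = following.tail
            body.remove(following)
            i += 2  # skip the title paragraph that was just merged
        else:
            i += 1
    return body
```

After this step each chapter heading is a single tag, so a TOC expression only has to match one tag per chapter. Real XHTML files declare a namespace, so the tag names would need the `{http://www.w3.org/1999/xhtml}` prefix or a namespace-aware parser.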
#19 |
Wizard
Posts: 1,337
Karma: 123455
Join Date: Apr 2009
Location: Malaysia
Device: PRS-650, iPhone
|
Your best bet (aside from the script you mentioned) is to convert to EPUB, edit the document in Sigil until it looks the way you want, and then convert from EPUB to MOBI. You could also use the debug output to grab an intermediate version of the HTML (the version from after preprocessing), edit that as required, and then convert from HTML to MOBI.
I just looked at the source code, and the current code doesn't actually seem to handle the case of a '.' appearing just after a numeric chapter header. If you open a bug with an example, I can see to it that that case is added to the default preprocessing. bugs.calibre-ebook.com |
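To make the '.' case concrete, here is a hypothetical pattern for a numeric chapter header with an optional trailing dot. It is not calibre's actual preprocessing regex, which is not quoted in this thread. The names `NUMERIC_HEADING` and `find_numeric_headings` are made up for the example, and it assumes the number is the only content of its `<p>` tag.

```python
import re

# Hypothetical pattern, NOT taken from calibre's source: a <p> that contains
# only digits, optionally followed by a single '.', e.g. <p>3</p> or <p>3.</p>.
NUMERIC_HEADING = re.compile(r"<p[^>]*>\s*(\d+)\.?\s*</p>", re.IGNORECASE)


def find_numeric_headings(html):
    """Return the chapter numbers found in html, in document order."""
    return [int(m.group(1)) for m in NUMERIC_HEADING.finditer(html)]
```

Dropping the `\.?` from the pattern gives a version that matches `<p>3</p>` but skips `<p>3.</p>`, which is the gap described above.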
#20 |
Addict
Posts: 219
Karma: 404
Join Date: Nov 2010
Device: Kindle 3G, Samsung SIII
|
Sigil looks nice; I will try this route first. Thanks for the suggestion.
I'm not sure what you mean by the second paragraph, though. As far as I can tell, if the dot is not present after the chapter number, the behaviour is exactly the same (I have just tested it). I guess the problem is that I don't know what kind of preprocessing you mean here. Sorry, I am new to this: I got my first ebook reader ever yesterday. |
||||
Thread | Thread Starter | Forum | Replies | Last Post |
<Command Line> Add multiple books in multiple formats | himitsu | Calibre | 8 | 09-25-2010 11:07 PM |
Bug: entries with multiple formats trigger multiple conversions | flinx1 | Calibre | 12 | 05-21-2010 06:23 AM |
Gen3 Multiple dictionaries? | miquele | Bookeen | 3 | 05-19-2010 04:16 PM |
Regexp and header/footer problems | concern | Calibre | 0 | 02-07-2010 03:35 AM |
I'm in line | Tangabird | Introduce Yourself | 4 | 11-12-2009 08:13 AM |