11-02-2010, 11:57 AM   #16
kovidgoyal
creator of calibre
Posts: 45,359
Karma: 27182818
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
As far as I know, XPath will match only single tags. If you have an expression that matches multiple tags, calibre will treat them as different entries in the TOC.
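
To illustrate the point, here is a minimal sketch using Python's standard-library ElementTree rather than calibre's own code. The sample markup, the class name "chap" and the helper toc_entries are assumptions made up for the example. It only shows that an expression matching two tags yields two separate nodes, which is why a chapter number and a chapter title in separate <p> tags end up as separate entries.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: chapter number and chapter title in separate <p> tags
# that happen to share a class name.
SAMPLE = """<body>
<p class="chap">1.</p>
<p class="chap">The Beginning</p>
<p>Some text.</p>
</body>"""


def toc_entries(xhtml, path):
    """Return the text of every element matched by the path expression.

    Each matched element becomes its own entry; nothing is merged.
    """
    root = ET.fromstring(xhtml)
    return ["".join(el.itertext()).strip() for el in root.findall(path)]


entries = toc_entries(SAMPLE, ".//p[@class='chap']")
print(entries)  # two entries, one per matched <p>
```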
11-02-2010, 12:00 PM   #17
Manichean
Wizard
Posts: 3,130
Karma: 91256
Join Date: Feb 2008
Location: Germany
Device: Cybook Gen3
And there goes that idea. If the <p> tags in question share a class name that is used nowhere else in the document, you might want to match on that instead. If not, use the chapter numbers...
11-02-2010, 12:16 PM   #18
janvanmaar
Addict
Posts: 219
Karma: 404
Join Date: Nov 2010
Device: Kindle 3G, Samsung SIII
@Kovid: Thanks for the clarification (and, of course, for the nice piece of software!)

@Manichean: The <p> tags are not unique. Clearly, I can match on the chapter numbers; the very first regexp in this thread does that reliably.
I think I can do better than that, though, by running the XHTML through a simple Python script that first matches the chapter numbers with a regexp and then merges each matching tag with the tag that follows it.
A better option would be to construct a hidden TOC, but according to this thread https://www.mobileread.com/forums/sho...d.php?t=105019 it seems that this is currently broken for MOBI output.
Unfortunately, as far as I can see, no nice generic solution seems to be available.
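
A rough sketch of what such a script might look like, using only Python's re module. The assumptions here are mine, not something confirmed in the thread: the chapter number sits alone in a <p> tag (with an optional trailing dot), the title is plain text in the very next <p> tag, and the merged heading is written out as an <h2>. The name merge_chapter_headings is hypothetical.

```python
import re

# Assumption: a <p> holding only a chapter number (optionally followed by '.')
# is immediately followed by a <p> holding the chapter title as plain text.
CHAPTER_PAIR = re.compile(
    r'<p[^>]*>\s*(?P<num>\d+)\.?\s*</p>\s*'
    r'<p[^>]*>\s*(?P<title>[^<]+?)\s*</p>'
)


def merge_chapter_headings(xhtml):
    """Merge each number paragraph with the paragraph that follows it.

    The pair is replaced by a single <h2> so that one tag carries the
    whole heading and can be matched as one TOC entry.
    """
    return CHAPTER_PAIR.sub(
        lambda m: '<h2>{}. {}</h2>'.format(m.group('num'), m.group('title')),
        xhtml,
    )


sample = '<p class="a">1.</p>\n<p class="a">The Beginning</p><p>Text</p>'
print(merge_chapter_headings(sample))
```

A regexp like this is fragile on real-world markup (nested tags inside the title, for instance, will not match), so treat it as a starting point only.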
11-02-2010, 12:43 PM   #19
ldolse
Wizard
Posts: 1,337
Karma: 123455
Join Date: Apr 2009
Location: Malaysia
Device: PRS-650, iPhone
Your best bet (aside from the script you mentioned) is to convert to EPUB, edit the document in Sigil until it looks the way you want, and then convert from EPUB to MOBI. You could also use the debug output to grab an intermediate version of the HTML (the version produced after preprocessing), edit that as required, and then convert from HTML to MOBI.

I just looked at the source code, and the current code doesn't actually seem to handle the case of a '.' appearing just after a numeric chapter header. If you open a bug with an example, I can see that the case is added to the default preprocessing: bugs.calibre-ebook.com
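
For illustration only, a pattern that tolerates the trailing '.' could look something like the sketch below. This is not calibre's actual preprocessing code; the names NUMERIC_HEADING and is_numeric_heading are hypothetical, and the pattern assumes the heading text consists of nothing but the number.

```python
import re

# Hypothetical pattern: a purely numeric heading with an optional trailing dot.
NUMERIC_HEADING = re.compile(r'^\s*(\d+)\.?\s*$')


def is_numeric_heading(text):
    """Return True if the text is just a chapter number, with or without '.'."""
    return NUMERIC_HEADING.match(text) is not None


for candidate in ("12", "12.", "12. The Title", "1.5"):
    print(repr(candidate), is_numeric_heading(candidate))
```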
11-02-2010, 01:02 PM   #20
janvanmaar
Addict
Posts: 219
Karma: 404
Join Date: Nov 2010
Device: Kindle 3G, Samsung SIII
Sigil looks nice; I will try that route first. Thanks for the suggestion.

I'm not sure what you mean by the second paragraph, though. As far as I can tell, the behaviour is exactly the same when the dot after the chapter number is absent (I have just tested it). I guess the problem is that I don't know what kind of preprocessing you mean here. Sorry, I am new to this; I got my first ebook reader ever yesterday.