06-25-2015, 06:57 PM   #4
gbm
Wizard
Posts: 2,198
Karma: 8888888
Join Date: Jun 2010
Device: Kobo Clara HD,Hisence Sero 7 Pro RIP, Nook STR, jetbook lite
Quote:
Originally Posted by lealla
Hi erschwartz, thank you for your reply.

Yes, I would like to split the file wherever there is the word chapter within the text itself (not within the coding).

I tried //*[re:test(., "chapter", "i")] but I got an error saying

"Cannot split on the
tag"

With the weird space between 'the' and 'tag'.

I don't know if this helps, but here is how the actual text is set up at the moment:

*EDIT - I just got rid of all the <span>+<div> tags, but this hasn't made any difference.
It looks like this at the moment:
Here is the beginning of my html:



I'd like it to split the HTML at any point before chapter headings. It's not a great EPUB, as it was converted from a PDF. I can't use any of the classes in the builder wizard (calibre2), etc., as they appear all over the place, and not just in chapter headings.

Would it be worth uninstalling and reinstalling calibre?

Thank you again for your help.
Try this:
Code:
//*[((name()='p' ) and re:test(., 'chapter|book|section|part|prologue|epilogue\s+', 'i')) or @class = 'chapter']
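
To get a feel for which elements that expression would pick before running a conversion, here is a rough sketch in plain Python. It assumes that `re:test(., pattern, 'i')` behaves like a case-insensitive, unanchored search, and it uses Python's `re` module as a stand-in for the EXSLT regex engine. The `would_split` helper and the sample paragraphs are made up for illustration; they are not part of calibre.

```python
import re

# Pattern copied verbatim from the XPath above; the 'i' flag is mapped
# to re.IGNORECASE (an assumption about how re:test treats flags).
PATTERN = re.compile(
    r"chapter|book|section|part|prologue|epilogue\s+", re.IGNORECASE
)


def would_split(tag, text, css_class=None):
    """Hypothetical helper: approximate the XPath predicate for one element."""
    is_matching_p = tag == "p" and PATTERN.search(text) is not None
    return is_matching_p or css_class == "chapter"


# Invented sample elements: (tag name, text content, class attribute)
samples = [
    ("p", "CHAPTER ONE", None),
    ("p", "It was a dark night.", None),
    ("p", "He left the party early.", None),  # "party" contains "part"
    ("h2", "Chapter Two", None),              # not a <p>, no class
    ("div", "Anything", "chapter"),           # matched by class alone
    ("p", "Epilogue", None),                  # no whitespace after the word
]

results = [would_split(*sample) for sample in samples]
for sample, hit in zip(samples, results):
    print(hit, sample)
```

Two things stand out under these assumptions. The keywords match anywhere inside a paragraph, so body text containing words like "party" or "notebook" may also trigger a split. And since `|` binds more loosely than `\s+`, the trailing whitespace is only required after "epilogue", so a paragraph holding just that one word may be skipped.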

bernie