#1
Enthusiast
Posts: 39
Karma: 714
Join Date: Jun 2015
Device: Kobo Aura H2O
What is the XPath for "Split HTML at the word 'chapter'"
Hi all,
Can anybody tell me the XPath for "split the HTML wherever the word 'chapter' is found"? I mean an XPath expression that can be used in the calibre editor, in the 'Split at multiple locations' dialogue box. It should find ANY occurrence of the word chapter in an epub, regardless of whether it is in a <h>, <p> or <span> tag. Any tag. Thank you! I've tried and tried, but have only failed so far!
#2
Ex-Helpdesk Junkie
Posts: 19,421
Karma: 85400180
Join Date: Nov 2012
Location: The Beaten Path, USA, Roundworld, This Side of Infinity
Device: Kindle Touch fw5.3.7 (Wifi only)
Do you want it to split on the actual word "chapter"? I don't think you can do that.
Splitting on any tag node which contains the word "chapter":
Code:
//*[re:test(., "chapter", "i")]
P.S. calibre includes an XPath builder wizard.
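One thing worth knowing about this expression: in XPath, `.` inside the predicate is the text of the element and all of its descendants, so every ancestor of a matching heading matches as well, including the outermost tags. The sketch below emulates that behaviour with Python's standard library rather than calibre's own XPath engine. The sample markup and the helper names `string_value` and `matching_tags` are made up for illustration, and it assumes `re:test` with the "i" flag behaves like a case-insensitive `re.search`.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical sample markup, not taken from the book discussed in this thread.
SAMPLE = """<html><body>
<h2>Chapter One</h2>
<p>It was a dark night.</p>
<p><span>Chapter</span> Two</p>
</body></html>"""


def string_value(el):
    # XPath's "." in a string context: the element's own text plus
    # the text of all of its descendants, concatenated.
    return "".join(el.itertext())


def matching_tags(root, pattern):
    # Emulates //*[re:test(., pattern, "i")]: every element, in document
    # order, whose string value contains a case-insensitive match.
    rx = re.compile(pattern, re.IGNORECASE)
    return [el.tag for el in root.iter() if rx.search(string_value(el))]


root = ET.fromstring(SAMPLE)
hits = matching_tags(root, "chapter")
print(hits)
```

In this sample the outer `html` and `body` elements are among the matches, because their text contains everything below them. Restricting the expression to particular tag names, or to a class attribute, narrows the candidates to the headings themselves.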
#3
Enthusiast
Posts: 39
Karma: 714
Join Date: Jun 2015
Device: Kobo Aura H2O
Hi erschwartz, thank you for your reply.
Yes, I would like to split the file wherever the word chapter appears in the text itself (not in the markup). I tried //*[re:test(., "chapter", "i")] but I got an error saying "Cannot split on the tag", with a weird space between 'the' and 'tag'.
I don't know if this helps, but here is how the actual text is set up at the moment.
*EDIT - I just got rid of all the <span> and <div> tags, but this hasn't made any difference.
Here is the beginning of my html:
Quote:
Would it be worth uninstalling and reinstalling calibre? Thank you again for your help.
Last edited by lealla; 06-25-2015 at 08:08 AM.
#4
Wizard
Posts: 2,181
Karma: 8888888
Join Date: Jun 2010
Device: Kobo Clara HD,Hisence Sero 7 Pro RIP, Nook STR, jetbook lite
Quote:
Code:
//*[((name()='p') and re:test(., 'chapter|book|section|part|prologue|epilogue\s+', 'i')) or @class = 'chapter']
bernie
#5
Well trained by Cats
Posts: 31,001
Karma: 60358908
Join Date: Aug 2009
Location: The Central Coast of California
Device: Kobo Libra2,Kobo Aura2v1, K4NT(Fixed: New Bat.), Galaxy Tab A
Quote:
You can simply delete the entire (hidden) calibre configuration folder while calibre is NOT running. Preferences > Miscellaneous has a button to open that folder. calibre will re-create the defaults.
BTW, are you discussing CONVERSIONS rather than (hand) editing using the Editor?
#6
Enthusiast
Posts: 39
Karma: 714
Join Date: Jun 2015
Device: Kobo Aura H2O
Quote:
That worked a little bit, in that it did split the document, but it didn't split only on the word chapter. There are 19 chapters, but I ended up with 90 html files. I couldn't see any recurring pattern: the splits all fell on different words, none of which were part of the code. It split on words such as "The", "She" and "It". It did also split on the word Chapter though.
As far as code goes, all the tags preceding the words are the same: </p><p class="calibre2">
Thanks to theducks' advice, I deleted the calibre configuration folder before I started, but this didn't seem to make much difference either way (it was a good tip nonetheless).
Just to confirm, yes, this is within the calibre Edit Book area. I'm right-clicking within the code area, selecting "Split at multiple locations" and then pasting the code into the dialogue box that pops up.
Thank you for taking the time to help, I really appreciate it. I'm not too crash hot at all this, but I'm learning loads as I go.
Last edited by lealla; 06-26-2015 at 04:02 AM.
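A likely explanation for the extra splits, offered as a guess rather than a diagnosis of this particular book: the pattern in the expression above is unanchored, so it matches any paragraph that contains "part", "book" or "section" anywhere in its text, for example inside "party" or "apart". A paragraph that begins with "The" or "She" is then split because of a word further along. The sketch below reproduces the effect in plain Python, assuming `re:test` with the "i" flag behaves like a case-insensitive `re.search`. The sample paragraphs, the `STRICT` pattern and the `split_points` helper are hypothetical.

```python
import re

# Pattern from the expression earlier in the thread. Note that \s+ only
# applies to the last alternative ("epilogue").
LOOSE = r"chapter|book|section|part|prologue|epilogue\s+"

# Hypothetical tighter pattern: the paragraph must START with "chapter"
# followed by a number. Adjust it if the headings are spelled out
# ("Chapter One") or use a different layout.
STRICT = r"^\s*chapter\s+\d+"

# Made-up paragraph texts standing in for the <p class="calibre2"> elements.
paragraphs = [
    "Chapter 1",
    "The party was over by midnight.",
    "She put the book down.",
    "It fell apart in his hands.",
    "Chapter 2",
    "He said nothing.",
]


def split_points(pattern, paras):
    # Return the paragraphs whose text contains a case-insensitive match,
    # i.e. the paragraphs the editor would split at.
    rx = re.compile(pattern, re.IGNORECASE)
    return [p for p in paras if rx.search(p)]


loose_hits = split_points(LOOSE, paragraphs)
strict_hits = split_points(STRICT, paragraphs)
print(len(loose_hits), "splits with the loose pattern")
print(len(strict_hits), "splits with the strict pattern")
```

If this is what is happening, swapping the tighter pattern into the same expression in the "Split at multiple locations" box should bring the number of files closer to the number of chapters. That has not been tested against the book in question.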