12-02-2023, 02:30 PM | #1 |
Addict
Posts: 281
Karma: 2125576
Join Date: Sep 2010
Device: Kobo Forma
|
Split at multiple locations : Help Needed
I have a book with one large HTML file. Chapter titles in the Preview window follow the pattern [Roman Numeral](space)[dash][dash](space)[Subtitle], for example: II -- The Mystery Quest
Can anybody give me a regex to split that file at multiple locations, or do I have to keep doing it all manually? So far I'm entering the [specific Roman Numeral](space)[dash][dash] as my search term and cutting at each spot using the split-click method. |
12-02-2023, 02:38 PM | #2 |
Wizard
Posts: 1,104
Karma: 4911876
Join Date: Sep 2021
Location: Australia
Device: Kobo Libra 2
|
What is the code that surrounds "II -- The Mystery Quest"?
Is it a heading, like <h2>II -- The Mystery Quest</h2>? |
12-02-2023, 03:20 PM | #3 |
Addict
Posts: 281
Karma: 2125576
Join Date: Sep 2010
Device: Kobo Forma
|
No. There isn't any separate chapter formatting; the entire file content is coded as <p class="calibre3">. Any file splitting will have to be based on the text string [Roman Numeral](space)[dash][dash].
|
12-02-2023, 03:47 PM | #4 |
Well trained by Cats
Posts: 29,817
Karma: 54830978
Join Date: Aug 2009
Location: The Central Coast of California
Device: Kobo Libra2,Kobo Aura2v1, K4NT(Fixed: New Bat.), Galaxy Tab A
|
Is that the only place with 2 dashes in a P tag?
Find:
Code:
<p class="calibre3">(.+?\s--\s.+?)</p>
Replace:
Code:
<hr class="sigil_split_marker" /> <h3 class="chapno">\1</h3>
The marker is what you split on. I made the heading an H3 tag (chapno is the stylesheet entry). |
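For anyone who wants to try the find/replace pair outside the editor first, here is a minimal Python sketch. It assumes the closing </p> is part of the find pattern so that the whole heading is captured, and the sample markup is made up for illustration; the real book may differ.

```python
import re

# Hypothetical sample markup, not taken from the actual book.
SAMPLE = (
    '<p class="calibre3">II -- The Mystery Quest</p>\n'
    '<p class="calibre3">It was a dark night.</p>\n'
)

# Find pattern: a calibre3 paragraph containing " -- " somewhere in its text.
FIND = r'<p class="calibre3">(.+?\s--\s.+?)</p>'
# Replace pattern: a split marker followed by the captured text as an h3.
REPLACE = r'<hr class="sigil_split_marker" /> <h3 class="chapno">\1</h3>'


def add_markers(html):
    """Swap each matching paragraph for a marker plus an h3 heading."""
    return re.sub(FIND, REPLACE, html)


result = add_markers(SAMPLE)
print(result)
```

Only the first paragraph is rewritten here; the second has no double dash, so it is left alone.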
12-02-2023, 05:47 PM | #5 | |
Addict
Posts: 281
Karma: 2125576
Join Date: Sep 2010
Device: Kobo Forma
|
Quote:
|
|
12-02-2023, 07:23 PM | #6 | |
Wizard
Posts: 1,104
Karma: 4911876
Join Date: Sep 2021
Location: Australia
Device: Kobo Libra 2
|
Quote:
Currently your headings have the same styling as your paragraphs, so there is no way to distinguish one from the other, except for the double dash/hyphen. If that double dash is used only in those headings, you can use it to capture the headings and change the coding to <h2>heading</h2>. From there you can easily split the chapters into separate files.
So insert that regex into the Search box and make sure you have selected "regex" as the search type, though for the replace I would use a simpler <h2>\1</h2>
Step through the first few one at a time to make sure it works correctly; then, if you have a lot of chapters, you can just replace all. |
|
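The "step through the first few" advice can be rehearsed in plain Python before touching the book. This is only a sketch: the sample text is invented, and the pattern is assumed to end in </p> so the full heading is captured. It lists what the regex would catch and then applies the simpler <h2>\1</h2> replacement.

```python
import re

# Assumed pattern: the double-dash search with a closing </p>.
FIND = re.compile(r'<p class="calibre3">(.+?\s--\s.+?)</p>')

# Hypothetical sample: two headings and one body paragraph with dashes.
SAMPLE = (
    '<p class="calibre3">I -- The Arrival</p>\n'
    '<p class="calibre3">She waited -- and waited -- for news.</p>\n'
    '<p class="calibre3">II -- The Mystery Quest</p>\n'
)


def preview(html, limit=5):
    """Return the first few captured texts so they can be checked by eye."""
    return [m.group(1) for m in FIND.finditer(html)][:limit]


hits = preview(SAMPLE)
for hit in hits:
    print(hit)

# Only run the replacement once the preview looks right.
converted = FIND.sub(r'<h2>\1</h2>', SAMPLE)
```

In this sample the preview shows three hits, one of them a body paragraph, which is the kind of false match that checking the first few results is meant to catch.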
12-02-2023, 08:23 PM | #7 |
Addict
Posts: 281
Karma: 2125576
Join Date: Sep 2010
Device: Kobo Forma
|
Unfortunately, the double dash is used in the paragraphs as well. Oh well... thanks to all for the replies.
|
12-02-2023, 08:37 PM | #8 |
Wizard
Posts: 1,104
Karma: 4911876
Join Date: Sep 2021
Location: Australia
Device: Kobo Libra 2
|
How many chapters?
|
12-02-2023, 08:51 PM | #9 |
Well trained by Cats
Posts: 29,817
Karma: 54830978
Join Date: Aug 2009
Location: The Central Coast of California
Device: Kobo Libra2,Kobo Aura2v1, K4NT(Fixed: New Bat.), Galaxy Tab A
|
Code:
<p class="calibre3">([CLXVI]{1,7}\s--\s.+?)</p> |
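A quick way to see what this tighter pattern buys you is the following Python sketch. The sample paragraphs are invented for illustration. Because the text must start with one to seven of the letters C, L, X, V, I, an ordinary paragraph that merely contains a double dash is skipped.

```python
import re

# The pattern from the post: Roman numeral, space, two dashes, space, subtitle.
HEADING = re.compile(r'<p class="calibre3">([CLXVI]{1,7}\s--\s.+?)</p>')

# Hypothetical sample mixing headings with a dash-heavy body paragraph.
SAMPLE = (
    '<p class="calibre3">I -- The Arrival</p>\n'
    '<p class="calibre3">She waited -- and waited -- for news.</p>\n'
    '<p class="calibre3">XIV -- The Return</p>\n'
)

headings = HEADING.findall(SAMPLE)
print(headings)
```

One caveat: a body paragraph that opens with the pronoun "I" followed by a double dash would still match, so a quick look through the hits remains worthwhile.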
12-03-2023, 09:11 AM | #10 |
the rook, bossing Never.
Posts: 11,171
Karma: 85874891
Join Date: Jun 2017
Location: Ireland
Device: All 4 Kinds: epub eink, Kindle, android eink, NxtPaper11
|
|
12-03-2023, 10:19 AM | #11 |
Wizard
Posts: 1,076
Karma: 412718
Join Date: Jan 2009
Location: Valley Forge, PA, USA
Device: Kindle Paperwhite
|
You could try the Diap's Toolbag plugin to convert <p class="calibre3 ... to h4 and then use Split at Multiple Locations with XPath = //h:h4
|
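To see what an expression like //h:h4 selects, here is a small sketch using Python's standard-library ElementTree rather than the editor itself. The h: prefix is bound to the XHTML namespace; ElementTree supports only a subset of XPath and spells the same search as .//h:h4. The document below is a made-up example.

```python
import xml.etree.ElementTree as ET

# The h: prefix maps to the XHTML namespace.
XHTML_NS = {'h': 'http://www.w3.org/1999/xhtml'}

# Hypothetical file content after the headings have been converted to h4.
DOC = (
    '<html xmlns="http://www.w3.org/1999/xhtml"><body>'
    '<h4>I -- The Arrival</h4><p class="calibre3">Text.</p>'
    '<h4>II -- The Mystery Quest</h4><p class="calibre3">More text.</p>'
    '</body></html>'
)

root = ET.fromstring(DOC)

# Every h4 found is a place where a split would start a new file.
split_points = [el.text for el in root.findall('.//h:h4', XHTML_NS)]
print(split_points)
```

The body paragraphs are not selected, so only the two headings would become split points.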
12-03-2023, 12:29 PM | #12 | |
the rook, bossing Never.
Posts: 11,171
Karma: 85874891
Join Date: Jun 2017
Location: Ireland
Device: All 4 Kinds: epub eink, Kindle, android eink, NxtPaper11
|
Quote:
And calibre will split an entire ebook into a file per paragraph. It happened once when an added book had a class for the body paragraphs that looked like a chapter heading to calibre. I reverted to the original, renamed the class and reconverted. |
|
12-03-2023, 04:35 PM | #13 |
Well trained by Cats
Posts: 29,817
Karma: 54830978
Join Date: Aug 2009
Location: The Central Coast of California
Device: Kobo Libra2,Kobo Aura2v1, K4NT(Fixed: New Bat.), Galaxy Tab A
|
The OP asked in Editor, so I gave a regex solution.
AFAIK XPath is for conversion. |
12-04-2023, 10:22 AM | #14 |
Addict
Posts: 281
Karma: 2125576
Join Date: Sep 2010
Device: Kobo Forma
|
Let me rephrase what I'm looking for:
Is it possible to automatically split at multiple locations based on a text string? With all the paragraphs and chapter titles being the same <p class="calibre3">, and my programming skills without the Wizard being nil, I don't see any way to do it. The common "chapter" reference is [Roman Numeral](space)[dash][dash], which is what I've been using in the Editor Preview window to search and then manually split one chapter at a time. |
12-04-2023, 10:53 AM | #15 |
Well trained by Cats
Posts: 29,817
Karma: 54830978
Join Date: Aug 2009
Location: The Central Coast of California
Device: Kobo Libra2,Kobo Aura2v1, K4NT(Fixed: New Bat.), Galaxy Tab A
|
I gave you the tools to locate and insert a unique marker.
You then simply use the Split at Multiple Locations feature of the editor, using the marker's code as the criterion. Use the Wizard in the tool to set up the XPath with ease. |
|
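To picture the second step, where the file is cut at each inserted marker, here is a rough Python sketch. It only shows where the cuts would land in a string; it is not how the editor performs the split, and the sample body is invented.

```python
# The marker inserted by the earlier search and replace.
MARKER = '<hr class="sigil_split_marker" />'

# Hypothetical body text after the markers have been inserted.
BODY = (
    '<p class="calibre3">Front matter.</p>'
    + MARKER
    + ' <h3 class="chapno">I -- The Arrival</h3><p class="calibre3">Text.</p>'
    + MARKER
    + ' <h3 class="chapno">II -- The Mystery Quest</h3><p class="calibre3">More.</p>'
)


def split_at_markers(body):
    """Cut the text at every marker and drop any empty pieces."""
    return [part.strip() for part in body.split(MARKER) if part.strip()]


chunks = split_at_markers(BODY)
for number, chunk in enumerate(chunks, start=1):
    print(number, chunk[:45])
```

Each chunk after the first begins with its chapter heading, which is what you would expect each new file to look like after the split.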