MobileRead Forums > E-Book Software > Calibre > Editor

Old 12-02-2023, 02:30 PM   #1
gtriever
Addict
Posts: 281
Karma: 2125576
Join Date: Sep 2010
Device: Kobo Forma
Split at multiple locations : Help Needed

I have a book with one large HTML file. Common chapter numbering in the Preview window is [Roman Numeral](space)[dash][dash](space)[Subtitle], for example II -- The Mystery Quest

Can anybody give me a regex expression to split that file at multiple locations, or do I have to continue to do it all manually? So far I'm entering the [specific Roman Numeral](space)[dash][dash] as my search term and cutting it at each spot using the split-click method.
Old 12-02-2023, 02:38 PM   #2
Karellen
Wizard
Posts: 1,104
Karma: 4911876
Join Date: Sep 2021
Location: Australia
Device: Kobo Libra 2
What is the code that surrounds "II -- The Mystery Quest"?

Is it a heading, like <h2>II -- The Mystery Quest</h2>?
Old 12-02-2023, 03:20 PM   #3
gtriever
Addict
Posts: 281
Karma: 2125576
Join Date: Sep 2010
Device: Kobo Forma
Quote:
Originally Posted by Karellen View Post
What is the code that surrounds "II -- The Mystery Quest"?

Is it a heading, like <h2>II -- The Mystery Quest</h2>?
No. There isn't any separate chapter formatting; the entire file content is coded as <p class="calibre3">. Any file splitting will have to be based on the text string [Roman Numeral](space)[dash][dash].
Old 12-02-2023, 03:47 PM   #4
theducks
Well trained by Cats
Posts: 29,817
Karma: 54830978
Join Date: Aug 2009
Location: The Central Coast of California
Device: Kobo Libra2,Kobo Aura2v1, K4NT(Fixed: New Bat.), Galaxy Tab A
Is that the only place with 2 dashes in a P tag?
Search:
Code:
<p class="calibre3">(.+?\s--\s.+?)</p>
Replace:
Code:
 <hr class="sigil_split_marker" /> <h3 class="chapno">\1</h3>
I happen to use Sigil for this task, but Calibre will work fine with a small adjustment.

The marker is what you split on.
I made the heading an H3 tag (chapno is the stylesheet entry).
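As a rough illustration of what this search and replace does, here is a minimal Python sketch using the standard `re` module. The sample markup and variable names are assumptions made up for the example, and the pattern carries a closing `</p>` so that the lazy group captures the whole title rather than stopping after one character. It is not how the Calibre or Sigil editors run the search internally.

```python
import re

# Hypothetical sample markup: one chapter title and one body paragraph.
html = (
    '<p class="calibre3">II -- The Mystery Quest</p>\n'
    '<p class="calibre3">It was a dark night.</p>\n'
)

# Search pattern: a calibre3 paragraph containing " -- ", captured up to </p>.
pattern = r'<p class="calibre3">(.+?\s--\s.+?)</p>'

# Replacement: a split marker followed by a heading holding the captured text.
replacement = r'<hr class="sigil_split_marker" /> <h3 class="chapno">\1</h3>'

result = re.sub(pattern, replacement, html)
print(result)
```

Only the first paragraph is rewritten; the body paragraph has no spaced double dash, so it is left untouched.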
Old 12-02-2023, 05:47 PM   #5
gtriever
Addict
Posts: 281
Karma: 2125576
Join Date: Sep 2010
Device: Kobo Forma
Quote:
Originally Posted by theducks View Post
Is that the only place with 2 dashes in a P tag?
Search:
Code:
<p class="calibre3">(.+?\s--\s.+?)</p>
Replace:
Code:
 <hr class="sigil_split_marker" /> <h3 class="chapno">\1</h3>
I happen to use Sigil for this task, but Calibre will work fine with a small adjustment.

The marker is what you split on.
I made the heading an H3 tag (chapno is the stylesheet entry).
My apologies, but I'm embarrassingly lost by what you just wrote... in fact, in my original post I incorrectly asked for a regex expression when it should have been an XPath expression.
Old 12-02-2023, 07:23 PM   #6
Karellen
Wizard
Posts: 1,104
Karma: 4911876
Join Date: Sep 2021
Location: Australia
Device: Kobo Libra 2
Quote:
Originally Posted by gtriever View Post
My apologies, but I'm embarrassingly lost by what you just wrote... in fact, in my original post I incorrectly asked for a regex expression when it should have been an XPath expression.
What @theducks is trying to accomplish is to make your headings unique so you can easily split the pages.
Currently your headings have the same styling as your paragraphs, so there is no way to distinguish one from the other, except for the double dash/hyphen. If that double dash is used only in those headings, you can use it to capture the headings and change the coding to <h2>heading</h2>. From there you can easily split the chapters into separate files.

So insert that regex into the Search box and make sure you have selected "regex" as the search type.
Though for the replace I would use a simpler <h2>\1</h2>
Step through the first few one at a time to make sure it works correctly, then if you have a lot of chapters you can just replace all.
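Stepping through matches before using Replace All can also be mimicked outside the editor. The following is a minimal Python sketch that lists what the generic double-dash pattern would capture; the sample markup is invented for illustration and includes a body paragraph that happens to contain a spaced double dash.

```python
import re

# Hypothetical sample: two chapter titles and one body paragraph with " -- ".
html = (
    '<p class="calibre3">I -- The Beginning</p>\n'
    '<p class="calibre3">She paused -- then ran.</p>\n'
    '<p class="calibre3">II -- The Mystery Quest</p>\n'
)

pattern = re.compile(r'<p class="calibre3">(.+?\s--\s.+?)</p>')

# Collect every candidate so false positives can be spotted before replacing.
candidates = [m.group(1) for m in pattern.finditer(html)]
for text in candidates:
    print(text)
```

In this sample the body paragraph is listed alongside the two titles, which is the kind of false positive that reviewing the first few matches is meant to catch.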
Old 12-02-2023, 08:23 PM   #7
gtriever
Addict
Posts: 281
Karma: 2125576
Join Date: Sep 2010
Device: Kobo Forma
Unfortunately, the double dash is used in the paragraphs as well. Oh well... thanks to all for the replies.
Old 12-02-2023, 08:37 PM   #8
Karellen
Wizard
Posts: 1,104
Karma: 4911876
Join Date: Sep 2021
Location: Australia
Device: Kobo Libra 2
How many chapters?
Old 12-02-2023, 08:51 PM   #9
theducks
Well trained by Cats
Posts: 29,817
Karma: 54830978
Join Date: Aug 2009
Location: The Central Coast of California
Device: Kobo Libra2,Kobo Aura2v1, K4NT(Fixed: New Bat.), Galaxy Tab A
Code:
<p class="calibre3">([CLXVI]{1,7}\s--\s.+?)</p>
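My Roman numeral search added, so only paragraphs that start with a Roman numeral followed by the double dash are matched.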
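To see how the Roman numeral prefix narrows the match, here is a small Python sketch run against invented sample markup (the sample text and variable names are assumptions). A paragraph that merely contains a double dash is skipped, because the text right after the opening tag must consist of one to seven of the letters C, L, X, V, I.

```python
import re

# Hypothetical sample: two chapter titles and one body paragraph with " -- ".
html = (
    '<p class="calibre3">I -- The Beginning</p>\n'
    '<p class="calibre3">She paused -- then ran.</p>\n'
    '<p class="calibre3">II -- The Mystery Quest</p>\n'
)

# Pattern from the post: Roman numeral letters, spaced double dash, subtitle.
pattern = re.compile(r'<p class="calibre3">([CLXVI]{1,7}\s--\s.+?)</p>')

titles = [m.group(1) for m in pattern.finditer(html)]
print(titles)
```

One caveat: a body paragraph that starts with the pronoun "I" followed by a spaced double dash would still match, so a quick review of the matches remains worthwhile.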
Old 12-03-2023, 09:11 AM   #10
Quoth
the rook, bossing Never.
Posts: 11,171
Karma: 85874891
Join Date: Jun 2017
Location: Ireland
Device: All 4 Kinds: epub eink, Kindle, android eink, NxtPaper11
Quote:
Originally Posted by gtriever View Post
Unfortunately, the double dash is used in the paragraphs as well. Oh well... thanks to all for the replies.
They should use en or em dash. But irrelevant.
Old 12-03-2023, 10:19 AM   #11
phossler
Wizard
Posts: 1,076
Karma: 412718
Join Date: Jan 2009
Location: Valley Forge, PA, USA
Device: Kindle Paperwhite
Quote:
Originally Posted by gtriever View Post
No. There isn't any separate chapter formatting; the entire file content is coded as <p class="calibre3">. Any file splitting will have to be based on the text string [Roman Numeral](space)[dash][dash].
You could try Diap's Toolbag plugin to convert <p class="calibre3 ... to h4 and then use Split at Multiple locations with XPath = //h:h4
Attached thumbnails: Capture.JPG, Capture2.JPG
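For readers unfamiliar with the `//h:h4` expression, the sketch below shows what such a query selects, using Python's standard `xml.etree.ElementTree` on an invented XHTML snippet. It assumes the `h:` prefix stands for the XHTML namespace; ElementTree does not accept an absolute `//` path on an element, so the equivalent relative form `.//h:h4` is used here.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"

# Hypothetical document in which chapter titles are already h4 headings.
doc = (
    f'<html xmlns="{XHTML_NS}"><body>'
    '<h4>I -- The Beginning</h4><p>Text.</p>'
    '<h4>II -- The Mystery Quest</h4><p>More text.</p>'
    '</body></html>'
)

root = ET.fromstring(doc)

# Select every h4 element in the XHTML namespace, at any depth.
headings = root.findall(".//h:h4", {"h": XHTML_NS})
titles = [h.text for h in headings]
print(titles)
```

Each selected heading would be a split location, so the query only helps once the chapter titles carry a tag that ordinary paragraphs do not.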
Old 12-03-2023, 12:29 PM   #12
Quoth
the rook, bossing Never.
Posts: 11,171
Karma: 85874891
Join Date: Jun 2017
Location: Ireland
Device: All 4 Kinds: epub eink, Kindle, android eink, NxtPaper11
Quote:
Originally Posted by phossler View Post
You could try Diap's Toolbag plugin to convert <p class="calibre3 ... to h4 and then use Split at Multiple locations with XPath = //h:h4
But apparently ALL the regular paragraphs are <p class="calibre3 etc.

And calibre will split an entire ebook into a file per paragraph. It happened once when an added book had a class for the body paragraphs that looked like a chapter heading to calibre. I reverted to the original, renamed the class and reconverted.
Old 12-03-2023, 04:35 PM   #13
theducks
Well trained by Cats
Posts: 29,817
Karma: 54830978
Join Date: Aug 2009
Location: The Central Coast of California
Device: Kobo Libra2,Kobo Aura2v1, K4NT(Fixed: New Bat.), Galaxy Tab A
The OP asked in Editor, so I gave a regex solution.
AFAIK XPath is for conversion.
Old 12-04-2023, 10:22 AM   #14
gtriever
Addict
Posts: 281
Karma: 2125576
Join Date: Sep 2010
Device: Kobo Forma
Let me rephrase what I'm looking for:

Is it possible to automatically split at multiple locations based off of a text string?

With all the paragraphs and chapter titles being the same <p class = "calibre3">, and my programming skills without the Wizard being nil, I don't see any way to do it.

The common "chapter" reference is [Roman Numeral](space)[dash][dash], which is what I've been using in the Editor Preview window to search and then manually split one chapter at a time.
Old 12-04-2023, 10:53 AM   #15
theducks
Well trained by Cats
Posts: 29,817
Karma: 54830978
Join Date: Aug 2009
Location: The Central Coast of California
Device: Kobo Libra2,Kobo Aura2v1, K4NT(Fixed: New Bat.), Galaxy Tab A
I gave you the tools to locate and insert a unique marker.

You then simply use the split at multiple locations feature of the editor, using the marker's code as the criterion.
Use the Wizard in the tool to set up the XPath with ease.
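The idea of splitting at markers can be sketched in plain Python with `xml.etree.ElementTree`. Everything here is illustrative: the sample document is invented, the marker class is the `sigil_split_marker` from the earlier replacement, and the chunking loop is a stand-in for the concept rather than a description of what the editor does internally (for instance, this sketch simply drops the marker elements).

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"
ns = {"h": XHTML_NS}

# Hypothetical document after the regex replace has inserted the markers.
doc = (
    f'<html xmlns="{XHTML_NS}"><body>'
    '<p>Front matter.</p>'
    '<hr class="sigil_split_marker" /><h3 class="chapno">I -- The Beginning</h3>'
    '<p>Text.</p>'
    '<hr class="sigil_split_marker" /><h3 class="chapno">II -- The Mystery Quest</h3>'
    '<p>More text.</p>'
    '</body></html>'
)

root = ET.fromstring(doc)

# An XPath-style query that selects only the marker elements.
markers = root.findall(".//h:hr[@class='sigil_split_marker']", ns)

# Walk the body and start a new chunk each time a marker is met.
body = root.find("h:body", ns)
chunks = [[]]
for child in body:
    is_marker = (child.tag == f"{{{XHTML_NS}}}hr"
                 and child.get("class") == "sigil_split_marker")
    if is_marker:
        chunks.append([])  # the marker itself is dropped in this sketch
    else:
        chunks[-1].append(child)

for i, chunk in enumerate(chunks):
    print(i, [el.text for el in chunk])
```

Because the query keys on the marker's class, ordinary `calibre3` paragraphs never trigger a split, which is the point of inserting a unique marker first.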
All times are GMT -4.