Old 08-14-2023, 07:52 AM   #1
citrate
Junior Member
Posts: 4
Karma: 10
Join Date: Aug 2023
Device: BOOX Note 5+
TXT to EPUB: How to Make Chapter Markers as Titles

Hello there. I have a novel in .txt format that I want to convert to .epub. Since the original text has no markup on its chapter markers (no leading hashes, no <h2> tags), I used an XPath expression with a regular expression (`//*[re:test(., "<regexp>", "i")]`) in Structure detection to find all the chapters.

The chapters are detected correctly; however, the chapter markers are treated as ordinary text. Is it possible to make calibre treat the detected chapter markers as titles/headers?

For example, if I have something like:

```
Chapter 1 Chapter Name

some text some text
```

I want it to become something like

```
# Chapter 1 Chapter Name

some text some text
```

in the .epub output. That is, I want the chapter marker/title formatted differently from ordinary text.
Old 08-14-2023, 08:02 AM   #2
kovidgoyal
creator of calibre
Posts: 45,598
Karma: 28548962
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
TXT is internally converted to HTML, and that is what the expression applies to. You can see the HTML using the debug section of the conversion dialog.
Old 08-14-2023, 08:11 AM   #3
JSWolf
Resident Curmudgeon
Posts: 80,675
Karma: 150249619
Join Date: Nov 2006
Location: Roslindale, Massachusetts
Device: Kobo Libra 2, Kobo Aura H2O, PRS-650, PRS-T1, nook STR, PW3
What novel is this?
Old 08-14-2023, 08:39 AM   #4
citrate
Junior Member
Posts: 4
Karma: 10
Join Date: Aug 2023
Device: BOOX Note 5+
Quote:
Originally Posted by kovidgoyal View Post
TXT is internally converted to HTML, and that is what the expression applies to. You can see the HTML using the debug section of the conversion dialog.
Yes, the .txt file is internally converted to an HTML file. It becomes something like

```
...
<p>final sentence of last chapter</p>
<p class="whitespace" style="text-align:center; margin-top:0em; margin-bottom:0em">*</p>
<p>Chapter X</p>
<p>first sentence of current chapter</p>
...
```

After structure detection, it becomes

```
...
<p>final sentence of last chapter</p>
<p class="whitespace" style="text-align:center; margin-top:0em; margin-bottom:0em">*</p>
<div style="display: block; page-break-after: always"></div>
<p id="calibre_toc_X">Chapter X</p>
<p>first sentence of current chapter</p>
...
```

But what I want is something like

```
...
<p>final sentence of last chapter</p>
<p class="whitespace" style="text-align:center; margin-top:0em; margin-bottom:0em">*</p>
<div style="display: block; page-break-after: always"></div>
<h2 id="calibre_toc_X">Chapter X</h2>
<p>first sentence of current chapter</p>
...
```

Note the Chapter X line: I want it to become a heading. Is it possible to achieve this in calibre?
Old 08-14-2023, 08:41 AM   #5
citrate
Junior Member
Posts: 4
Karma: 10
Join Date: Aug 2023
Device: BOOX Note 5+
Quote:
Originally Posted by JSWolf View Post
What novel is this?
Well... It's a Chinese web novel, and the text I provided is just an example of what I want rather than its real content.
Old 08-14-2023, 08:55 AM   #6
kovidgoyal
creator of calibre
Posts: 45,598
Karma: 28548962
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
You want something like

Code:
//h:p[re:test(., '^Chapter.*', 'i')]
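
To make it concrete what this kind of expression selects, here is a rough sketch in plain Python. It is not calibre's code: it only imitates the idea of "every `<p>` in the XHTML namespace (the `h:` prefix) whose text matches the pattern, ignoring case". The sample markup and the function name `matching_paragraphs` are made up for illustration, and the real intermediate HTML (visible through the debug option) may look different.

```python
import re
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"

# Hypothetical sample of the intermediate XHTML; real calibre output will differ.
SAMPLE = f"""<html xmlns="{XHTML_NS}"><body>
<p>final sentence of last chapter</p>
<p>Chapter 1 Chapter Name</p>
<p>The next chapter was long.</p>
</body></html>"""


def matching_paragraphs(xhtml, pattern=r"^Chapter.*"):
    """Return the text of every XHTML <p> whose text matches pattern (case-insensitive)."""
    root = ET.fromstring(xhtml)
    regex = re.compile(pattern, re.IGNORECASE)
    hits = []
    # Only <p> elements in the XHTML namespace are considered, like h:p in the XPath.
    for p in root.iter(f"{{{XHTML_NS}}}p"):
        text = "".join(p.itertext())
        if regex.search(text):
            hits.append(text)
    return hits


print(matching_paragraphs(SAMPLE))  # ['Chapter 1 Chapter Name']
```

Only the paragraph that starts with "Chapter" is selected; the other two contain the word but do not begin with it, so the `^` anchor rejects them.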
Old 08-14-2023, 09:05 AM   #7
citrate
Junior Member
Posts: 4
Karma: 10
Join Date: Aug 2023
Device: BOOX Note 5+
Quote:
Originally Posted by kovidgoyal View Post
You want something like

Code:
//h:p[re:test(., '^Chapter.*', 'i')]
Well, it doesn't work. I think the XPath expression above means "find all <p> elements whose text matches the regular expression", but what I want is:
  1. Find all text matching the regular expression
  2. Change the <p> tag of the matching text to an <h> tag
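
The two steps above can be sketched outside calibre as a small post-processing script. This is only an illustration of the idea, not a calibre feature: the function name `promote_chapter_paragraphs`, the default pattern, and the sample markup are all assumptions, and it expects well-formed XML without a namespace.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical fragment shaped like the HTML shown earlier in the thread.
SAMPLE = (
    "<body>"
    "<p>final sentence of last chapter</p>"
    '<p id="calibre_toc_1">Chapter 1 Chapter Name</p>'
    "<p>first sentence of current chapter</p>"
    "</body>"
)


def promote_chapter_paragraphs(xhtml, pattern=r"^Chapter\s+\d+", heading="h2"):
    """Step 1: find <p> elements whose text matches pattern.
    Step 2: rename those elements to the given heading tag, keeping attributes."""
    root = ET.fromstring(xhtml)
    regex = re.compile(pattern, re.IGNORECASE)
    # Collect the elements first so renaming does not interfere with iteration.
    for el in list(root.iter("p")):
        text = "".join(el.itertext()).strip()
        if regex.search(text):
            el.tag = heading
    return ET.tostring(root, encoding="unicode")


print(promote_chapter_paragraphs(SAMPLE))
```

The `id` attribute is carried over unchanged, so an anchor such as `calibre_toc_1` would still point at the renamed element.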
Old 08-14-2023, 09:23 AM   #8
kovidgoyal
creator of calibre
Posts: 45,598
Karma: 28548962
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
You cannot change tags, but using the correct expression will allow you to mark those tags as chapter starts with either a pagebreak or a rule above.
Old 08-15-2023, 01:56 AM   #9
Ghostcat
Connoisseur
Posts: 63
Karma: 582370
Join Date: Apr 2023
Device: Kobo Clara 2E
Why not use Search & Replace to change <p> to <h2>?

E.g.:

search for "<p>(Chapter [0-9]+)</p>"

replace with "<h2>\1</h2>"
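
A quick way to check a rule like this before running a conversion is to try it with Python's `re` module, which is the regex flavour calibre's search and replace is understood to use. The sketch below is only a test harness: `apply_rule` is a made-up helper, and the broader pattern is a hypothetical variant for paragraphs that carry attributes or a chapter name after the number.

```python
import re

# The rule exactly as suggested above.
SEARCH = r"<p>(Chapter [0-9]+)</p>"
REPLACE = r"<h2>\1</h2>"

# Hypothetical broader variant: tolerates attributes on <p> and text after the number.
BROAD_SEARCH = r"<p\b([^>]*)>(Chapter [0-9]+[^<]*)</p>"
BROAD_REPLACE = r"<h2\1>\2</h2>"


def apply_rule(html, search=SEARCH, replace=REPLACE):
    """Apply one search-and-replace rule to a string of HTML."""
    return re.sub(search, replace, html)


print(apply_rule("<p>Chapter 12</p>"))
# <h2>Chapter 12</h2>

print(apply_rule("<p>Chapter 1 Chapter Name</p>"))
# <p>Chapter 1 Chapter Name</p>  (unchanged: the rule expects only a number)

print(apply_rule('<p id="calibre_toc_1">Chapter 1 Chapter Name</p>',
                 BROAD_SEARCH, BROAD_REPLACE))
# <h2 id="calibre_toc_1">Chapter 1 Chapter Name</h2>
```

The original rule only fires on a bare `<p>` holding "Chapter" plus a number, so titles of the form "Chapter 1 Chapter Name" need the pattern widened along these lines.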