Old 01-24-2023, 06:45 PM   #1
rosewood
Member
rosewood began at the beginning.
 
Posts: 14
Karma: 10
Join Date: Jan 2023
Device: fire hd 10
Problems with Chapter detection

Hello,
I'm using Calibre V6.11 to convert plain text files to AZW3. The chapter heading in the text file typically looks like:
<h1> CHAPTER 13 Blue Skies by Joe Bloggs </h1>
The 'h1' tag appears ONLY in the chapter headings as given above. In the Calibre conversion options, these are my settings (I've simplified them to the barest minimum):

Structure Detection -> Detect chapters at = //h:h1
All other entries on the Structure Detection page are blank.

Table of Contents -> Force use of auto-generated Table of Contents = checked
Number of links to add to Table of Contents = 70
Chapter threshold = 70

Level1 ToC = //h:h1

all other entries on the page are blank.

Yet despite the apparent simplicity and lack of ambiguity of these instruction entries, the chapter detection fails. Examining the html files shows this for the chapter heading:

<p class="calibre1">&lt;h1&gt; CHAPTER 13 Blue Skies by Joe Bloggs &lt;/h1&gt;</p>

Please tell me why the conversion is not obeying my instructions.

Many thanks in advance!
Old 01-24-2023, 08:07 PM   #2
jhowell
Grand Sorcerer
Posts: 7,059
Karma: 91577715
Join Date: Nov 2011
Location: Charlottesville, VA
Device: Kindles
In a plain text file, the string "<h1>" is not markup; it means nothing more than "display exactly those characters". Because angle brackets are special characters in HTML, the conversion escapes them as &lt; and &gt; so that they display literally.

Perhaps try using markdown or HTML instead of plain text.
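
To illustrate the escaping described above, here is a minimal Python sketch. It is not Calibre's actual conversion code; it only uses the standard library's html.escape to show why a literal "<h1>" in plain text ends up as &lt;h1&gt; in the generated HTML. The "calibre1" class name is copied from the output quoted in post #1.

```python
from html import escape

# A chapter heading exactly as it appears in the plain text file.
heading = "<h1> CHAPTER 13 Blue Skies by Joe Bloggs </h1>"

# Plain text is treated as literal characters, so the angle brackets
# are escaped rather than interpreted as an HTML tag.
escaped = escape(heading)

# Wrap the escaped text in a paragraph, similar to the output in post #1.
print(f'<p class="calibre1">{escaped}</p>')
```

The result is an ordinary paragraph, so an XPath expression such as //h:h1 has no h1 element to match.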
Old 01-25-2023, 09:21 AM   #3
rosewood
Thank you for your quick and very helpful response jhowell!

I will try out your suggestions and hopefully overcome my problem.
Old 01-25-2023, 02:12 PM   #4
rosewood
Attn jhowell
I'm having a hard time finding a good offline text-to-HTML converter utility. How does one make a request to extend Calibre's functionality, specifically to have a particular sequence of characters in text files that Calibre would translate into a chapter in the ToC? For example, &H^6 Chapter 13 Blue skies /&H^6, where &H^6 marks the start of a chapter heading and /&H^6 is its matching terminator. Doing this would avoid the wasteful intermediate step of converting text to HTML, a format that gets discarded upon conversion to AZW3.
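
For anyone looking for an offline route in the meantime, the marker idea above can be sketched as a small Python script. This is only an illustration, not a Calibre feature: the function name text_to_html is hypothetical, the &H^6 and /&H^6 markers are the ones proposed in this post, and it assumes paragraphs are separated by blank lines.

```python
import re
from html import escape

def text_to_html(text: str, open_marker: str = "&H^6",
                 close_marker: str = "/&H^6") -> str:
    """Convert marked-up plain text to minimal HTML (hypothetical helper)."""
    body = []
    # Assumption: paragraphs are separated by one or more blank lines.
    for block in re.split(r"\n\s*\n", text.strip()):
        para = " ".join(block.split())  # collapse line breaks and extra spaces
        if para.startswith(open_marker) and para.endswith(close_marker):
            # Strip both markers and emit a real heading element.
            title = para[len(open_marker):-len(close_marker)].strip()
            body.append(f"<h1>{escape(title)}</h1>")
        else:
            body.append(f"<p>{escape(para)}</p>")
    return "<html><body>\n" + "\n".join(body) + "\n</body></html>\n"

sample = "&H^6 Chapter 13 Blue skies /&H^6\n\nIt was a <fine> day.\n"
print(text_to_html(sample))
```

The resulting HTML file could then be added to Calibre, where //h:h1 would have real h1 elements to match.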
Old 01-25-2023, 02:59 PM   #5
Quoth
Still reading
Posts: 13,858
Karma: 103895653
Join Date: Jun 2017
Location: Ireland
Device: All 4 Kinds: epub eink, Kindle, android eink, NxtPaper
Use LibreOffice (LO) Writer: edit the text and save it as ODT, then do an extra Save As to DOCX and add the DOCX to Calibre. All paragraph styles are mapped to CSS. Text at the body level is mapped to <p>, and the other levels are mapped to <h1>, <h2>, etc.
Or add the text file to Calibre and use the Calibre Editor to make headings, etc.
Old 01-25-2023, 04:13 PM   #6
rosewood
Thank you, Quoth. It's downloading as I type this. I will follow your advice and see how I do.
But I still think that a special signal word in .txt files to indicate chapters, as detailed in post #4, would be a great addition to the next version of Calibre.
Old 01-25-2023, 04:18 PM   #7
BetterRed
null operator (he/him)
Posts: 21,683
Karma: 29711016
Join Date: Mar 2012
Location: Sydney Australia
Device: none
Calibre can convert TXT to your preferred ebook-format, see ==>> https://manual.calibre-ebook.com/con...-specific-tips

Scroll down to TXT, and have a look at Formatting style: Markdown

BR
Old 01-25-2023, 06:56 PM   #8
rosewood
Thank you BetterRed. I used an online txt to markdown converter (https://products.aspose.app/words/conversion/txt-to-md) to convert a couple of paragraphs of text into markdown and found the output indistinguishable from the input.

I then read the section on headers in the Markdown syntax link

https://daringfireball.net/projects/markdown/syntax

I found that link on the page which you referred me to:

https://manual.calibre-ebook.com/con...-specific-tips

I have copy/pasted this header section (in bold italics) below:

Headers

Markdown supports two styles of headers, Setext and atx.

Setext-style headers are “underlined” using equal signs (for first-level headers) and dashes (for second-level headers). For example:

This is an H1
=============

This is an H2
-------------

Any number of underlining =’s or -’s will work.

Atx-style headers use 1-6 hash characters at the start of the line, corresponding to header levels 1-6. For example:

# This is an H1

## This is an H2

###### This is an H6

Optionally, you may “close” atx-style headers. This is purely cosmetic — you can use this if you think it looks better. The closing hashes don’t even need to match the number of hashes used to open the header. (The number of opening hashes determines the header level.) :

# This is an H1 #

## This is an H2 ##

### This is an H3 ######
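
The atx rules quoted above can be checked with a short Python sketch. It is only a rough illustration of those rules (1-6 opening hashes set the level, closing hashes are cosmetic); it is not how Calibre or any particular Markdown library parses headers, and the name parse_atx_header is made up for this example.

```python
import re
from typing import Optional, Tuple

# 1-6 opening hashes, whitespace, the title, then optional closing hashes.
ATX = re.compile(r"^(#{1,6})\s+(.*?)(?:\s+#+)?\s*$")

def parse_atx_header(line: str) -> Optional[Tuple[int, str]]:
    """Return (level, title) for an atx-style header line, else None."""
    m = ATX.match(line)
    if m is None:
        return None
    # The level comes from the opening hashes only; closing ones are ignored.
    return len(m.group(1)), m.group(2)

for line in ["# This is an H1 #", "### This is an H3 ######",
             "## Chapter 13 Blue Skies", "Just a paragraph"]:
    print(repr(line), "->", parse_atx_header(line))
```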


So, would I be right in saying that if the chapter name string in my text file is preceded by ##, the Calibre preprocessor will add this name to its table of contents? Or would I have to rename the file from filename.txt to filename.md for the preprocessor to include the "## Chapter Name" string in the ToC?

Last edited by rosewood; 01-25-2023 at 06:59 PM.
Old 01-25-2023, 07:48 PM   #9
rosewood
Thank you all!

I followed your instructions: I prefixed each chapter name string with ## (e.g. ## Chapter 13 Blue Skies),
set the XPath detection expression to “//h:h2”, and kept the filename as filename.txt.

Calibre then correctly included the chapter name strings in the Table of Contents.
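
For files that already contain many headings in the <h1> ... </h1> style from post #1, the manual edit could be automated with something like the Python sketch below. It is an assumption-laden helper, not part of Calibre: the function name tags_to_markdown is hypothetical, and it assumes each heading sits on its own line.

```python
import re

# Matches a whole line of the form "<h1> some heading </h1>".
H1_LINE = re.compile(r"^\s*<h1>\s*(.*?)\s*</h1>\s*$", re.IGNORECASE)

def tags_to_markdown(text: str, level: int = 2) -> str:
    """Rewrite <h1>...</h1> lines as Markdown atx headers of the given level."""
    prefix = "#" * level
    out = []
    for line in text.splitlines():
        m = H1_LINE.match(line)
        # Heading lines become e.g. "## Title"; other lines pass through.
        out.append(f"{prefix} {m.group(1)}" if m else line)
    return "\n".join(out) + "\n"

sample = "<h1> CHAPTER 13 Blue Skies by Joe Bloggs </h1>\nIt was a fine day.\n"
print(tags_to_markdown(sample))
```

With level=2 the output matches the ## prefix used above, so the //h:h2 expression applies unchanged.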
Tags
structure detection, table of contents

