#1
Junior Member
Posts: 3
Karma: 10
Join Date: Sep 2015
Device: none
detecting chapters in plain txt
Hi,
I would like to convert a plain (ASCII) text file to EPUB and generate a TOC. As this is a text file, XPath does not make much sense (at least to me), but I can construct a regex that allows chapters to be detected. How can I teach calibre to use this regex to detect chapters when converting plain text to EPUB? Many thanks!
#2
Well trained by Cats
Posts: 31,047
Karma: 60358908
Join Date: Aug 2009
Location: The Central Coast of California
Device: Kobo Libra2,Kobo Aura2v1, K4NT(Fixed: New Bat.), Galaxy Tab A
The default XPath looks for these keywords:
((chapter|book|section|part)\s+)|(prolog|prologue|epilogue)
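To see which lines a keyword pattern like this would catch, here is a minimal Python sketch. The balanced pattern below is an assumption based on how I recall calibre's default, not a copy from its source. `looks_like_chapter` is a hypothetical helper, and `re.search` stands in for the `re:test` function used inside the XPath.

```python
import re

# Assumed, balanced form of the keyword pattern (not copied from calibre).
CHAPTER_RE = re.compile(
    r"\s*((chapter|book|section|part)\s+)|((prolog|prologue|epilogue)(\s+|$))",
    re.IGNORECASE,
)


def looks_like_chapter(heading: str) -> bool:
    """Return True if the heading text contains one of the chapter keywords."""
    return CHAPTER_RE.search(heading) is not None


samples = ["Chapter 1", "PART TWO", "Epilogue", "1", "It was a dark night."]
for s in samples:
    print(f"{s!r}: {looks_like_chapter(s)}")
```

Note that the pattern only applies to the text of elements the XPath has already selected (headings, in the default case), so a matching line in a plain paragraph would not be picked up by it.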
#3
Connoisseur
Posts: 69
Karma: 10
Join Date: Apr 2013
Device: Kobo Clara, Onyx Boox Monte Cristo
Quote:
I know the above is true for lines consisting only of a number, or of the word Chapter or Part followed by a single word or a number. If the last character of such a line is a period, it is treated like any other sentence in the text.
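A rough model of the behaviour described here, as a Python sketch. This is only my reading of the rule as stated in this post, not calibre's actual heuristics code; `HEADING_RE` and `is_heading_candidate` are hypothetical names.

```python
import re

# A line that is only a number, or "Chapter"/"Part" followed by exactly one
# word or number. Anything after that (including a trailing period) fails.
HEADING_RE = re.compile(r"^\s*(\d+|(chapter|part)\s+\w+)\s*$", re.IGNORECASE)


def is_heading_candidate(line: str) -> bool:
    """Return True if the line fits the heading rule described above."""
    return HEADING_RE.match(line) is not None


for line in ["12", "Chapter 1", "Part Two", "Chapter 1.", "Chapter One Two"]:
    print(f"{line!r}: {is_heading_candidate(line)}")
```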
#4
creator of calibre
Posts: 45,345
Karma: 27182818
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
You can use a regex inside an XPath expression. And, as documented here: http://manual.calibre-ebook.com/conv...l#introduction
you need to look at the HTML produced by the intermediate steps in the conversion to figure out what XPath to use.
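As an illustration of a regex inside an XPath expression, here is a small Python sketch. It assumes the intermediate HTML puts each line of the text file in its own `<p>` element, which may not match your book, so check the real intermediate output first. The `h:` prefix and `re:test` follow the style used in the calibre manual's XPath tutorial. The standard library cannot evaluate `re:test`, so the hypothetical helper `matching_paragraphs` emulates what the expression would select.

```python
import re
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"

# Hypothetical intermediate HTML; the real output of the conversion may differ.
SAMPLE = f"""<html xmlns="{XHTML_NS}"><body>
<p>CHAPTER 1</p>
<p>It was a dark night.</p>
<p>Chapter 2</p>
<p>The chapter 3 draft was lost.</p>
</body></html>"""

PATTERN = r"^\s*chapter\s+\d+\s*$"

# The kind of expression one might try for chapter detection (an assumption).
CHAPTER_XPATH = f"//h:p[re:test(., '{PATTERN}', 'i')]"


def matching_paragraphs(xhtml: str, pattern: str) -> list[str]:
    """Emulate the XPath: return the text of every <p> matching the regex."""
    root = ET.fromstring(xhtml)
    regex = re.compile(pattern, re.IGNORECASE)
    hits = []
    for p in root.iter(f"{{{XHTML_NS}}}p"):
        text = "".join(p.itertext())
        if regex.search(text):
            hits.append(text.strip())
    return hits


chapters = matching_paragraphs(SAMPLE, PATTERN)
print(CHAPTER_XPATH)
print(chapters)
```

Anchoring the regex with `^` and `$` keeps ordinary sentences that merely mention a chapter from being detected.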
#5
Ex-Helpdesk Junkie
Posts: 19,421
Karma: 85400180
Join Date: Nov 2012
Location: The Beaten Path, USA, Roundworld, This Side of Infinity
Device: Kindle Touch fw5.3.7 (Wifi only)
Quote:
The XPath sees an h1 tag.
#6
Connoisseur
Posts: 69
Karma: 10
Join Date: Apr 2013
Device: Kobo Clara, Onyx Boox Monte Cristo
Quote:
Quote:
#7
Ex-Helpdesk Junkie
Posts: 19,421
Karma: 85400180
Join Date: Nov 2012
Location: The Beaten Path, USA, Roundworld, This Side of Infinity
Device: Kindle Touch fw5.3.7 (Wifi only)
Quote:
Because XPath is where you would put a regex. This is beside the fact that your mention of heuristics operating on the text does not, in fact, help unless one is willing to rewrite the book. As I said -- that is markdown-to-html: converting from one format to another. You have to have the book conform to those heuristics, not the other way around.
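One way to make the book conform without rewriting it by hand is to pre-process the text file with your own regex. The sketch below is an assumption-laden example: it turns each matching line into a Markdown `# ` heading, on the assumption that the file is then converted with Markdown formatting so those lines become h1 tags. `mark_chapters` is a hypothetical helper, not part of calibre.

```python
import re


def mark_chapters(text: str, pattern: str) -> str:
    """Prefix every line matching the pattern with '# ' (a Markdown h1)."""
    regex = re.compile(pattern)
    out = []
    for line in text.splitlines():
        if regex.match(line):
            # Blank lines around the heading keep it a separate block.
            out.extend(["", "# " + line.strip(), ""])
        else:
            out.append(line)
    return "\n".join(out)


book = "Chapter 1\nIt was a dark night.\nChapter 2\nMorning came."
print(mark_chapters(book, r"^\s*Chapter\s+\d+\s*$"))
```

The resulting h1 tags are then something a heading-based XPath can see.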
#8
Connoisseur
Posts: 69
Karma: 10
Join Date: Apr 2013
Device: Kobo Clara, Onyx Boox Monte Cristo
Quote:
As has been said before, there is more than one way to skin a cat.