Old 06-01-2010, 07:01 PM   #1
gandor62
Connoisseur
Posts: 99
Karma: 14
Join Date: Jun 2008
Location: Brisbane Australia
Device: iPod Touch, Ipad, Kindle 2
Getting Chapters detected Properly

I have a book in EPUB format and am not getting a proper table of contents.

The text in the book has the words "Chapter One" where the chapters start. What setting in Calibre would I need to set up to get the chapters detected correctly? I am sure it can be done.

Help would be awesome

Thanks
Tony
Old 06-01-2010, 07:09 PM   #2
Stinger
Asha'man
Posts: 335
Karma: 844
Join Date: May 2010
Location: Canada
Device: Kobo
Have you tried simply selecting the checkbox 'Force use of auto-generated TOC'? (It's under Table of Contents in the conversion settings.)

If your chapter headings are in the form 'Chapter X', then Calibre should pick them up properly using the default XPATH it uses. I've seen many EPUBs that already have a dummy TOC with only the book title in it, which Calibre won't overwrite unless you check the box I mentioned.

If the above doesn't work, report back and someone here will help you set up an XPATH expression that works with your specific HTML structure.
Old 06-01-2010, 07:33 PM   #3
gandor62
I did check the auto-generated Table of Contents option, but it still failed. I wondered if there should be anything in the "TOC Filter, Level 1 TOC (xpath Expression):" fields. Levels 1, 2 and 3 are blank.

I'm investigating Xpath but I think it may be beyond me.
Old 06-01-2010, 07:40 PM   #4
gandor62
In the text of the Book I have this

CHAPTER ONE There was a faint smell of oil, turpentine and beeswax in the shop

I do not get a table of contents at all. I have checked auto-generate, but when I view the book in Calibre the TOC button does nothing.

The original file was a text file that I converted to EPUB, so I'm guessing that there are no headers (H1 or anything) in the text. I am assuming that this may be the problem. I suppose I was hoping that Calibre would read the text looking for the word "Chapter", assume it was a break, and insert the headers for me.

I am not sure what to do to get the TOC to generate.

I have the file open in an editor and the following is what is there

<p>CHAPTER ONE There was a faint smell of oil, turpentine and beeswax in the shop

I see the problem now.

I think I need to put a header or something in to tell Calibre that there is a chapter break.

Does anyone have a clue as to what should be there? There are 20 chapters in this book, so manually editing should not be a problem.

Last edited by gandor62; 06-01-2010 at 08:06 PM.
Old 06-01-2010, 08:08 PM   #5
Stinger
Yeah, since you converted from a TXT file, the heading tags that make auto-generating the TOC pretty simple are not present.

Quote:
CHAPTER ONE There was a faint smell of oil, turpentine and beeswax in the shop
From this it looks like there isn't even a paragraph break between 'CHAPTER ONE' and that first sentence, which is just a bitch. You might be forced to manually open up the EPUB in Sigil, add a line break after the chapter heading, and then change the style of that heading to 'Heading #'. (You can't expect much from converting a plain text file.)

However, since 'CHAPTER ONE' is capitalized, there might be an XPATH statement that could key on the all caps and the word Chapter. Alas, the complexities of XPATH are something I'm still trying to learn myself, so hopefully someone more versed in that will pipe in here.

Quote:
I wondered if there should be anything in the "TOC Filter, Level 1 TOC (xpath Expression):" Fields. 1,2,3 are blank.
If you leave those fields blank, Calibre will use the default XPATH, which won't work for you since you have no H1 or H2 tags around your chapter headings. This is from the Calibre user manual talking about the default chapter detection it uses:

Quote:
By default, calibre uses the following expression for chapter detection:

//*[((name()='h1' or name()='h2') and re:test(., 'chapter|book|section|part\s+', 'i')) or @class = 'chapter']

This expression is rather complex, because it tries to handle a number of common cases simultaneously. What it means is that calibre will assume chapters start at either <h1> or <h2> tags that have any of the words (chapter, book, section or part) in them or that have the class="chapter" attribute.
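
To see why the <p> paragraph from post #4 is skipped, here is a rough Python sketch that mimics the quoted expression using only the standard library. It is not calibre's own code: `looks_like_chapter`, `find_chapters` and the sample markup are made up for illustration, and the sample leaves out the XHTML namespace that a real EPUB file would carry.

```python
import re
import xml.etree.ElementTree as ET

# Same alternation as in the expression quoted above; the 'i' flag maps to IGNORECASE.
CHAPTER_WORDS = re.compile(r"chapter|book|section|part\s+", re.IGNORECASE)


def looks_like_chapter(elem):
    """Approximate the quoted default test for a single element."""
    is_heading = elem.tag in ("h1", "h2")
    text = "".join(elem.itertext())
    has_keyword = CHAPTER_WORDS.search(text) is not None
    return (is_heading and has_keyword) or elem.get("class") == "chapter"


def find_chapters(xhtml):
    """Return (tag, text) for every element the approximated test accepts."""
    root = ET.fromstring(xhtml)
    return [
        (elem.tag, "".join(elem.itertext()).strip())
        for elem in root.iter()
        if looks_like_chapter(elem)
    ]


# Hypothetical sample: a run-in heading inside <p>, a real <h2>, a class="chapter"
# paragraph, and an <h2> that contains none of the keywords.
SAMPLE = """<body>
<p>CHAPTER ONE There was a faint smell of oil</p>
<h2>CHAPTER TWO</h2>
<p class="chapter">Epilogue</p>
<h2>Acknowledgements</h2>
</body>"""

chapters = find_chapters(SAMPLE)
print(chapters)
```

Only the <h2> containing the word "chapter" and the class="chapter" paragraph are reported; the plain <p> holding CHAPTER ONE fails the h1/h2 test, which is the situation described in this thread.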

Last edited by Stinger; 06-01-2010 at 08:11 PM.
Old 06-01-2010, 08:14 PM   #6
Stinger
Quote:
<p>CHAPTER ONE There was a faint smell of oil, turpentine and beeswax in the shop

I see the problem now.

I think I need to put a header or something in to tell Calibre that there is a chapter break.

Does anyone have a clue as to what should be there? There are 20 chapters in this book, so manually editing should not be a problem.
Nice. If you have no issues manually editing it, here is what you need to do for Calibre to autodetect the chapter headings by default. Just take the 'CHAPTER ONE' text out of that P tag and put an H2 tag around it, so it looks like this:

<h2>CHAPTER ONE</h2>
<p>There was a faint smell of oil, turpentine and beeswax in the shop ....

NOTE: When you try to reconvert the file, Calibre might default back to TXT->EPUB conversion, so be sure to select EPUB as the input format (top left-hand side of the conversion window). Otherwise Calibre might just reconvert the text file like it did the first time, overwriting all your changes.

ALSO: Like I mentioned above, you can use Sigil to make these manual changes if you don't want to fudge around with tags and such in a text editor. With Sigil, you can just highlight the chapter headings and apply the H2 style via the GUI. Sigil will automatically add these entries to the TOC as well, so you won't need to reconvert after fixing it.
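
For a book with many chapters, the same h2 edit could in principle be scripted instead of done by hand. The following is only a sketch under assumptions: it works on the XHTML text after it has been pulled out of the EPUB, it assumes every heading is the word CHAPTER plus one or more ALL-CAPS words at the very start of a bare <p>, and `split_run_in_headings` is a made-up helper name. A body sentence that opens with a one-letter capital word such as "A" or "I" would be swallowed into the heading, so check the result by eye.

```python
import re

# Assumption: heading = "CHAPTER" followed by ALL-CAPS words (hyphens allowed),
# and the body text that follows starts with a mixed-case word such as "There".
RUN_IN_HEADING = re.compile(r"<p>\s*(CHAPTER(?:\s+[A-Z]+(?:-[A-Z]+)*\b)+)\s+")


def split_run_in_headings(xhtml):
    """Move a leading 'CHAPTER ...' run out of its <p> and into its own <h2>."""
    return RUN_IN_HEADING.sub(r"<h2>\1</h2>\n<p>", xhtml)


before = "<p>CHAPTER ONE There was a faint smell of oil, turpentine and beeswax in the shop</p>"
after = split_run_in_headings(before)
print(after)
```

The word-boundary check is what stops the match at ONE: the capital T of "There" is followed by a lowercase letter, so it is not treated as part of the all-caps heading.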

Last edited by Stinger; 06-01-2010 at 08:21 PM.
Old 06-01-2010, 08:33 PM   #7
gandor62
OK that was kewl. A crash course in Sigil to boot. Worked a treat. I'm a happy little Vegemite now.

Thanks Stinger, awesome help there mate.
Old 06-01-2010, 11:37 PM   #8
tonyx3
Connoisseur
Posts: 55
Karma: 10
Join Date: Jan 2010
Device: Nexus One
Quote:
Originally Posted by Stinger View Post
Alas, the complexities of XPATH are something I'm still trying to learn myself, so hopefully someone more versed in that will pipe in here.
Yeah, sometimes being forced to use the XPATH for chapter headings is a real pain.

Most of the time I wish I could just use a straight regex, like in the header removal. It's usually pretty easy to write a regex that only matches the chapter headings, but often it won't mesh properly with the XPATH or something, and returns an error.

I see the value of the XPATH to match multiple attributes, but in some circumstances, a plain regex match would be much easier.

I wish there was an option to use either the XPATH, or a straight regex.
Old 06-02-2010, 12:02 AM   #9
jackie_w
Wizard
Posts: 2,883
Karma: 4200035
Join Date: Sep 2009
Location: UK
Device: Sony PRS-350, PB360, Kobo Glo/AuraHD/Aura6"/AuraH2O
A little crash course in the use of "Markdown" when converting TXT files may also be helpful to you in future.

See Calibre User Manual http://calibre-ebook.com/user_manual...-specific-tips
e.g. a short relevant snippet ...

Quote:
Marking chapter headings with a leading # and setting the chapter XPath detection expression to “//h:h1” is the easiest way to have a proper table of contents generated from a TXT document.
i.e. Set your conversion "chapter detection" to "//h:h1" and manually edit your TXT to be
Code:
# CHAPTER ONE

There was a faint smell ...
or if you prefer, set your conversion "chapter detection" to "//h:h2" and manually edit your TXT to be
Code:
## CHAPTER ONE

There was a faint smell ...
"Markdown" has lots of possibilities if you have some time to investigate.
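
If hand-editing every heading in the TXT is too tedious, the '#' markers could be added by a small script before conversion. This is a sketch, not part of calibre: `mark_headings` is a hypothetical helper, and it assumes each chapter heading sits on its own line and begins with the word "chapter" in any case. A body line that happens to start with "Chapter" would be marked too, so review the output.

```python
import re

# Assumption: a heading is a whole line that starts with "chapter" (any case),
# followed by spaces or tabs and at least one more character.
HEADING_LINE = re.compile(r"^(chapter[ \t]+\S.*)$", re.IGNORECASE | re.MULTILINE)


def mark_headings(text, level=1):
    """Prefix each heading line with Markdown '#' markers (level 1 gives '# ')."""
    marker = "#" * level
    return HEADING_LINE.sub(lambda m: f"{marker} {m.group(1)}", text)


# Hypothetical input in the shape of the TXT example above.
source = "CHAPTER ONE\n\nThere was a faint smell ...\n\nCHAPTER TWO\n\nMore text.\n"
marked = mark_headings(source)
print(marked)
```

Calling `mark_headings(source, level=2)` would produce '##' markers instead, to go with the "//h:h2" setting.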
Old 06-02-2010, 05:28 PM   #10
gandor62
Do any of the experts here know of a way to change text in an EPUB doc? I'm trying to use find and replace to set up chapters. It's fine when it's only 20 or so, but a couple hundred is painful manually.

I want to change <p>Chapter one</p> to ##Chapter one, and so on for one to two hundred.

I am unable to keep the numbers running.
Old 06-02-2010, 07:02 PM   #11
theducks
Grand Sorcerer
Posts: 15,100
Karma: 5939999
Join Date: Aug 2009
Location: (The original) Silicon Valley, USA
Device: Galaxy Tab 2, Astak Pocket Pro, K4NT
Quote:
Originally Posted by gandor62 View Post
Do any of the experts here know of a way to change text in an EPUB doc? I'm trying to use find and replace to set up chapters. It's fine when it's only 20 or so, but a couple hundred is painful manually.

I want to change <p>Chapter one</p> to ##Chapter one, and so on for one to two hundred.

I am unable to keep the numbers running.
Sigil
Search and Replace (you will need Eyeball Mk I to catch other uses of the word "Chapter " <- note the space)
Search: Chapter (with a trailing space; case sensitive, exact words)
Replace: ##Chapter (with a trailing space)
Use Replace or Find Next on each hit. It's 100 mouse clicks or so, but it should come out right.

Save.

Then convert again.
Old 06-02-2010, 10:31 PM   #12
jackie_w
Quote:
Originally Posted by gandor62 View Post
Do any of the experts here know of a way to change text in an EPUB doc? I'm trying to use find and replace to set up chapters. It's fine when it's only 20 or so, but a couple hundred is painful manually.

I want to change <p>Chapter one</p> to ##Chapter one, and so on for one to two hundred.
Hi gandor,
There may be some confusion here. Labelling Chapters as level 2 headings using Markdown, with ## Chapter nn is done in the TXT file before conversion to EPUB in Calibre. You will also need to check the [Convert] - [TXT Input] "Process using Markdown" box during the conversion.
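
For a file that is already XHTML inside an EPUB, the counterpart of the Markdown marker is the h2 tag from post #6, and a regular expression can handle the running numbers because it does not need to know them. A minimal sketch, assuming each heading is a bare <p> that holds nothing but "Chapter" and a number in words or digits; `promote_headings` is a hypothetical name, and the text would first have to be extracted from the EPUB (or the same pattern tried in an editor that supports regex search).

```python
import re

# Assumption: the whole paragraph is the heading, e.g. <p>Chapter one</p> or
# <p>Chapter twenty-three</p>; letters, digits, spaces and hyphens only.
HEADING_PARA = re.compile(r"<p>\s*(Chapter\s+[A-Za-z0-9 -]+?)\s*</p>")


def promote_headings(xhtml):
    """Turn heading-only paragraphs into <h2> elements, leaving other text alone."""
    return HEADING_PARA.sub(r"<h2>\1</h2>", xhtml)


before = (
    "<p>Chapter one</p>\n<p>It began.</p>\n"
    "<p>Chapter two hundred</p>\n<p>It ended.</p>"
)
after = promote_headings(before)
print(after)
```

Paragraphs containing punctuation, such as the body sentences in the sample, do not match the pattern, which keeps ordinary text that merely starts with "Chapter" and ends in a full stop from being promoted.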