Old 02-27-2013, 09:13 PM   #1
Pranananda
Connoisseur
Posts: 98
Karma: 122982
Join Date: Apr 2010
Location: Humboldt County, California
Device: ipad, iPod touch, JetBook Lite
mktoc.pl: create table of contents in HTML file

While there is software to create tables of contents for ebooks (for example, calibre creates a metadata table of contents, and Sigil and other converters support this as well), I mostly create my ebooks from HTML files and use calibre to create epubs or mobis. I have written a script that creates a table of contents where you want it: at the top level of the book, and for chapters and sections that have sub-chapters and sub-sections. It links the entries of each table of contents to the child sections, with a link back to the parent table of contents. Here is the documentation from the script's -m option:

mktoc.pl creates a doubly-linked table of contents from an HTML file. The input is STDIN, the output goes to STDOUT. The command line syntax is:
./mktoc.pl < input.html > output.html

Do NOT use the same file name for the input and output file.
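
The reason for this warning is that the shell opens and truncates the output file before the script reads anything, so reusing the name leaves the script with an empty input. If you do want to update a file in place, one option is to go through a temporary file. A minimal sketch follows; run_in_place is a made-up helper name, not something shipped with mktoc.pl, and it assumes mktemp is available on your system:

```shell
# run_in_place: apply a filter command to FILE without ever redirecting
# the command's output onto its own input.
# Usage: run_in_place FILE COMMAND [ARGS...]
# Hypothetical helper for illustration; not part of mktoc.pl.
run_in_place() {
    file=$1
    shift
    tmp=$(mktemp) || return 1
    if "$@" < "$file" > "$tmp"; then
        # Copy back with cat so the original file keeps its permissions.
        cat "$tmp" > "$file"
        rm -f "$tmp"
    else
        # The command failed: leave the original file untouched.
        rm -f "$tmp"
        return 1
    fi
}

# Example call (commented out so that nothing runs by accident):
# run_in_place book.html perl mktoc.pl
```

The original file is only overwritten if the command exits successfully.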

You will need to make the script executable before using it, or on Windows you will have to use the following command line syntax (after you download Perl from http://www.perl.org):
perl mktoc.pl < input.html > output.html

The entries in the table of contents come from the contents between a header tag and its end tag, i.e. <h1>...</h1>, <h2>...</h2>, <h3>...</h3>, and <h4>...</h4>.
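
To preview which entries would end up in a table of contents, you can list the headings yourself. The following is only a rough sketch of that kind of extraction, not the logic mktoc.pl actually uses; list_headings is a hypothetical name, and it assumes each heading opens and closes on a single line:

```shell
# list_headings: read HTML on standard input and print "LEVEL TEXT" for
# every <h1>..<h4> element that opens and closes on one line.
# Hypothetical helper for illustration; not part of mktoc.pl.
list_headings() {
    # \1 captures the heading level, \2 the text between the tags.
    sed -n 's/.*<[hH]\([1-4]\)[^>]*>\(.*\)<\/[hH][1-4]>.*/\1 \2/p'
}

# Example call (commented out so that nothing runs by accident):
# list_headings < input.html
```

Any markup inside a heading (links, emphasis) is passed through unchanged, and <h5> or <h6> headings are ignored.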

The input ought to be run through tidy before using this script, though this is not strictly necessary. This script won't work if there are multiple header tags per line, for example:
<h1>Hello</h1><h2>This</h2><h3>Is a Test</h3> <!-- INCORRECT -->
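
A quick way to spot such lines before running the script is to count the opening header tags on each line. This is a sketch under the assumption that header tags look like <h1> or <h1 ...>; find_crowded_headings is a hypothetical helper, not an option of mktoc.pl:

```shell
# find_crowded_headings: read HTML on standard input and print
# "LINENUMBER: LINE" for every line holding more than one opening
# <h1>..<h4> tag. Hypothetical helper; not part of mktoc.pl.
find_crowded_headings() {
    awk '{
        line = $0
        # gsub returns how many opening header tags it replaced on this line.
        if (gsub(/<[hH][1-4][ >]/, "", line) > 1)
            print NR ": " $0
    }'
}

# Example call (commented out so that nothing runs by accident):
# find_crowded_headings < input.html
```

If this prints anything, split those lines (or run the file through tidy) first.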

mktoc.pl will also not produce good results if header tags were used for purposes other than marking the beginning of a chapter or section. If header tags are used purely for formatting rather than document structure, the generated table of contents will be nonsensical.

mktoc.pl can be run multiple times on the same file. That is, if you do the following:
./mktoc.pl < input.html > output.html
./mktoc.pl < output.html > newoutput.html
mktoc.pl removes the traces of the previous run before applying its changes.

Also, traces of previous runs of mktoc.pl can be removed as follows:
./mktoc.pl -c < input.html > output.html

If a header has subheaders after it, those subheaders will be placed into a small table of contents after the header declaration in the output file.

You can specify where the top-level table of contents goes by including the following comment in your HTML input file:
<!-- toc -->
If you don't have this comment in your HTML input file, the top level table of contents will go right after the <body> tag.
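
To make the placement rule concrete, here is a small sketch that mimics it: put a prepared block after the <!-- toc --> line if there is one, otherwise after the line containing the <body> tag. It is not taken from mktoc.pl. The name insert_toc and the separate TOC file are inventions for this example, and keeping the marker comment in the output is an assumption:

```shell
# insert_toc: copy HTML from standard input to standard output, adding the
# contents of TOC_FILE after the first line containing "<!-- toc -->" or,
# if no such line exists, after the first line containing "<body".
# Usage: insert_toc TOC_FILE < input.html > output.html
# Hypothetical helper that imitates the placement rule; not mktoc.pl code.
insert_toc() {
    toc_file=$1
    html=$(mktemp) || return 1
    cat > "$html"                  # buffer the input so it can be read twice
    if grep -q '<!-- toc -->' "$html"; then
        marker='<!-- toc -->'
    else
        marker='<body'
    fi
    awk -v marker="$marker" -v toc="$toc_file" '
        { print }
        !inserted && index($0, marker) {
            # Emit the prepared table of contents once, right after the marker.
            while ((getline line < toc) > 0) print line
            inserted = 1
        }' "$html"
    rm -f "$html"
}
```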

Each table of contents placed in the output file will be surrounded by a <div class="toc">. Each line of the table of contents is a simple <p> tag. You may want to put an entry into your CSS for those elements, for example (and you can make your own style up here):
div.toc { margin: 1em 1.1em; }
div.toc p { margin: 0; text-indent: 0; }
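
For readers who want to see what such markup could look like, below is a rough sketch that builds a flat (single-level) doubly-linked table of contents for an HTML fragment. It only illustrates the idea of a <div class="toc"> with one <p> per entry and links in both directions. The id scheme (toc-N and sec-N) and the name flat_toc are made up here, and the real script's output will differ, in particular because it nests sub-tables under their parent headers:

```shell
# flat_toc: read an HTML fragment on standard input (one heading per line)
# and print it preceded by a flat table of contents. Each entry links to its
# heading, and each heading links back to its entry.
# The ids toc-N / sec-N are invented for this sketch; not mktoc.pl output.
flat_toc() {
    html=$(mktemp) || return 1
    cat > "$html"                  # buffer the input so it can be read twice

    # Pass 1: one <p> per heading, wrapped in <div class="toc">.
    echo '<div class="toc">'
    sed -n 's/.*<[hH][1-4][^>]*>\(.*\)<\/[hH][1-4]>.*/\1/p' "$html" |
        awk '{ printf "<p><a id=\"toc-%d\" href=\"#sec-%d\">%s</a></p>\n", NR, NR, $0 }'
    echo '</div>'

    # Pass 2: wrap each heading text in an anchor that points back to its entry.
    awk '{
        if (match($0, /<[hH][1-4][^>]*>/)) {
            head = substr($0, 1, RSTART + RLENGTH - 1)  # through the opening tag
            rest = substr($0, RSTART + RLENGTH)
            if (match(rest, /<\/[hH][1-4]>/)) {
                n++
                text = substr(rest, 1, RSTART - 1)
                tail = substr(rest, RSTART)
                $0 = head "<a id=\"sec-" n "\" href=\"#toc-" n "\">" text "</a>" tail
            }
        }
        print
    }' "$html"
    rm -f "$html"
}
```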

If you would like to see an example of a table of contents that was created with this script (well, an older version), see The Life Science Health System.
Attached Files
File Type: pl mktoc.pl (15.6 KB, 325 views)
Old 03-03-2013, 11:05 AM   #2
SBT
Fanatic
Posts: 580
Karma: 810184
Join Date: Sep 2010
Location: Norway
Device: prs-t1, tablet, Nook Simple, assorted kindles, iPad
Me too! me too!
A far less sophisticated toc-generator than Pranananda's, written as a bash shell script. Its single redeeming feature is its brevity. Good (?) to use as a starting point for more sophisticated tocs, at least.
Code:
#!/bin/bash
# Generate an HTML toc file
# Assumes one file - one chapter
# Assumes toc will be in same directory as chapter files
# Usage: toc <chapter files in correct order>
# The value of 'tag' is the class name of the <h2> or <h3> tag that contains the chapter heading, e.g. <h2 class="chapter">A NEW BEGINNING</h2>
# All other instances of <h2> and <h3> are ignored
tag=chapter
cat <<__EOF__
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN" "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<title>Contents</title>
</head>
  <body>
  <h2>Contents</h2>
  <ol>
__EOF__
# "$@" (quoted) keeps file names that contain spaces intact
for chapfile in "$@"
do
# Find the heading that carries class="$tag"; the dots match either quote style
chap=$(grep -o "class=.${tag}.>[^<]*</h[32]" "$chapfile")
chap=${chap#*>}     # strip everything up to and including the first '>'
chap=${chap%</h?}   # strip the trailing '</h2' or '</h3'
chap=$(echo $chap)  # unquoted on purpose: collapses runs of whitespace
echo "    <li>"
echo "      <a href=\"$(basename "$chapfile")\" >"
echo "        ${chap}"
echo "      </a>"
echo "    </li>"
done
echo "  </ol>"
echo "</body>"
echo "</html>"
Old 03-03-2013, 11:14 AM   #3
SBT
Apropos toc's...

... a one-liner I find useful if I mess around with the NCX file. It renumbers the items in it if you've added/removed/moved items around:
Code:
awk 'BEGIN{n=1}{sub(/playOrder="[0-9]+"/,"playOrder=\""(n)"\"",$0) && n++ ;print}' toc.ncx
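
Note that the one-liner prints the renumbered file to standard output and leaves toc.ncx itself unchanged. If you want to update the file directly, a wrapper along these lines should work. It is a sketch only: renumber_ncx is a hypothetical name, the temporary-file step is my addition, and like the one-liner it renumbers at most one playOrder attribute per line:

```shell
# renumber_ncx: rewrite the playOrder values of NCX_FILE in place so that
# they run 1, 2, 3, ... in document order.
# Usage: renumber_ncx toc.ncx
# Hypothetical wrapper around the awk one-liner; assumes mktemp is available.
renumber_ncx() {
    ncx=$1
    tmp=$(mktemp) || return 1
    # Write to a temporary file first, then copy back only if awk succeeded.
    awk 'BEGIN{n=1}{sub(/playOrder="[0-9]+"/,"playOrder=\""(n)"\"",$0) && n++ ;print}' "$ncx" > "$tmp" &&
        cat "$tmp" > "$ncx"
    status=$?
    rm -f "$tmp"
    return $status
}
```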
Old 03-04-2013, 06:00 AM   #4
PageLab
Connoisseur
Posts: 70
Karma: 515184
Join Date: Sep 2011
Location: Brasília
Device: Kindle3, iPad, Nook, Kobo, Positivo Alfa
Very useful scripts, thanks! @Pranananda: Is it possible to consider only the <h1> tag and ignore other heading levels? Would be nice to choose just specific heading levels in the output.
Old 03-04-2013, 11:57 PM   #5
Pranananda
@SBT, thanks for contributing your script. I've done some Bourne shell scripting, but I have yet to do bash scripting. Looking at your code, it seems like bash has some nice features.

@PageLab, thanks for the kind note. I am considering how to implement your request.