Old 10-30-2008, 11:25 PM   #1
hapax legomenon
Erotica Writer
Posts: 102
Karma: 106
Join Date: Jul 2007
Location: Tulsa, OK
Device: ipad, Sony Reader PRS 505, Cybook 3
best way to generate TOCs: 4 scenarios?

I am still not getting this ebook format conversion thing.

I know how to use Mobicreator, am less familiar with Calibre, and am only vaguely familiar with Book Creator.

Basically I need to take HTML files and make TOCs for them. Some sort of GUI tool would be nice, or at least some tool that doesn't involve a lot of typing on the command line.

Here are some scenarios:

1) I have written static HTML chapters, and I merely need to create a main TOC, preferably with more than one level and with illustration. These are my own writings, so my goal is clean code and the ability to maintain it over time.

2) I have used a tool (such as DownThemAll) to recursively download some HTML pages, such as pages from a website. The HTML is messy and has a lot of random code, but the links basically work. I need to create an index page.

3) I have chapters which are actually a bunch of text files generated by random people. It would be nice to generate a TOC.

4) I have a single HTML file with internal links/TOC, but lots of graphics in a separate directory.

What is the best method for generating TOCs for each scenario?

On another note, is there a utility that lets you strip out certain kinds of code (JavaScript, etc.) from HTML files?

Last edited by hapax legomenon; 10-31-2008 at 01:59 AM.
Old 11-02-2008, 02:08 PM   #2
AZed
Connoisseur
Posts: 57
Karma: 307
Join Date: Oct 2008
Device: PalmOS PDA
I'm considering writing a TOC generator for EBook::Tools v0.3, and the ability to strip <script> and <noscript> blocks already exists in v0.2. Can you give me a few more details about exactly what you would want the TOC to link to for each case?

I'm going to make a few guesses about how I'd approach things, and you can tell me if it's what you're looking for.

1) Add IDs to every <h1> and <h2> element and link to those in the TOC? I have no idea what you mean by 'with illustration'.

2) I'm not sure what you would want the index page to point to. You want a way to extract all of the links from inside the messy HTML, and present a simple list of links on a clean page?

3) The TOC file would be a one-level list of links to each file?

4) Again, not sure what you want your TOC to look like, or why the "internal toc" isn't working for you. How did you want the graphics to show up in the TOC, and why did you want graphics to show up in a TOC? I don't think I've ever seen a TOC done that way.
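
For guess 3 above (a one-level list of links to each file), a minimal Python sketch could look like the following. It is not part of EBook::Tools; the function name `toc_page` and the choice to label each link with its file name are assumptions made purely for illustration.

```python
import html
from pathlib import PurePosixPath
from urllib.parse import quote


def toc_page(filenames, heading="Contents"):
    """Return an HTML page holding a one-level list of links, one per file.

    Hypothetical helper: link labels are derived from the file names.
    """
    items = []
    for name in sorted(filenames):
        # "chapter_01.html" -> "chapter 01"
        label = PurePosixPath(name).stem.replace("_", " ")
        # quote() percent-encodes spaces and other unsafe URL characters.
        items.append(f'<li><a href="{quote(name)}">{html.escape(label)}</a></li>')
    return (
        f"<html><head><title>{html.escape(heading)}</title></head><body>\n"
        f"<h1>{html.escape(heading)}</h1>\n<ul>\n"
        + "\n".join(items)
        + "\n</ul>\n</body></html>\n"
    )
```

For a folder of text chapters, something like `toc_page(p.name for p in Path("book").glob("*.txt"))` would produce the page, which can then be saved as the index file. Sorting by file name is only a guess at reading order; numbered file names make it reliable.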
Old 11-02-2008, 08:18 PM   #3
hapax legomenon
The main issue is: what do you do with "found HTML" over which you have no control?

I'm talking about static HTML sites where a clean TOC may not exist.

Are there any ways to autogenerate this kind of TOC? I come across hundreds of static HTML sites which I'd like to convert into a portable ebook format. In most cases I just cut and paste the text, but then I lose out on a TOC.

The workflows described here assume the formatter has some control over what kind of HTML pages he has to deal with.

I guess HTML Tidy can clean up the code, and maybe you could use XSLT to remove the JavaScript and then autogenerate a TOC based on the title or H1 tag.
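
As a rough illustration of that idea, here is a sketch that swaps HTML Tidy and XSLT for Python's standard `html.parser`: it drops <script> and <noscript> blocks and remembers the first <title> or <h1> text as a candidate TOC label. The class name `Cleaner` and the function `clean` are made up for this example, and messy real-world pages may need more handling than this.

```python
from html import unescape
from html.parser import HTMLParser


class Cleaner(HTMLParser):
    """Drop <script>/<noscript> blocks; remember the first <title> or <h1>."""

    SKIP = {"script", "noscript"}
    LABEL = {"title", "h1"}

    def __init__(self):
        super().__init__(convert_charrefs=False)
        self.out = []            # pieces of the cleaned document
        self.skipping = 0        # > 0 while inside a skipped block
        self.label = None        # text of the first <title> or <h1>
        self._label_tag = None   # tag whose text is being captured
        self._label_parts = []

    def _emit(self, raw, text=""):
        if self.skipping:
            return
        self.out.append(raw)
        if self._label_tag:
            self._label_parts.append(text)

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self.skipping += 1
            return
        self._emit(self.get_starttag_text())
        if (tag in self.LABEL and self.label is None
                and not self._label_tag and not self.skipping):
            self._label_tag = tag

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags such as <br/> are copied through unchanged.
        if tag not in self.SKIP:
            self._emit(self.get_starttag_text())

    def handle_endtag(self, tag):
        if tag in self.SKIP:
            self.skipping = max(0, self.skipping - 1)
            return
        if tag == self._label_tag and not self.skipping:
            # Collapse runs of whitespace in the captured label.
            self.label = " ".join("".join(self._label_parts).split())
            self._label_tag = None
        self._emit(f"</{tag}>")

    def handle_data(self, data):
        self._emit(data, data)

    def handle_entityref(self, name):
        self._emit(f"&{name};", unescape(f"&{name};"))

    def handle_charref(self, name):
        self._emit(f"&#{name};", unescape(f"&#{name};"))

    def handle_comment(self, data):
        self._emit(f"<!--{data}-->")

    def handle_decl(self, decl):
        self._emit(f"<!{decl}>")


def clean(html_text):
    """Return (cleaned_html, label) for one page."""
    parser = Cleaner()
    parser.feed(html_text)
    parser.close()
    return "".join(parser.out), parser.label
```

Running `clean()` over each downloaded page gives a script-free copy plus a label, and the labels can then be listed on an index page.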

Are there any automated tools for doing this?

For one thing, I have a Sony PRS-505, and I don't know how to make an ebook out of a dozen HTML files (using Calibre, for instance).
Old 11-02-2008, 08:25 PM   #4
kovidgoyal
creator of calibre
Posts: 43,843
Karma: 22666666
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
html2epub --help

You can autogenerate a multilevel TOC by selecting elements with the full expressivity of XPath.
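
To illustrate the general idea only (this is not html2epub's implementation or its option syntax), here is a sketch in Python that takes one XPath expression per TOC level and returns the matching headings in document order. It uses the standard library's `xml.etree.ElementTree`, which supports only a limited XPath subset and needs well-formed XHTML; the function name `build_toc` is an assumption for this example.

```python
import xml.etree.ElementTree as ET


def build_toc(xhtml, level_exprs):
    """Return [(level, title), ...] in document order.

    level_exprs holds one XPath expression per TOC level, for example
    [".//h1", ".//h2"]. Only ElementTree's XPath subset is supported.
    """
    root = ET.fromstring(xhtml)
    # Map each matched element to its (1-based) TOC level.
    levels = {}
    for depth, expr in enumerate(level_exprs, start=1):
        for el in root.findall(expr):
            levels.setdefault(el, depth)
    # Walk the tree in document order so entries keep their reading order.
    toc = []
    for el in root.iter():
        if el in levels:
            title = "".join(el.itertext()).strip()
            toc.append((levels[el], title))
    return toc


SAMPLE = """<html><body>
<h1>Part One</h1>
<h2>Chapter 1</h2>
<h2>Chapter 2</h2>
<h1>Part Two</h1>
<h2 class="chapter">Chapter 3</h2>
</body></html>"""

for level, title in build_toc(SAMPLE, [".//h1", ".//h2"]):
    print("  " * (level - 1) + title)
```

Because each level is just an expression, a book that marks chapters with a class instead of a heading tag could pass something like `".//*[@class='chapter']"` for that level.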
Old 11-02-2008, 08:47 PM   #5
RWood
Technogeezer
Posts: 7,233
Karma: 1601464
Join Date: Nov 2006
Location: Virginia, USA
Device: Sony PRS-500
With BookDesigner it can be a hit-or-miss thing. Sometimes it works fine, and other times it will (on its own) decide that certain text lines are TOC elements. In most cases it is easy enough to correct these problems using 'Element Browser' under the Tools menu. There is also a facility in BookDesigner that I use for generating the TOC: under the Insert menu, 'Insert TOC (all)' places a TOC at the start of the file, just after the title and author. (It can be moved.)
Old 11-02-2008, 09:42 PM   #6
=X=
Wizard
Posts: 3,671
Karma: 12205348
Join Date: Mar 2008
Device: Galaxy S, Nook w/CM7
It sounds like you just want to read static HTML pages on your Reader. Have you looked into BookIt? It is a Firefox plug-in that turns a web page into an LRF eBook.

BookCreator is not really designed for creating ebooks from different sources. Neither is BookDesigner.

If you are trying to create one book built from multiple web sites, I think the best tool is MobiCreate. There you can load many different files into one OPF project. MobiCreate then turns them into HTML.

Once you have your eBook you can use Calibre to create the LRF book.

=X=
Old 11-03-2008, 06:21 PM   #7
hapax legomenon
Wow, thanks for your comments! I had not heard of BookIt at all.