09-09-2007, 07:40 PM   #1
phrodod
Enthusiast
Posts: 43
Karma: 28
Join Date: Aug 2007
Device: Sony Reader PRS-500
New conversion method: txt->rst->html->lrf

Hi all;

I've just gone through my first e-book creation experiment, and was looking for an easy way to convert the Project Gutenberg (PG) txt files to reader format. reStructuredText (rst) is a simple markup format designed to be both readable as plain text and automatically convertible to other formats. It's the format used by Python's Docutils package. One program included with that package, rst2html, can be used to convert lightly modified PG text files into HTML. I tried it out with Anna Karenina (by Tolstoy). Any feedback on the process is welcome, but I am happy with the result so far. (Of course, I'm only about 50 pages in on the reader...) If you'd like to view the results, check the reader downloads page.
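To give an idea of what "lightly modified" might mean, here is a minimal sketch in Python that turns chapter headings into rst section titles by underlining them. The heading pattern ("Chapter" followed by a number, alone on a line) and the function name mark_chapters are assumptions for illustration; check what your particular PG file actually uses before relying on it.

```python
import re

# Assumed heading pattern: "Chapter 12" alone on a line. Adjust per book.
CHAPTER_RE = re.compile(r"^Chapter \d+\s*$")


def mark_chapters(text, underline="-"):
    """Return text with each chapter heading underlined as an rst section title."""
    out = []
    for line in text.splitlines():
        if CHAPTER_RE.match(line):
            title = line.strip()
            # rst wants the underline at least as long as the title,
            # and blank lines around the heading keep it unambiguous.
            out.extend(["", title, underline * len(title), ""])
        else:
            out.append(line)
    return "\n".join(out) + "\n"
```

Running mark_chapters over the text of one file and saving the result with a .rst extension should give rst2html section headings it can pick up for a TOC.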

I discovered that the process is actually pretty easy, but with a book as large as this one, a single Table of Contents (TOC) is difficult to navigate (it runs to many pages). So I went for a compromise. I split the original text file into a separate file for each part, and had rst2html automatically generate a TOC for each part.
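A rough sketch of the splitting step in Python, assuming each part starts with a line like "PART ONE" (that marker, and the names split_parts and with_toc, are my own assumptions, not something every PG file follows). The ".. contents::" directive is the standard Docutils way to ask for an automatically generated TOC.

```python
import re

# Assumed part marker: "PART ONE", "PART TWO", ... alone on a line.
PART_RE = re.compile(r"^PART [A-Z]+[ \t]*$", re.MULTILINE)


def split_parts(text):
    """Split the book into one string per part.

    Anything before the first part marker (PG header, front matter)
    is dropped here; handle it separately if you want to keep it.
    """
    starts = [m.start() for m in PART_RE.finditer(text)]
    if not starts:
        return [text]
    bounds = starts + [len(text)]
    return [text[a:b] for a, b in zip(bounds, bounds[1:])]


def with_toc(part_text):
    """Make the part heading the document title and request a TOC below it."""
    title, _, body = part_text.partition("\n")
    title = title.strip()
    return (
        f"{title}\n{'=' * len(title)}\n\n"
        ".. contents::\n\n"
        f"{body.lstrip()}"
    )
```

Each string returned by with_toc can then be written to its own file (say part1.rst, part2.rst, and so on) before running rst2html on it.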

I then created a page of links to the other pages, ran the whole collection through rst2html to generate HTML pages, then used html2lrf to convert the result to an e-book. I believe the results are quite nice.
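As a sketch of that last step, the Python below builds an rst index page linking to each part and lists the commands to run. The file names and the helper names build_index and conversion_commands are hypothetical. I'm also assuming that rst2html is invoked as "rst2html input output" (some installs name the script rst2html.py) and that html2lrf, given the index page, follows its links to the part pages; check both against your own install.

```python
def build_index(title, part_files):
    """Return rst source for a page that links to each part's HTML file."""
    lines = [title, "=" * len(title), ""]
    for n, rst_name in enumerate(part_files, start=1):
        html_name = rst_name.rsplit(".", 1)[0] + ".html"
        # rst external hyperlink syntax: `link text <target>`_
        lines.append(f"* `Part {n} <{html_name}>`_")
    return "\n".join(lines) + "\n"


def conversion_commands(rst_files, index_html="index.html"):
    """Return the command lines to run, in order, as argument lists."""
    cmds = [["rst2html", f, f.rsplit(".", 1)[0] + ".html"] for f in rst_files]
    # Assumption: html2lrf starts from the index page and follows its links.
    cmds.append(["html2lrf", index_html])
    return cmds
```

For example, save build_index("Anna Karenina", ["part1.rst", "part2.rst"]) as index.rst, then run each list from conversion_commands(["index.rst", "part1.rst", "part2.rst"]) at the command line (or pass it to subprocess.run).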

The keys to this are: comfort with a good text editor (I use emacs), a full Python install (I installed Cygwin on my PC, and use the Python that came with it), docutils (search Google for the installer, then add it to your Python distribution), and comfort using the command line. I do all my conversion work at the command line.
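If you want to check whether the pieces are in place before starting, something like this Python sketch should do. The function name check_tools is made up, and the script names it looks for (rst2html or rst2html.py, and html2lrf) are assumptions about how your install names them.

```python
import importlib.util
import shutil


def check_tools():
    """Report which of the needed pieces can be found on this machine."""
    return {
        # Is the docutils package importable from this Python?
        "docutils": importlib.util.find_spec("docutils") is not None,
        # Is an rst2html script on the PATH (either common name)?
        "rst2html": any(
            shutil.which(name) is not None
            for name in ("rst2html", "rst2html.py")
        ),
        # Is html2lrf on the PATH?
        "html2lrf": shutil.which("html2lrf") is not None,
    }


if __name__ == "__main__":
    for tool, found in check_tools().items():
        print(f"{tool}: {'found' if found else 'missing'}")
```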

I'll post detailed instructions after I've done a couple of additional books.

Phrodod