Quote:
Originally Posted by Manichean
First of all, thanks for the conversion. I had a (relatively quick) look at it, and this is really a nicely done book.
I do have a few technical questions, though. I have two books with (I assume) similar source material: they are reference books consisting of many html pages with images and some active content (lookup etc.). I know that I'll have to remove the active content, but beyond that I'm really pretty clueless as to what tools I should use for the conversion. I'd like to get a TOC like yours, where you first select the character and then get a list of topics starting with that character, but I don't know how to do that (apart from manually writing the html page, but that would be a major pain in the ass).
So, my questions are:
- What did you use to parse the html files? I'm assuming some scripting language?
- What program did you use to build the Mobi-file from the multiple html files?
Thanks in advance for your answers.
For the base conversion I wrote scripts for JFlex, which then generated Java code. Each script was basically a list of regular expressions, each paired with some Java code to execute when that regular expression matches in the source html file. If you know Java and regular expressions, you can use JFlex (or C and flex, for that matter).
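To give a rough idea of the "regular expression plus action" approach, here is a minimal sketch in Python rather than JFlex/Java. The three rules and the names RULES and convert are hypothetical examples, not the rules from the actual scripts. It is also simplified: at each position the first rule that matches fires, whereas flex-style scanners pick the longest match.

```python
import re

# Hypothetical rule table: (pattern, action). Each action receives the match
# object and returns the text that should replace the matched span.
RULES = [
    # Drop active content entirely.
    (re.compile(r"<script\b.*?</script\s*>", re.IGNORECASE | re.DOTALL),
     lambda m: ""),
    (re.compile(r"<form\b.*?</form\s*>", re.IGNORECASE | re.DOTALL),
     lambda m: ""),
    # Demote page titles to second-level headings, discarding attributes.
    (re.compile(r"<h1\b[^>]*>(.*?)</h1\s*>", re.IGNORECASE | re.DOTALL),
     lambda m: "<h2>" + m.group(1).strip() + "</h2>"),
]


def convert(html):
    """Scan left to right; where a rule matches, emit its action's output."""
    out = []
    pos = 0
    while pos < len(html):
        for pattern, action in RULES:
            m = pattern.match(html, pos)
            if m and m.end() > pos:  # ignore empty matches so we always advance
                out.append(action(m))
                pos = m.end()
                break
        else:
            # No rule matched here: copy one character through unchanged.
            out.append(html[pos])
            pos += 1
    return "".join(out)
```

Calling convert on the text of each source page and writing the result back out would be the batch step; anything the rules don't cover passes through untouched, which leaves the odd cases for manual cleanup afterwards.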
For the finishing touches I used TextPad. Its search and replace supports regular expressions, and it can work on several hundred open files at once.
The TOCs didn't quite have to be done by hand. One of the appendices already had one. After changing it to a form I prefer, I copied it to the other files. The anchor tags did have to be put in by hand, though.
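If your source has no ready-made TOC, a character-first index like the one asked about could in principle be generated instead of typed. The following is only a sketch of that idea, not how the TOCs in this book were made. It assumes you already have a list of (title, href) pairs with the anchors in place; build_toc and the "toc-" anchor names are made up for the example, and anything not starting with a letter is filed under "0-9".

```python
from collections import defaultdict
from html import escape


def build_toc(topics):
    """topics: iterable of (title, href) pairs. Returns an HTML fragment."""
    groups = defaultdict(list)
    for title, href in topics:
        first = title.strip().upper()[:1]
        key = first if first.isalpha() else "0-9"
        groups[key].append((title, href))

    keys = sorted(groups)
    parts = []

    # Top-level index: one link per starting character.
    index = " ".join('<a href="#toc-{0}">{0}</a>'.format(k) for k in keys)
    parts.append('<p><a name="toc"></a>{}</p>'.format(index))

    # One section per character, topics sorted case-insensitively.
    for k in keys:
        parts.append('<h3><a name="toc-{0}"></a>{0}</h3>'.format(k))
        parts.append("<ul>")
        for title, href in sorted(groups[k], key=lambda t: t[0].lower()):
            parts.append('<li><a href="{}">{}</a></li>'.format(
                escape(href, quote=True), escape(title)))
        parts.append("</ul>")
    return "\n".join(parts)
```

The output is a fragment to paste into a TOC page. Getting the (title, href) list is the part that depends on the source files, and the anchors themselves still have to exist in the target pages.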
I then used Mobipocket Creator to make the ebook. The user interface leaves something to be desired, but given that it saves you the effort of manually creating the OPF file, it's not bad.
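For a sense of what that OPF file contains: it is an XML package file that lists the book's metadata, the files in the book (manifest) and their reading order (spine). The sketch below writes a minimal OPF 2.0-style package by hand. The exact layout and media types that Mobipocket Creator writes may differ, and build_opf, its parameters and the "BookId" identifier are illustrative assumptions only. It leaves out the NCX and cover entries that a complete package would normally have.

```python
from xml.sax.saxutils import escape, quoteattr


def build_opf(title, language, html_files, toc_file, book_id):
    """Return a minimal OPF 2.0-style package document as a string."""
    manifest = []
    spine = []
    for i, name in enumerate(html_files, start=1):
        item_id = "item%d" % i
        # Assumption: the pages are (X)HTML; adjust media-type otherwise.
        manifest.append(
            '    <item id=%s href=%s media-type="application/xhtml+xml"/>'
            % (quoteattr(item_id), quoteattr(name)))
        # The spine order is the reading order.
        spine.append('    <itemref idref=%s/>' % quoteattr(item_id))

    lines = [
        '<?xml version="1.0" encoding="utf-8"?>',
        '<package xmlns="http://www.idpf.org/2007/opf" version="2.0"'
        ' unique-identifier="BookId">',
        '  <metadata xmlns:dc="http://purl.org/dc/elements/1.1/">',
        '    <dc:title>%s</dc:title>' % escape(title),
        '    <dc:language>%s</dc:language>' % escape(language),
        '    <dc:identifier id="BookId">%s</dc:identifier>' % escape(book_id),
        '  </metadata>',
        '  <manifest>',
    ] + manifest + [
        '  </manifest>',
        '  <spine>',
    ] + spine + [
        '  </spine>',
        '  <guide>',
        '    <reference type="toc" title="Table of Contents" href=%s/>'
        % quoteattr(toc_file),
        '  </guide>',
        '</package>',
    ]
    return "\n".join(lines) + "\n"
```

With several hundred html files, the manifest and spine are exactly the parts that are tedious to maintain by hand, which is the effort a GUI tool saves.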