10-20-2010, 06:06 PM   #4
alexdc
Member
Join Date: Aug 2010
Device: nook
I actually ended up figuring this out just by playing around a bit. The steps below give you a well-formatted, fully indexed ebook with very little effort:

Download the website using wget (I had already done this, but I'm including it in case someone else is interested).

Locate the contents of the book within the download. They should be contained in a sub-folder.

Delete all cascading style sheets (.css files) and any text files within the folder. This is where my problems came from. There was a .txt file in the download, so when I tried to convert the HTML I kept getting an excerpt from "Alice in Wonderland", because Calibre was converting the .txt instead of the HTML. Once that was fixed, Calibre tried to preserve the original formatting of the site rather than wrapping the text and images, and that was caused by the style sheets.

Rename the folder with the book contents to "HTML".

Zip the folder, then rename the archive to "Whatever you want the book to be called".zip.

Use Calibre to convert the zip to the desired format.

Perfect output!
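
If anyone wants to script the cleanup, rename and zip steps rather than doing them by hand, here is a rough Python sketch. I did all of this manually, so treat it as an untested-on-my-book illustration: the function name `prepare_html_zip` and its arguments are made up for this example, and it assumes there is no other folder called "HTML" already sitting next to the book folder. The wget download and the Calibre conversion are left out.

```python
import shutil
from pathlib import Path


def prepare_html_zip(book_dir, title):
    """Strip .css/.txt files, rename the folder to "HTML", and zip it.

    book_dir: folder holding the downloaded book contents (hypothetical name).
    title:    what you want the archive (and so the book) to be called.
    Returns the path of the created .zip file.
    """
    book_dir = Path(book_dir).resolve()

    # Delete style sheets and text files anywhere inside the book folder,
    # so Calibre only sees the HTML and does not copy the site's layout.
    for path in list(book_dir.rglob("*")):
        if path.is_file() and path.suffix.lower() in {".css", ".txt"}:
            path.unlink()

    # Rename the folder to "HTML" (assumes no sibling folder has that name).
    html_dir = book_dir.with_name("HTML")
    book_dir.rename(html_dir)

    # Zip the folder so that "HTML" is the top-level entry in the archive.
    archive = shutil.make_archive(
        str(html_dir.parent / title),
        "zip",
        root_dir=html_dir.parent,
        base_dir="HTML",
    )
    return Path(archive)
```

Calling something like `prepare_html_zip("path/to/book-folder", "My Book")` should leave a `My Book.zip` next to the renamed folder, which you can then add to Calibre and convert as in the last step.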