There has been some discussion about Gutenberg books not having any markup, but for quite a while now most new books have come with HTML versions that carry the markup from the original books. Many older text versions are being updated with HTML versions too. For the old text-only ones that have no HTML, I really recommend GutenMark, as it does a good job of putting back the markup and restoring non-ASCII characters such as umlauts.
I've attached an ebook for the Sony Reader, converted from a Gutenberg HTML file using the Perl script above and then put through Librie Toolbar to create a BBeB file (LRF). Nothing was edited in the files to produce this ebook. I haven't managed to figure out how to create page breaks with the toolbar, though.