03-16-2010, 02:51 AM   #1
ficbot
Wizard
Posts: 2,409
Karma: 4132096
Join Date: Sep 2008
Device: Kindle Paperwhite/iOS Kindle App
Need help converting file which is too long to be HTML

I have a purchased ebook that I am trying to convert from very messy HTML. The 'liberation script' I ran on it produced very messy output; it looked fine on my old Sony, but it looked terrible when I converted it with Mobipocket for my Kindle. My usual trick with this type of book is to open it in a web browser and copy and paste the text from there into Kompozer (my web editor) to get clean HTML. However, this ebook (The Mists of Avalon) is VERY long, and I seem to be running into some sort of size limit. When I tried copying and pasting, it would not grab the whole book at once, which was fine. But when I went back to the spot where it left off and copied from there, it would not let me paste the rest into the HTML file. Is there some sort of file size limit for HTML that I am unaware of?

I am wondering what my other options are for this book. I bought it a while ago, when there was no EPUB. I do not plan to deal with this book's original format in the future, but it is the last book in my collection in that format that still has to be converted, and I would rather not have to buy it again. In the past I experimented with RTF, but the text appeared tiny when converted to LRF. Will I have the same issue with Mobi? Or is there some other trick I can use to remove the extraneous garbage this HTML file seems to have?

I am on a Mac. Software I am comfortable with and/or own includes OpenOffice, Kompozer, Calibre, Firefox/Safari, and Pages.
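To make the kind of cleanup I am after concrete, here is a rough sketch of a script-based alternative to the browser copy/paste trick. It is only an illustration, not a tested fix for this particular book: it assumes Python 3 is available, and the names `Cleaner`, `clean_html`, `KEEP`, and `SKIP_CONTENT` are made up for the example. It keeps a small whitelist of structural tags, throws away every attribute and every other tag (while keeping the text inside them), and drops script/style/head content entirely.

```python
from html import escape
from html.parser import HTMLParser

# Hypothetical whitelist: tags worth keeping in a novel. Adjust to taste.
KEEP = {"p", "h1", "h2", "h3", "h4", "h5", "h6",
        "i", "em", "b", "strong", "blockquote", "br"}
# Tags whose entire content is discarded, not just the tag itself.
SKIP_CONTENT = {"script", "style", "head"}


class Cleaner(HTMLParser):
    """Re-emit only whitelisted tags, without attributes, plus the text."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self.skip_depth = 0  # > 0 while inside a SKIP_CONTENT element

    def handle_starttag(self, tag, attrs):
        if tag in SKIP_CONTENT:
            self.skip_depth += 1
        elif self.skip_depth == 0 and tag in KEEP:
            self.out.append(f"<{tag}>")  # attributes are dropped on purpose

    def handle_endtag(self, tag):
        if tag in SKIP_CONTENT:
            self.skip_depth = max(0, self.skip_depth - 1)
        elif self.skip_depth == 0 and tag in KEEP and tag != "br":
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if self.skip_depth == 0:
            # Entities were decoded by the parser, so re-escape &, <, >.
            self.out.append(escape(data, quote=False))


def clean_html(raw):
    """Return a stripped-down version of the HTML string `raw`."""
    cleaner = Cleaner()
    cleaner.feed(raw)
    cleaner.close()  # flush any text still buffered by the parser
    return "".join(cleaner.out)


if __name__ == "__main__":
    messy = ('<html><head><style>p{color:red}</style></head><body>'
             '<p class="MsoNormal"><span style="font-size:8pt">'
             'Hello &amp; <i>welcome</i></span></p></body></html>')
    print(clean_html(messy))
```

For a real book the string would come from reading the HTML file and the result would be written to a new file, which avoids the clipboard altogether. The whitelist is a guess at what a novel needs; anything not listed (span, font, div, and so on) is removed but its text is kept.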