View Full Version : HTML


Morrile
12-10-2010, 08:26 AM
Hello,

I am really puzzled while learning about one of the various eBook formats, namely HTML. :blink:

I was assuming that, like most other formats, it would be a single file. Sadly it's not; or am I wrong about all this?

I know there are applications for converting HTML to EXE, but in what form do devices that can view HTML expect it? Is it an EXE file, or a zipped file with subfolders for each chapter and the images?

Can someone please clarify this for me before I go insane :help:

Morrile

frabjous
12-10-2010, 11:32 AM
Try to give us some background about why you're trying to create an HTML ebook rather than using a standard ebook format like ePub or mobi.

If the document is nothing but text, then you can put it all in a single HTML file. If you need images, there are ways of embedding them directly in HTML (http://rifers.org/blogs/gbevin/2005/4/11/embedding_images_inside_html), but I have no idea whether very many devices could handle that. It would probably work just to copy the several files into the right places on the device, but that would be a very inconvenient way of distributing an ebook. (And I'm not sure many library software programs could handle it. Calibre probably could if you zipped all the HTML files together with their images.)
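For anyone curious what embedding an image directly in HTML looks like in practice, it usually means a base64 "data:" URI in the src attribute. Here is a minimal Python sketch, not a tested recipe for any particular reader. The function names are made up for illustration, and the placeholder bytes stand in for a real image file.

```python
import base64
from html import escape


def image_to_data_uri(image_bytes: bytes, mime_type: str = "image/png") -> str:
    """Return a data: URI that carries the image bytes inline as base64."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


def img_tag(image_bytes: bytes, alt: str, mime_type: str = "image/png") -> str:
    """Build an <img> element whose src is the embedded image (hypothetical helper)."""
    uri = image_to_data_uri(image_bytes, mime_type)
    return f'<img alt="{escape(alt, quote=True)}" src="{uri}" />'


# Placeholder bytes; a real book would use open("cover.png", "rb").read().
placeholder_image = b"\x89PNG\r\n\x1a\n"
print(img_tag(placeholder_image, "cover"))
```

Base64 makes the data roughly a third larger than the original image, and whether a given device renders such images is exactly the open question above.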

This is precisely why formats such as ePub, etc., were developed. An ePub file is basically a zip file with (X)HTML files zipped together with images, other necessary files and a metadata record. By zipping many files into one, there's no need to distribute multiple files. You can in fact rename an *.epub file to *.zip and open it with unzipping software.
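The "ePub is basically a zip file" point can be checked with Python's standard zipfile module. The sketch below builds a toy archive in memory and lists its entries. It is not a valid ePub (it lacks the container and package files a real one needs), and the file names are only illustrative.

```python
import io
import zipfile


def build_toy_epub() -> bytes:
    """Create a small ePub-like zip in memory (illustrative, not spec-complete)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        # Real ePubs store an uncompressed "mimetype" entry first.
        archive.writestr(
            "mimetype", "application/epub+zip", compress_type=zipfile.ZIP_STORED
        )
        archive.writestr(
            "OEBPS/chapter1.xhtml", "<html><body><p>One</p></body></html>"
        )
        archive.writestr("OEBPS/images/cover.jpg", b"not a real jpeg")
    return buffer.getvalue()


def list_entries(epub_bytes: bytes) -> list:
    """Return the names of all files stored inside the archive."""
    with zipfile.ZipFile(io.BytesIO(epub_bytes)) as archive:
        return archive.namelist()


for entry_name in list_entries(build_toy_epub()):
    print(entry_name)
```

Pointing zipfile.ZipFile at a real .epub path instead of the in-memory bytes should list its contents the same way.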

I imagine that most devices that allow you to view HTML files on them offer that capability mainly because it uses the same software to view other ebooks that contain HTML parts zipped inside of them, and it might as well give users the ability to read HTML files directly. But I don't think anyone ever wanted to encourage anyone to actually distribute ebooks in HTML format alone.

So again, I think it would be helpful to know why you're trying to do this.

Don't put anything inside an EXE file. I can't imagine what the utility of that would be. Most of these devices use Linux or a similar operating system, and they cannot execute Windows binaries. (I don't really know what you mean about converting HTML to EXE anyway.)

Morrile
12-10-2010, 12:22 PM
Hello frabjous,

I have some books which are no longer in publication and I wish to create eBooks from them. I have gained permission, so no problems there. I plan to create ePub & Mobi, but I thought HTML was just as popular. Perhaps I am wrong on this!? I got my information from Wikipedia.

I have them all in Word with a single index created by bookmarks and hyperlinks. Only a couple of the books have images, the majority are plain text.

If you have no idea about converting HTML to EXE, it's obviously something to avoid.

Morrile

Jellby
12-10-2010, 01:01 PM
HTML is not a proper ebook format, because it's not self-contained and distributed as a single file (not in its general form, at least). But other formats like Mobi and ePUB are based on HTML and can be easily created from well-formed HTML source.

Morrile
12-10-2010, 05:29 PM
Well that settles it then. :thanks:

PDF, ePub and Mobi, as that should cover the majority of eBook readers.

Thanks folks :2thumbsup

Morrile

BillSmithBooks
12-12-2010, 03:31 PM
HTML is not a proper ebook format, because it's not self-contained and distributed as a single file (not in its general form, at least).

I disagree.

HTML is a perfectly valid ebook format...and it is universal, unlike epub and mobi, which still require additional software to read on many devices. (What percentage of devices ship with a built-in epub or mobi reader?)

After all, HTML (and zipped folders with html files and images) are the universal standard of the entire web -- everyone can read a basic HTML file without any additional software beyond whatever their device comes with.

Since many (most?) books don't rely on images to convey information, there's often no need for a zipped file; you can often publish a straight HTML file if you are willing to sacrifice the cover or include it as a separate JPEG or other image.

But even most OSs come with a zip utility built in to open zip files.

Offering HTML in addition to epub, mobi and PDF is a great way to ensure that anyone can read your ebook no matter what device they have.

And HTML is a strong base to use for converting to other formats such as epub and mobi.

Morrile
12-13-2010, 03:15 PM
According to Wikipedia's 'Comparison_of_e-book_formats', HTML is popular, but how would you package up the files for ebook distribution?

frabjous
12-13-2010, 03:27 PM
I think it is mainly popular for reading online with a web browser, as well as for converting to other formats. I don't think it is very popular with people reading on portable devices. If you wanted to distribute an HTML version, you'd just put it on a web server and use it like any other website.

gmw
12-26-2010, 08:02 PM
I think that HTML is underrated by some. Certainly packaging has its difficulties (since the content may have many files and is not compressed), but the widespread support makes it a useful format.

Indeed I've been looking around for something that will let me convert epub back into html. Calibre does not do it well (its convert to zip sort of does it but loses a lot more than necessary). The best I have been able to do so far is: expand the epub (it's just a zip file) and then use the free vHtmlMerger program to put all the html files into a single file ... and then edit that file to put back in the UTF-8 header stuff and fix certain links etc. It's far from perfect, but until I get time to write my own solution it seems the best I can do. (Note that Windows does have some built-in support for handling html files with a subfolder of used images etc.; a well-written utility could take advantage of this feature to make handling simpler for the user.)
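The expand-and-merge steps described above can be roughed out in a few lines of Python. This is only a sketch under loud assumptions: it takes the chapter files in sorted file-name order (the real reading order lives in the ePub's package file), it pulls out each body with a crude regular expression, and it does not fix links or image paths at all. The function names and sample file names are invented for illustration.

```python
import io
import re
import zipfile

# Crude body extractor; messy real-world markup may need a proper HTML parser.
BODY_RE = re.compile(r"<body[^>]*>(.*?)</body>", re.IGNORECASE | re.DOTALL)
HTML_SUFFIXES = (".html", ".htm", ".xhtml")


def merge_epub_html(epub_bytes: bytes) -> str:
    """Concatenate the <body> contents of every HTML file in the archive."""
    parts = []
    with zipfile.ZipFile(io.BytesIO(epub_bytes)) as archive:
        # Assumption: file names happen to sort into reading order.
        names = sorted(
            n for n in archive.namelist() if n.lower().endswith(HTML_SUFFIXES)
        )
        for name in names:
            text = archive.read(name).decode("utf-8")
            match = BODY_RE.search(text)
            parts.append(match.group(1) if match else text)
    merged_body = "\n".join(parts)
    # Put a UTF-8 declaration back in, since the per-file heads were dropped.
    return (
        "<html><head>"
        '<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />'
        "</head><body>\n" + merged_body + "\n</body></html>"
    )


def make_sample_epub() -> bytes:
    """Hypothetical two-chapter archive, just enough to exercise the merge."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("OEBPS/ch01.xhtml", "<html><body><p>One</p></body></html>")
        archive.writestr("OEBPS/ch02.xhtml", "<html><body><p>Two</p></body></html>")
    return buffer.getvalue()


merged = merge_epub_html(make_sample_epub())
print(merged)
```

The css and the images would still need to be copied out next to the merged file, and any links between chapters rewritten by hand.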

Why would I want to convert epub to html? Because I wanted to produce PDF files from the epub, after altering certain aspects of the css (adding background etc). The neat thing about html, over epub and similar, is that you get to load the entire book as a single html page and skip around searching for things etc. It also means I can simply print to my PDFCreator virtual printer to produce the useful PDF file for my ereader. (Why PDF not epub? Because the epub viewer on the Sony Reader doesn't cope with background images while the PDF viewer does.)

Jellby
12-27-2010, 05:53 AM
If your goal is converting ePUB into PDF, have a look at these two threads:

A script (http://www.mobileread.com/forums/showthread.php?t=62939)
A script with GUI (http://www.mobileread.com/forums/showthread.php?t=89689)

gmw
12-27-2010, 06:17 AM
If your goal is converting ePUB into PDF, have a look at these two threads:

Thanks for the links. I did play with that script but lost patience when I couldn't get what I wanted quickly enough. Not a great excuse but there you have it. I even found some javascript for working with epubs but haven't had time to try and make good use of it yet. I still like to be able to get from epub to html because html is an easy format to work with when you're experimenting. The use of html is one of the reasons why I like epub; what it needs for ease of use is an easy way to go from multi-html-file to single-html-file and back again.

DaleDe
12-27-2010, 02:17 PM
Sigil can merge the various files of an ePUB together and automatically fix the images and other links. Once you have only one file it would be easy to convert to HTML.

gmw
12-27-2010, 07:32 PM
Sigil can merge the various files of an ePUB together and automatically fix the images and other links. Once you have only one file it would be easy to convert to HTML.

Indeed it would. I missed that feature when I looked briefly at Sigil - all it needs is a slightly improved interface to make it easier/faster to merge many into one (some of these epubs have a large number of html splits). Time I took a better look at that program, thanks.

condor
04-03-2011, 10:45 PM
I really like how epub text stays on the screen (of my Nook color). Regular HTML (version 4.01) doesn't seem to stay on the screen properly. For some odd reason, plain text seems to come out nearly microscopic in size.

So I was planning to make a plain as possible XHTML file with inline CSS and a table of contents that links to chapters. Basically I just want a document with internal hypertext links and formatting that allows the reader to change the text size as they see fit. Naturally I haven't done it yet.
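A document along those lines is simple enough to generate with a short script. The Python sketch below is one possible shape, not a tested Nook Color recipe: the style rules, the ch1/ch2 anchor naming and the sample chapters are all assumptions made for illustration. It keeps the CSS in the file itself and uses only relative font sizes, on the assumption that this leaves the reader's text-size setting in control.

```python
from html import escape

# Relative units only, so the device's own text-size control can take effect.
STYLE = "body { font-size: 1em; line-height: 1.4; margin: 1em; } h2 { font-size: 1.3em; }"


def build_book(title, chapters):
    """Return one XHTML string; chapters is a list of (heading, text) pairs."""
    toc_items = []
    sections = []
    for number, (heading, text) in enumerate(chapters, start=1):
        anchor = f"ch{number}"
        # Each table-of-contents entry links to the id on the chapter heading.
        toc_items.append(f'<li><a href="#{anchor}">{escape(heading)}</a></li>')
        sections.append(
            f'<h2 id="{anchor}">{escape(heading)}</h2>\n<p>{escape(text)}</p>'
        )
    lines = [
        '<html xmlns="http://www.w3.org/1999/xhtml">',
        f"<head><title>{escape(title)}</title>",
        f'<style type="text/css">{STYLE}</style></head>',
        "<body>",
        f"<h1>{escape(title)}</h1>",
        "<ol>",
        *toc_items,
        "</ol>",
        *sections,
        "</body>",
        "</html>",
    ]
    return "\n".join(lines)


book = build_book(
    "Sample Book",
    [("Chapter One", "Text & more text."), ("Chapter Two", "The end.")],
)
print(book)
```

Writing the result out with open("book.xhtml", "w", encoding="utf-8") gives a single file that can be copied to the device.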

In my opinion, if you start adding images you might as well stick to the epub format.

AZdave
04-04-2011, 04:44 PM
So the use of html depends on the ereader.

HTML is a mark-up language, and the same code will appear differently on different machines, while PDF tries to look the same on all machines. (Crappy IMHO)

I have a BeBook (Onyx); it reads both epub and html. One problem I have seen with epub is the unzipping time can be significant for large documents, e.g. the King James Bible. So I store a couple of versions of the Bible in html format. It is easier and quicker to start up.

Images are no problem. I have a Gutenberg KJV with Dore's illustrations in html. You just have multiple directories, just like the underlying layout of an epub.

But I guess the same thing could be accomplished with multiple books of the bible in epub format.

On the BeBook html does have a disadvantage. It uses the wifi web browser to view html and there is no dictionary support for the web browsing.

A coin toss???

condor
04-08-2011, 02:44 PM
For some odd reason, plain text seems to come out nearly microscopic in size.

I'm not certain that my above statement was clear. I meant that the ".txt" files that I tried to read on my Nook Color came out with very small characters. But I am still trying to learn how to do things on it.

I love the HTML reader on the NC. I just wish I could force the content to stay on the screen. HTML docs seem to stay on the screen better when I'm in landscape mode.

One problem I have seen with epub is the unzipping time can be significant for large documents, e.g. the King James Bible. So I store a couple of versions of the Bible in html format. It is easier and quicker to start up.

Maybe they need to come out with a variation of EPUB that uncompresses the table of contents and keeps each chapter compressed until you want to read it.

A feature to search within a compressed book (or compressed chapters) shouldn't be impossible; there are already utilities that let you search for a word in a compressed file.
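As it happens, the zip container already compresses each file on its own, so software can inflate a single chapter, or search chapter by chapter, without unpacking the whole book to disk. Whether a given reader actually works that way is another matter. A minimal Python sketch of both ideas, with made-up file names and sample text:

```python
import io
import re
import zipfile


def make_sample_book() -> bytes:
    """Hypothetical two-chapter archive standing in for a large book."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr("genesis.xhtml", "<p>In the beginning</p>")
        archive.writestr("exodus.xhtml", "<p>Now these are the names</p>")
    return buffer.getvalue()


def read_chapter(book_bytes: bytes, name: str) -> str:
    """Inflate only the requested entry; the other entries stay compressed."""
    with zipfile.ZipFile(io.BytesIO(book_bytes)) as archive:
        return archive.read(name).decode("utf-8")


def search_book(book_bytes: bytes, word: str) -> list:
    """Return the names of chapters containing the word, one entry at a time."""
    pattern = re.compile(re.escape(word), re.IGNORECASE)
    hits = []
    with zipfile.ZipFile(io.BytesIO(book_bytes)) as archive:
        for name in archive.namelist():
            if not name.endswith(".xhtml"):
                continue
            # Note: this searches the raw markup, so tag names can match too.
            if pattern.search(archive.read(name).decode("utf-8")):
                hits.append(name)
    return hits


sample_book = make_sample_book()
print(read_chapter(sample_book, "exodus.xhtml"))
print(search_book(sample_book, "beginning"))
```

The search still has to inflate every chapter once, so on a slow device it would not be instant for something the size of a Bible.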

susan_cassidy
04-17-2011, 02:51 PM
The unzipping is not done by the ePub file itself, but by the software reading it. There's no reason that the software couldn't keep the TOC in memory after unzipping, but once the file is unzipped, it would be strange to delete the unzipped copy. They should be keeping the unzipped file in a temporary storage area somewhere, and reading from that.

Most people reading novels never use the TOC, I bet. They just start at the beginning, and keep reading until they are done.

condor
04-22-2011, 12:45 PM
True enough, I don't bother with the TOC when I'm reading a book. I just mentioned unzipping specific chapters as a possible future method to handle large reference books, like the Bible, mentioned by AZdave.

Sorry about my poorly worded comments. I understand that the NC uses Adobe Digital Editions to display epub files. I'm guessing that B&N selected ADE to handle digital rights management, which isn't a concern for me as I am busy reading public domain books. :)

kinkle
05-28-2011, 07:25 AM
I downloaded some ebooks in .htm format and copied them into the Kindle's documents folder.
Does anyone know a way to read this format on the Kindle? Please help.
Is it possible to use the experimental browser to read these .htm files?

DaleDe
05-28-2011, 12:43 PM
I downloaded some ebooks in .htm format and copied them into the Kindle's documents folder.
Does anyone know a way to read this format on the Kindle? Please help.
Is it possible to use the experimental browser to read these .htm files?

Rename the extension to .TXT and it will be read.

Dale

frabjous
05-28-2011, 02:14 PM
I downloaded some ebooks in .htm format and copied them into the Kindle's documents folder.
Does anyone know a way to read this format on the Kindle? Please help.
Is it possible to use the experimental browser to read these .htm files?

You could use calibre to convert it to .mobi.

kinkle
05-28-2011, 02:21 PM
Thanks. I will try.
I had a problem with calibre: Kaspersky reported a malicious URL, so I uninstalled it. But I will try again. :chinscratch: