A nexus in eBook formats?


SerialAeon
05-29-2006, 03:02 PM
Hello,
I've been "doing" computer stuff for 15 years now, and I've grown tired of the endless format wars: pictures, videos, and now... eBooks. There are almost a dozen formats actively fighting right now, almost all of them platform-specific. In my mind, an ideal solution would be to store eBooks in a simple, open format that stores the text, the layout (titles, chapters, sections, etc.) and any possible metadata (type of book, etc.), and then to offer a conversion pipeline between this 'nexus' format and any other commercial/proprietary/(add any wanted adjective) format.
For now, looking at repositories like Project Gutenberg, this nexus format seems to be... plain text, which is not very satisfying. As usual, this kind of project started when nobody cared about metadata or text layout, and now it would be a pain (in my experience; maybe I'm wrong?) to add them across such a big repository.
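
To make the 'nexus' idea concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the `Book` and `Chapter` classes, the convention that a paragraph starting with "CHAPTER" opens a new chapter, and the two converter functions are hypothetical, not part of any existing format or tool. The point is only that each outside format needs one reader and one writer against the hub, instead of a converter for every pair of formats.

```python
import html
from dataclasses import dataclass, field


@dataclass
class Chapter:
    title: str
    paragraphs: list[str]


@dataclass
class Book:
    # Hypothetical hub ("nexus") model: text, layout and metadata together.
    title: str
    author: str
    metadata: dict[str, str] = field(default_factory=dict)
    chapters: list[Chapter] = field(default_factory=list)


def from_plain_text(text: str, title: str, author: str) -> Book:
    """Reader: plain text -> hub model.

    Assumed convention: blank lines separate paragraphs, and a paragraph
    starting with "CHAPTER" opens a new chapter.
    """
    book = Book(title=title, author=author)
    current = None
    for block in text.strip().split("\n\n"):
        block = " ".join(block.split())  # rejoin hard-wrapped lines
        if not block:
            continue
        if block.upper().startswith("CHAPTER"):
            current = Chapter(title=block, paragraphs=[])
            book.chapters.append(current)
        else:
            if current is None:  # text before the first chapter heading
                current = Chapter(title="", paragraphs=[])
                book.chapters.append(current)
            current.paragraphs.append(block)
    return book


def to_html(book: Book) -> str:
    """Writer: hub model -> simple HTML."""
    parts = [f"<html><head><title>{html.escape(book.title)}</title></head><body>"]
    parts.append(f"<h1>{html.escape(book.title)}</h1>")
    for chapter in book.chapters:
        if chapter.title:
            parts.append(f"<h2>{html.escape(chapter.title)}</h2>")
        parts.extend(f"<p>{html.escape(p)}</p>" for p in chapter.paragraphs)
    parts.append("</body></html>")
    return "\n".join(parts)
```

With N outside formats, this design needs at most 2N small converters (one in, one out) rather than one for each of the N x (N - 1) format pairs.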

My question is: does an ideal universal format exist? As a biologist, I know from experience that Darwin's theory of evolution will (again) apply, and that in a few years only one or two formats will remain, all the others being dropped in the trash can of evolution. These one or two remaining formats will then become the de facto universal ones. But I'm interested in the opinion of the gurus here, who certainly know much more than I do on this subject.

Aurelien

TadW
05-29-2006, 04:28 PM
The OEBPS Open eBook framework could fit in this category:

http://www.openebook.org/oebps/history.htm

Microsoft for instance is using it for its .LIT format.

Antoine of MMM
06-08-2006, 03:52 PM
Well, one could just use the existing CSS/HTML combo and make it happen too. I think that would bode better not only for users (imagine a similar look and feel), but also for creators, who would really have better control of multiple formats via a one-shot deal.

Check out this site: http://www.alistapart.com/articles/boom

While I personally wouldn't care for the step that takes it into PDF format, the fact that CSS/HTML already allows for this makes for a neat one-step solution for those looking to publish electronically, as well as in print.

bowerbird
09-25-2006, 02:38 PM
i've developed a format called "zen markup language"
-- z.m.l. for short -- that will transform raw-ascii e-texts
like those from project gutenberg into powerful e-books.

i'll be revealing it soon.

meanwhile, you can look at john gruber's "markdown"
-- http://www.daringfireball.net/projects/ -- to see
a similar form of non-markup markup. markdown is
interesting because it bridges to xhtml, whereas z.m.l.
is meant as a standalone format displayed by a viewer
that i have programmed.

-bowerbird

yvanleterrible
09-25-2006, 03:06 PM
Believe it or not, Photoshop was first created to be such a beast...somewhere...far away...in some distant past...not on another planet...

Steven Lyle Jordan
10-12-2006, 04:34 PM
I believe HTML covers that need right now... even more conveniently, HTML is now used to readily convert into a number of other e-book formats.

geekraver
10-19-2006, 02:27 AM
TeX/LaTeX has been around for many years, and is powerful enough to represent pretty much anything.

ath
10-19-2006, 04:10 AM
> My question is: does an ideal universal format exist?

The answer is, clearly, no. At least, not until 'ideal' has been suitably defined.

The Text Encoding Initiative (http://www.tei-c.org/) is aimed mainly at scholarly uses, and that may seem off-putting. At heart, though, it's a base, suitable for novels and other types of texts, on top of which special markup (line and page identification, speaker identification, emendations, etc.) can be added.

Early versions were based on SGML, but I believe there is (or has been) an effort to convert to XML. There's no major difference at the markup level, though I believe there are some at the DTD level.

Check out their Guidelines -- and be sure to start with TEI Lite. But you *do* need an XML environment to work in, and particularly one that does check your files for markup errors.

If you want to have a more extensive example of a marked-up file than those that appear in the TEI Lite Introduction, try the Oxford Text Archive (http://ota.ahds.ac.uk/): their preferred formats are TEI-based. Their web site seems heavily frame-based, so I won't give a link, but you may look for works by Anthony Trollope -- I've just verified that the first one listed (Ayala's Angel) uses TEI Lite markup.

Studio717
10-19-2006, 03:50 PM
I'd say HTML, too, since it's fairly platform independent. PDFs are also readable by most platforms that I'm aware of (perhaps not on all the smaller devices, but there is a PDF reader for my Palm), but PDFs are much more restrictive, ime, than HTML docs.

(For years I bought books in Mobipocket format and now regret it. I'm hoping to be able to find a Mobi -> HTML converter. Keyword there is 'hoping.')

mikecook
06-05-2007, 01:27 PM
I know it's been a while since this thread was active but I wanted to ask another question that relates quite closely to this.

I've been researching these master formats for some time and it does seem that TEI, OEBPS, XML, BookX could be good contenders...deciding which to use is not that easy.

Publishers such as O'Reilly use DocBook for their technical manuals, but does anyone know if there is a standard format that is used for novels (fiction, non-fiction, etc)?

Thanks,

~mike

mikecook
06-05-2007, 02:35 PM
REMOVED...

I lost my connection when I originally posted, and when I looked to see if it had worked, the post didn't show up here, so I wrote it again. My apologies.

Robert Marquard
06-05-2007, 02:48 PM
In this case it is really easy: the OEBPS Open eBook framework. Best of all is the 2.0 version currently under development. It defines an .epub format which is a Zip containing HTML with supporting files in XML. Mobipocket already uses it as a source format. LIT does the same (using the older format).
In fact most of the ebook formats are simply repackaged XHTML.
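
Since the description above boils down to "a Zip with HTML and XML inside", a quick way to see that for yourself is to list the members of such an archive. The sketch below uses only Python's standard `zipfile` module. The helper names and the file names in the toy archive (`chapter1.html`, `package.xml`) are made up for illustration and are not taken from the OEBPS specification.

```python
import io
import zipfile


def list_members(data: bytes) -> dict[str, list[str]]:
    """Group the file names inside a Zip archive by extension."""
    groups: dict[str, list[str]] = {}
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        for name in zf.namelist():
            ext = name.rsplit(".", 1)[-1].lower() if "." in name else ""
            groups.setdefault(ext, []).append(name)
    return groups


def make_toy_archive() -> bytes:
    """Build a tiny in-memory Zip with hypothetical file names."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("chapter1.html", "<html><body><p>Hello</p></body></html>")
        zf.writestr("package.xml", "<package/>")
    return buf.getvalue()


if __name__ == "__main__":
    for ext, names in list_members(make_toy_archive()).items():
        print(ext, names)
```

To inspect a real, DRM-free file instead of the toy archive, you could read its bytes with `open(path, "rb").read()` and pass them to `list_members`.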

JSWolf
06-06-2007, 03:18 AM
> In this case it is really easy: the OEBPS Open eBook framework. Best of all is the 2.0 version currently under development. It defines an .epub format which is a Zip containing HTML with supporting files in XML. Mobipocket already uses it as a source format. LIT does the same (using the older format).
> In fact most of the ebook formats are simply repackaged XHTML.
The problem is not that OEBPS is out there; the problem is the overlying format. My Sony Reader cannot handle Mobipocket or LIT (without conversion), for example. So it would not matter if the books were hand typed. What we need is a format whereby we can purchase a book and then convert it for whatever program or device we are using at the time, so it won't become obsolete. For example, you have a Palm, and you buy books for this Palm. All is well. The Palm breaks. You have some books left that you have not yet finished reading. You decide to purchase a Sony Reader or one of the other forthcoming e-ink based readers, which does not support the books you have already bought. What can you do other than purchase these books again in a format your new device supports, since the DRM has made it impossible for you to convert them to another format?

Now if we had the books in OEBPS and a program with the new device to read in OEBPS and output whatever format we need, then what we have purchased in the past is still viable in the future. This is NOT like CD or DVD where people by choice decided to purchase the same music or movie that they had on LP, cassette, or VHS while they still had a turntable, cassette deck, or VHS recorder. This is either you stick to the old technology and can't use the new technology because your investment in the ebooks will be lost or you invest in the new technology and either do without the ebooks you have or you have to try to find them in this other format and repurchase what you can.

We don't need to lose the money we've invested in ebooks just because we want a different reading device. This is one of the major reasons why eventually, ebooks will fail and fail big time. Yes, I know I can purchase LIT format ebooks and then convert to HTML or LRF or a number of other formats, but the average person such as my mother could not do that. I could do it for her, but without me to help, she'd be buying a book that was only readable on the device she had at the time. Let's say this device breaks and is not worth getting fixed and the device she gets next won't support her ebooks, then her investment is lost. The other major issue with the DRM is, let's say she gets another device that is compatible with the format of the books she once bought, but she goes to load the books she has on this new device and while they may load, the DRM will say "this is not the same device, I won't let you view your ebooks". Now what is she to do? She'd have to go back to every online shop she got the books from and download them again. Then she'd have to deal with the fact that there are now two copies of each book on the computer.

Basically, while this isn't too much of an issue for us geeks, it is an issue to the regular people. Come on ebook industry, stop jerking us around and get your heads out of your asses and give us something we can actually use. I have a Sony Reader now. But I cannot say that 5 years down the road, I won't find another reader that suits my needs at that time better. If I was to have purchased all my books from the Sony Connect store and the new reader doesn't support them, then all my money is gone forever.

Why is it that most new readers are coming with new formats for ebooks? Sony with BBeB (LRF) and the V2/V9 with this Wolf format. We don't need new formats. We need to standardize or make an open convertible format. Ebooks are going to die and our readers will be just one more piece of junk we'll end up tossing out as they become obsolete.

We need to figure out a way to let the industry know that we won't stand for the shot they give us. Remove the DRM so at least converters could be written. Don't jerk us around like we were your dick. Treat us like human beings. Remember when the copy protection on Lotus 1-2-3 caused such a problem that enough people complained and Lotus dropped the copy protection? We need to do the same, so that the first thing they do is drop the DRM. The DRM does us no good. All it does is screw us over but good.

Jon

Robert Marquard
06-06-2007, 04:51 AM
This is why I only have books without DRM.

mikecook
06-06-2007, 08:25 AM
Thanks Robert, this is good information to know.

It also looks like the v2.0 framework is close to being approved so now could be a good time to start learning about the OEBPS format.


Now, Jon...

> This is NOT like CD or DVD where people by choice decided to purchase the same music or movie that they had on LP, cassette, or VHS while they still had a turntable, cassette deck, or VHS recorder. This is either you stick to the old technology and can't use the new technology because your investment in the ebooks will be lost or you invest in the new technology and either do without the ebooks you have or you have to try to find them in this other format and repurchase what you can.

Sorry but I've heard this argument before and it still makes no sense! If your turntable/CD player/iPod (i.e. ebook reader) is old technology and you wish to go out and purchase the all-new-fangled device then yes, you DO have to go out and purchase all your favourite music (eBooks) again. This is the way the world works.

But I do agree with most of what you're saying -- when you buy a CD, you expect (and know) it will play in any manufacturer's CD player. The same thing should apply with eBooks. Maybe OEBPS is taking us a step closer.

JSWolf
06-06-2007, 11:26 AM
> Thanks Robert, this is good information to know.

> It also looks like the v2.0 framework is close to being approved so now could be a good time to start learning about the OEBPS format.

> Now, Jon...

> Sorry but I've heard this argument before and it still makes no sense! If your turntable/CD player/iPod (i.e. ebook reader) is old technology and you wish to go out and purchase the all-new-fangled device then yes, you DO have to go out and purchase all your favourite music (eBooks) again. This is the way the world works.

> But I do agree with most of what you're saying -- when you buy a CD, you expect (and know) it will play in any manufacturer's CD player. The same thing should apply with eBooks. Maybe OEBPS is taking us a step closer.
But, using the CD as an example, if I have the ability to play an LP or a cassette, I can then convert these into a CD using a standalone CD writer or my computer, so I still have the music I purchased in the newer technology. Also take Blu-ray for another example: I can still play my existing DVDs. The problem is that if I were to someday down the line replace my Sony Reader, there is no guarantee that I would be able to use the books purchased for it on the new reader. They expect us to either repurchase what we already have because we need a new format or be tied to the old device (as long as it works). This is just as stupid as saying: you can read your paperback book now, but if in the future you need glasses or your prescription changes, you'll need to purchase a new paperback copy to be able to read said book.

bowerbird
06-09-2007, 04:33 PM
i mentioned my "zen markup language" earlier in this thread.

it was designed with the ideal of taking a plain-ascii e-text
-- exactly like the ones that you find in project gutenberg --
and (via some automatic handling from the viewer-program)
transforming it into a powerful fully-functional electronic-book.

another benchmark is turning it into .html for view on the web.

you can now go to a website to see z.m.l. in action:
> http://www.z-m-l.com

on the "examples" screen, you can click on the links to
see the plain-ascii z.m.l. file that is used as the "master".

you can also click each button to see the .html file that is
automatically generated from that plain-ascii master file...

from that .html, of course, you can generate a plethora
of other formats as well. the point is that the "master"
is in a plain-ascii format, and thus can be distributed with
all of the ease and convenience that such files give to us.

(production and maintenance of files in this format is also
a breeze, owing to this plain-ascii nature, since there are
tons of tools out there capable of handling files like those.)
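
For readers wondering how a plain-ascii "master" can become .html automatically, here is a rough sketch of the general light-markup idea in Python. It is NOT z.m.l. and does not reproduce its rules; the three conventions used (blank lines separate blocks, a one-line block in all capitals is a heading, and _underscores_ mark emphasis) are assumptions chosen only for illustration.

```python
import html
import re


def light_to_html(text: str) -> str:
    """Convert a plain-text master to HTML using three assumed rules."""
    out = []
    for block in re.split(r"\n\s*\n", text.strip()):
        single_line = "\n" not in block.strip()
        line = " ".join(block.split())  # rejoin hard-wrapped lines
        if not line:
            continue
        body = html.escape(line)
        # Assumed rule: _word_ becomes emphasis.
        body = re.sub(r"_([^_]+)_", r"<em>\1</em>", body)
        # Assumed rule: a one-line, all-capitals block is a heading.
        if single_line and line.isupper():
            out.append(f"<h2>{body}</h2>")
        else:
            out.append(f"<p>{body}</p>")
    return "\n".join(out)


if __name__ == "__main__":
    print(light_to_html("CHAPTER ONE\n\nIt was _very_ dark\nthat night."))
```

The master stays readable as ordinary text, and the structure is inferred from layout alone, which is the trade-off light markup makes against explicit tags.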

in the evolution of e-book formats, some light markup format
-- whether it be z.m.l. or "markdown" or "textile" -- will win...

the days of heavy markup -- like docbook or .tei or .oebps --
are finished. they are too complex for the average person,
and us little guys are not about to sacrifice the revolution that
the web gives to us, which is that _anyone_ can be a publisher.

-bowerbird

Azayzel
06-09-2007, 08:32 PM
That's actually a pretty snazzy way to reformat plain ascii to something much more useful. I was originally thinking that XML would be a great way to preserve content and reformat at a later date, but it would be nice if there was an app that, based on what it encounters in the text, reformats to a desired output. HTML seems to be where everything is going at the moment; just take a look at all of the "web" devices that are being marketed/created.

bowerbird
06-13-2007, 07:15 PM
azayzel said:
> That's actually a pretty snazzy way to reformat plain ascii
> to something much more useful.

thanks.

i'm curious as to whether my .html serves well when
converted to the various formats for handheld devices.

i'd appreciate any feedback anyone had regarding that.

-bowerbird

nekokami
06-14-2007, 12:06 PM
> Sorry but I've heard this argument before and it still makes no sense! If your turntable/CD player/iPod (i.e. ebook reader) is old technology and you wish to go out and purchase the all-new-fangled device then yes, you DO have to go out and purchase all your favourite music (eBooks) again. This is the way the world works.
Er... no. I was able to convert many of my LPs to cassette, and then CD format. I only bought CDs where the quality of the music mattered enough to me that the higher fidelity of the CD offered enough extra value to justify the expense. And Apple has made it quite easy to convert CDs to play on the iPod.

In any case, the two aren't really comparable, because, apart from sheet music, there has never been a long-lived portable storage mechanism for music, so people accept limitations as we try to find one. The same is not true for books, which have been around for thousands of years. The "extra features" of ebooks (e.g. low weight/volume, searchability) have not yet provided enough value for people to be willing to accept having to replace them whenever the manufacturers come out with a new device or format. I'm not sure if there is any feature set that would make this a good value proposition. I think a portable format (read: no DRM) will be critical for full market acceptance of ebooks.

yvanleterrible
06-14-2007, 12:19 PM
> Er... no. I was able to convert many of my LPs to cassette, and then CD format. I only bought CDs where the quality of the music mattered enough to me that the higher fidelity of the CD offered enough extra value to justify the expense. And Apple has made it quite easy to convert CDs to play on the iPod.

> In any case, the two aren't really comparable, because, apart from sheet music, there has never been a long-lived portable storage mechanism for music, so people accept limitations as we try to find one. The same is not true for books, which have been around for thousands of years. The "extra features" of ebooks (e.g. low weight/volume, searchability) have not yet provided enough value for people to be willing to accept having to replace them whenever the manufacturers come out with a new device or format. I'm not sure if there is any feature set that would make this a good value proposition. I think a portable format (read: no DRM) will be critical for full market acceptance of ebooks.
And cheaper reading devices if I may add. No one wants to spend long hours reading chained to a computer.

NatCh
06-14-2007, 12:45 PM
> I think a portable format (read: no DRM) will be critical for full market acceptance of ebooks.
I can, conceptually, at least, imagine a DRM approach that wouldn't hinder acceptance. It would have to be a 'standard' approach so that it could be used across multiple platforms, and would have to allow things like 3rd party xfrs and such. It would have to be pretty cleverly built though, and I don't think that's likely to happen: the DRM folks seem to be sinking all their 'cleverness' into prohibiting this sort of thing.

NOTE: I am not saying that I want DRM, only that I can imagine that there might be some approach that could be livable. Further, I agree in advance with all those who will tell me that this will never happen. I'm dreamin' here, okay? :nice:

JSWolf
06-14-2007, 12:56 PM
One problem as I see it is this: both my wife and I have a Sony Reader. I can connect her reader to my computer and we can both read the same books purchased at the Connect store. But if I do that, the books she has from her own computer cannot be read as long as her reader is activated on my computer. Then we have to deactivate her reader from my computer and activate it back on hers, and the books she has from my computer no longer work. It would be good if all the devices in one household would work with whatever books were purchased on whatever computer. That would make DRM a lot more useful. As it stands, DRM is a hindrance. I do not want to have to purchase separate copies of books just to be able to read them on multiple devices. Basically, once my $50 credit is up, I'm purchasing books I can deal with the way I want. So Sony, all I can say is SCREW UP GUYS! (Thanks Eric Cartman for that line)

Is there a DRM that does anyone any good? Let's try this scenario (not true, but not impossible). Let's say the Sony Connect store goes away. I have on my computer all the books I have purchased from there. My wife has all the books she's purchased. We have her reader registered on my computer so she can read some of my books. We then go back to her computer to register her reader there so she can read some of her books. But wait, the Connect software might not be able to reach the Connect store to authorize her reader (if that's how it works). What do we do with the books she has, since she can no longer read them?

Mobipocket is registered via a name and credit card number. So you change your credit card number. And you purchase new books with your new credit card. Can you read the old books while the new books are registered? Do you have to keep switching info to read the old vs. the new?

DRM is more of a hassle than it is worth. Give me one instance where DRM does ME any good in the long run.

Jon

NatCh
06-14-2007, 01:01 PM
> It would be good if all the devices in one household would work with whatever books were purchased on whatever computer.
If you had set both computers and Readers up on the same ConnStore account, you would be able to do just exactly that -- that's part of the reason they allow 6 devices on a single account, I believe. :wink2:

If you're interested in doing that after the fact, I know that others have had some success with getting them to combine two accounts (move all the books from one to the other, and close the empty one) when they made it clear that they wanted to delete the extra account. :nice:

JSWolf
06-14-2007, 01:04 PM
> If you had set both computers and Readers up on the same ConnStore account, you would be able to do just exactly that -- that's part of the reason they allow 6 devices on a single account, I believe. :wink2:

> If you're interested in doing that after the fact, I know that others have had some success with getting them to combine two accounts (move all the books from one to the other, and close the empty one) when they made it clear that they wanted to delete the extra account. :nice:
Dear, please get off your computer now, I want to buy a book. But I'm in the middle of reading the forum. You can do that after I spend hours and hours trying to find a book on the Connect site because the connect site is a POS and doesn't easily let you find anything you might actually want.

NatCh
06-14-2007, 01:14 PM
I know you don't like the ConnStore -- you're hardly alone in that -- I'm simply pointing out that you can do what you said you wanted to do (and if you do it, you can buy books from the POS ConnStore on the other computer, while the one is being used to read MobileRead). :grin:

JSWolf
06-14-2007, 05:03 PM
If someone has a Palm and is using that to read purchased PDB format books, can those books be moved over to the iLiad?