In response to some discussion in this thread, I downloaded the same free epub from as many ebook sellers as possible and looked at the differences inside.
I downloaded A Perilous Proposal by Michael Phillips from Barnes & Noble, eBooks.com, Kobo (both ADE and kepub), Christianbook, Google Play, and Amazon.
The stores that I'm aware of but didn't try are:
- iTunes: I don't have iTunes installed, and even if I did, I couldn't remove iTunes DRM.
- Books-A-Million doesn't offer any free ebooks as far as I can tell.
- Lifeway.com has its own reader with DRM that I can't remove.
I downloaded all of the books into Calibre using "The Tools" and then unzipped each epub to compare contents. My comparison tool of choice is Beyond Compare, which will compare entire directories and even images.
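For anyone who would rather script the comparison than use a GUI tool, here is a rough Python sketch of the same idea: unzip two epubs (an epub is a zip archive) and list the files that differ. This is not what I actually ran. The function names are made up for illustration, and it only reports which files differ, not how.

```python
import filecmp
import zipfile
from pathlib import Path


def unzip_epub(epub_path, dest_dir):
    """Extract an epub (a zip archive) into dest_dir and return it as a Path."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(epub_path) as archive:
        archive.extractall(dest)
    return dest


def diff_trees(left, right, prefix=""):
    """Return sorted relative paths that differ or exist on only one side."""
    cmp = filecmp.dircmp(left, right)
    changed = [prefix + name for name in cmp.left_only + cmp.right_only]
    # dircmp's own diff_files is a shallow (stat-based) check,
    # so compare the common files byte for byte instead.
    _, mismatch, errors = filecmp.cmpfiles(
        left, right, cmp.common_files, shallow=False
    )
    changed += [prefix + name for name in mismatch + errors]
    for name in cmp.common_dirs:
        changed += diff_trees(
            Path(left) / name, Path(right) / name, prefix + name + "/"
        )
    return sorted(changed)
```

Calling `diff_trees(unzip_epub("kobo.epub", "kobo"), unzip_epub("bn.epub", "bn"))` (hypothetical file names) would give a list of paths worth opening in a real diff tool.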
The ADE-protected epubs downloaded from Kobo and Christianbook.com were nearly identical, so I'm assuming that they most closely match the epub uploaded by the publisher. The only differences I could find were Adept resource strings from the ADE system. My caveat about Kobo is that the ADE book is occasionally the kepub version described below, and there's no way to tell beforehand if that's what you'll get.
The copy from eBooks.com had slightly different metadata (the title was "Perilous Proposal, A" and the author was "Phillips, Michael"). The files had DOS-style line endings (CR-LF) while the others were UNIX-style (LF). Other than that, this copy was the same as the previous two. As a bonus, if the publisher offers a PDF, then it's usually included as a separate download. Unfortunately, most publishers don't sell PDFs anymore, so fewer and fewer books have them available (this one didn't).
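A line-ending difference like this makes every line of every file show up as changed in a naive diff. Here's a small Python sketch (helper names are my own) for spotting which convention a file uses and normalizing it before comparing:

```python
def line_ending_counts(data: bytes) -> dict:
    """Count CR-LF and bare LF line endings in raw file bytes."""
    crlf = data.count(b"\r\n")
    lf = data.count(b"\n") - crlf  # every CR-LF also contains one LF
    return {"crlf": crlf, "lf": lf}


def normalize_line_endings(data: bytes) -> bytes:
    """Convert CR-LF to LF so line endings don't swamp the comparison."""
    return data.replace(b"\r\n", b"\n")
```

Read the files in binary mode (`open(path, "rb")`) so Python doesn't translate the line endings for you before you get to look at them.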
The epub from Barnes & Noble makes minor changes to the stylesheet, changing "text-align:left" to "text-align:justify" and changing some margins. B&N also apparently does some filtering on the character sets used in the book. Em dashes ("—") are changed to an HTML entity (something like "&#8212;") instead of the literal character.
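If you want to compare text across stores without the em dash substitution getting in the way, one option is to turn the entity forms back into the literal character first. This is a sketch under an assumption: I'm guessing at the entity spellings a store might use (named, decimal, and hex), so adjust the pattern to whatever actually appears in your file.

```python
import re

EM_DASH = "\u2014"

# Assumed spellings of the em dash entity: &mdash;, &#8212;, &#x2014;
EM_DASH_ENTITY = re.compile(r"&(?:mdash|#8212|#[xX]2014);")


def normalize_em_dashes(markup: str) -> str:
    """Replace em dash entities with the literal character, leaving
    other entities (such as &amp; and &lt;) untouched."""
    return EM_DASH_ENTITY.sub(EM_DASH, markup)
```

I deliberately didn't use `html.unescape` here, since it would also decode `&amp;` and `&lt;` and leave you with markup that is no longer valid XHTML.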
The copy from Google Play had extra anchors in the XHTML, presumably for the Google Books software. Google also applied filtering similar to B&N's, but Google's is broken and the em dashes are simply gone. Here is the same paragraph, first from the Kobo book:
Quote:
<p>“That was a small detail of rebs. They’d have killed you sure if I hadn’t come along. But they’re gone now. You don’t have to worry about them no more. I was out ahead of our company. That’s what I do—I’m a scout. And I tend the horses. I was scouting when I ran into them. Lucky for you I did too. In case you hadn’t noticed, they were wearing the <a id="page_60"/>grey of the Confederate rebels. We’re wearing the blue of the U.S. infantry. So are you.”</p>
Now from Google Play:
Quote:
<p>“That was a small detail of rebs. They’d have<a id="GBS.0049.02"/> killed you sure if I hadn’t come along. But they’re gone now. You don’t have to worry about them no more. I was out ahead of our company. That’s what I doI’m a scout. And I tend the horses. I was scouting when I ran into them. Lucky for you I did too. In case you hadn’t noticed, they were wearing the <a id="page_60"/>grey of the Confederate rebels. We’re wearing the blue of the U.S. infantry. So are you.”</p>
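The extra anchors are easy enough to strip with a regular expression. A minimal sketch, assuming every Google anchor looks like the one in the sample (an empty self-closing `<a>` whose id starts with "GBS."); I haven't checked whether Google uses other shapes elsewhere. Note that it can't bring back the missing em dashes.

```python
import re

# Assumption: Google's anchors are empty <a .../> tags whose id starts
# with "GBS.", as in the sample paragraph. Other forms are not handled.
GOOGLE_ANCHOR = re.compile(r'<a id="GBS\.[^"]*"\s*/>')


def strip_google_anchors(xhtml: str) -> str:
    """Remove Google's added anchors; the publisher's anchors are kept."""
    return GOOGLE_ANCHOR.sub("", xhtml)
```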
It's also clear that Google re-encodes the images with more JPEG compression than the original files. It's probably not a big deal, but JPEG uses lossy compression, so the Google images may be slightly degraded. The file sizes of the original images:
89709 Titlepage.jpg
9843 common.jpg
9264 common1.jpg
213011 cover.jpg
and the Google epub:
67503 Titlepage.jpg
1558 common.jpg
1147 common1.jpg
119889 cover.jpg
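To put numbers on the difference, here's a quick Python calculation using the sizes listed above (I'm treating them as bytes). The function name is just for illustration.

```python
original = {
    "Titlepage.jpg": 89709,
    "common.jpg": 9843,
    "common1.jpg": 9264,
    "cover.jpg": 213011,
}
google = {
    "Titlepage.jpg": 67503,
    "common.jpg": 1558,
    "common1.jpg": 1147,
    "cover.jpg": 119889,
}


def percent_smaller(before: int, after: int) -> float:
    """Size reduction as a percentage of the original size."""
    return 100 * (before - after) / before


for name, size in original.items():
    print(f"{name}: {percent_smaller(size, google[name]):.1f}% smaller")

total = percent_smaller(sum(original.values()), sum(google.values()))
print(f"all images: {total:.1f}% smaller")
```

That works out to roughly 25% smaller for the title page, 44% for the cover, and about 84% and 88% for the two small "common" images, or about 41% overall.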
A bonus with Google Play is that some books (including this one) include a PDF. This is tempered (for me, anyway) by Google's conversion of all PDFs into page images, even if the PDF was originally typeset (as this book obviously was).
The prize pig is the kepub. Here's the same paragraph we looked at before:
Quote:
<p xmlns="http://www.w3.org/1999/xhtml"><span id="kobo.40.1">“That was a small detail of rebs. </span><span id="kobo.40.2">They’d have killed you sure if I hadn’t come along. </span><span id="kobo.40.3">But they’re gone now. </span><span id="kobo.40.4">You don’t have to worry about them no more. </span><span id="kobo.40.5">I was out ahead of our company. </span><span id="kobo.40.6">That’s what I do—I’m a scout. </span><span id="kobo.40.7">And I tend the horses. </span><span id="kobo.40.8">I was scouting when I ran into them. </span><span id="kobo.40.9">Lucky for you I did too. </span><span id="kobo.40.10">In case you hadn’t noticed, they were wearing the </span><a id="page_60"></a><span id="kobo.41.1">grey of the Confederate rebels. </span><span id="kobo.41.2">We’re wearing the blue of the U.S. </span><span id="kobo.41.3">infantry. </span><span id="kobo.41.4">So are you.”</span></p>
There's a Calibre plugin that does a good job of cleaning that stuff up into a regular epub, or for those of opposite bent, there's one to add it to regular epubs and make kepubs.
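To give an idea of what that cleanup involves, here's a rough Python sketch that unwraps the kobo spans from a chunk of XHTML. This is not how the plugin does it, just a regex illustration. It assumes the span ids always look like "kobo.N.N" and that no other `<span>` is nested inside a kobo span, which holds for the sample above but may not hold for every book.

```python
import re

# Assumption: kobo spans carry only an id of the form kobo.N.N and
# contain no nested <span> elements (true for the sample paragraph).
KOBO_SPAN = re.compile(r'<span id="kobo\.\d+\.\d+">(.*?)</span>', re.DOTALL)


def strip_kobo_spans(xhtml: str) -> str:
    """Unwrap kobo sentence spans, keeping the text they contain."""
    return KOBO_SPAN.sub(r"\1", xhtml)
```

For anything beyond a quick look, the plugin or a real XML parser is the safer choice.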
Finally, there's Amazon. While the Kindle book starts life as an AZW3 file, the Kindle Unpack plugin will retrieve an epub that's very close to the source that was used to create it. If the original was a valid epub (which, if the book is from a major publisher, it usually is), then you'll get a valid epub out. The only difference I noticed in the text is an em dash filter again (which works). Additionally, the images are compressed like Google's and the internal file names aren't preserved in the epub->Kindle conversion. For you folks who buy Kindle books and convert them to epub with Calibre, you may want to try Kindle Unpack first to see what you get.