ePUB with PDF embedded test


Jellby
09-01-2009, 10:57 AM
As I pointed out in this post (http://www.mobileread.com/forums/showthread.php?p=574809), the ePUB spec allows including PDF documents as alternative renderings for chapters.

This is a test for this feature. It is my version of "Gulliver's Travels", with a PDF version of each chapter (at 9×12 cm), and the standard XHTML as a "fallback". According to the spec, a reading system that supports PDF could show the PDF version, while those that don't will display the normal (XHTML) text. I don't know, however, if there is any system that supports this, but it may be worth testing.

The first thing I notice is that the book becomes huge. The normal ePUB version is 500 kB, the PDF version is 1.8 MB, and this "hybrid" version is 14 MB! This size increase is due to splitting the PDF into one file per chapter. I could include the whole book as a single PDF, but then I'd have to put all the XHTML text in a single file too (the alternatives are defined on a per-file basis).

So, feel free to try this in your readers (PC software or portable) and report any success or problem :)

PS. The technique is very easy, just check the "content.opf" file. In the manifest use:

<item id="Preface-1_pdf" fallback="Preface-1" href="pdf/Preface-1.pdf" media-type="application/pdf" />
<item id="Preface-1" href="Preface-1.xhtml" media-type="application/xhtml+xml" />

and in the spine:

<itemref idref="Preface-1_pdf" />
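To make the fallback mechanism concrete, here is a minimal sketch (not taken from any actual reading system) of how a reader could walk the "fallback" chain for each spine entry and pick the first item whose media type it supports. The function name resolve_spine, the sample package wrapper, and the set of supported types are assumptions for illustration only.

```python
import xml.etree.ElementTree as ET

OPF_NS = "http://www.idpf.org/2007/opf"

# Basic types as listed in this thread (XHTML, PNG, SVG, JPG).
# Assumption: a real reading system may support a different set.
CORE_TYPES = {
    "application/xhtml+xml",
    "image/png",
    "image/svg+xml",
    "image/jpeg",
}

# Hypothetical minimal package built around the manifest/spine lines above.
SAMPLE_OPF = """<package xmlns="http://www.idpf.org/2007/opf" version="2.0">
  <manifest>
    <item id="Preface-1_pdf" fallback="Preface-1" href="pdf/Preface-1.pdf" media-type="application/pdf" />
    <item id="Preface-1" href="Preface-1.xhtml" media-type="application/xhtml+xml" />
  </manifest>
  <spine>
    <itemref idref="Preface-1_pdf" />
  </spine>
</package>"""


def resolve_spine(opf_text, supported_types):
    """Return, for each spine entry, the href the reader would display."""
    root = ET.fromstring(opf_text)
    items = {item.get("id"): item for item in root.iter(f"{{{OPF_NS}}}item")}
    chosen = []
    for itemref in root.iter(f"{{{OPF_NS}}}itemref"):
        item_id = itemref.get("idref")
        seen = set()  # guards against circular fallback chains
        href = None
        while item_id is not None and item_id not in seen:
            seen.add(item_id)
            item = items.get(item_id)
            if item is None:
                break
            if item.get("media-type") in supported_types:
                href = item.get("href")
                break
            item_id = item.get("fallback")  # try the next alternative
        chosen.append(href)
    return chosen


# A reader that also handles PDF shows the PDF; one that doesn't
# falls back to the XHTML.
with_pdf = resolve_spine(SAMPLE_OPF, CORE_TYPES | {"application/pdf"})
without_pdf = resolve_spine(SAMPLE_OPF, CORE_TYPES)
```

With the sample manifest, a PDF-capable reader would end up with pdf/Preface-1.pdf and any other reader with Preface-1.xhtml, which is the behaviour the spec describes.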

ahi
09-01-2009, 11:09 AM
As I pointed out in this post (http://www.mobileread.com/forums/showthread.php?p=574809), the ePUB spec allows including PDF documents as alternative renderings for chapters.

This is a test for this feature. It is my version of "Gulliver's Travels", with a PDF version of each chapter (at 9×12 cm), and the standard XHTML as a "fallback". According to the spec, a reading system that supports PDF could show the PDF version, while those that don't will display the normal (XHTML) text. I don't know, however, if there is any system that supports this, but it may be worth testing.

Does it work for you, Jellby? I won't be able to check until later today.

And yeah... the splitting of PDF files is unfortunate. Though presumably the necessity to "physically" break up ePub's HTML components will decrease as processing power and memory in eBook reading devices increase.

Edit: Is there a way to give PDF primacy over the XHTML? i.e.: Try to suggest to the viewer to display PDF, and consider the XHTML as the alternate? If so, this sort of multi-layout edition could be manually tweaked by those wishing to view in the alternate format. Not terribly elegant--but a start.

- Ahi

Jellby
09-01-2009, 11:29 AM
Does it work for you, Jellby? I won't be able to check until later today.

I have no ePUB reader :D

My CyBook (Gen3) does not support ePUB (yet, and I'm not sure if I'll install the ePUB or Mobi firmware when it comes). I have ADE installed on another computer with Windows, but I don't feel like turning it on and fighting with the sloppy Windows network or looking for a pendrive...

Edit: Is there a way to give PDF primacy over the XHTML? i.e.: Try to suggest to the viewer to display PDF, and consider the XHTML as the alternate? If so, this sort of multi-layout edition could be manually tweaked by those wishing to view in the alternate format. Not terribly elegant--but a start.

Actually, as it is done now, PDF is the preferred format. The XHTML version is a "last resort" option: it would be used only if the PDF cannot be displayed (because it's not supported). Every file that's not in one of the basic file types (XHTML, PNG, SVG, JPG) must have a fallback file in one of those types defined. Hopefully, reading systems in the future will let the user select which source to use when there are different alternatives available, but I doubt it's the case now, especially for ADE, which doesn't let you change even the margins or font! :angry:
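The fallback rule can also be checked mechanically. This is only a sketch under assumptions: the manifest is given as a plain dictionary (id mapped to media type and fallback id), the function name missing_core_fallback is made up, and the set of basic types is the one named in this post (the OPF specification's own list is longer).

```python
# Basic types as named in this post; the specification's list has more entries.
CORE_TYPES = {
    "application/xhtml+xml",
    "image/png",
    "image/svg+xml",
    "image/jpeg",
}


def missing_core_fallback(manifest):
    """Return ids whose fallback chain never reaches a basic type.

    manifest maps item id -> (media_type, fallback_id or None).
    """
    bad = []
    for item_id in manifest:
        current, seen = item_id, set()
        reaches_core = False
        # Follow the chain until it ends, loops, or hits a basic type.
        while current in manifest and current not in seen:
            seen.add(current)
            media_type, fallback = manifest[current]
            if media_type in CORE_TYPES:
                reaches_core = True
                break
            current = fallback
        if not reaches_core:
            bad.append(item_id)
    return bad


# Hypothetical manifests modelled on the test book.
good_manifest = {
    "Preface-1_pdf": ("application/pdf", "Preface-1"),
    "Preface-1": ("application/xhtml+xml", None),
}
broken_manifest = {
    "Preface-1_pdf": ("application/pdf", None),
}
```

For good_manifest the function returns an empty list; for broken_manifest it flags the PDF item, since nothing a basic reader can display is reachable from it.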

Abecedary
09-01-2009, 02:36 PM
Alright. Here's my brief report on this. The file opens fine in both ADE and Sony eBook Library. However, the PDF pages display at about 1/4 size and in the upper left corner of the window. Loading the file onto my 505 via USB results in an unopenable file. The title/author metadata is being read, but otherwise it thinks it's a 1 page file with no TOC. Attempts to open it throw up a 'Page Error!' message (took about a minute to get to that message the first time I tried opening it).

A cursory look at the guts of the file shows that each PDF has about 30 fonts subsetted, which likely explains the relatively large size of each file. Unfortunately, it doesn't look like any fonts can be unembedded using Acrobat Pro. I was hoping that if the PDF didn't have any fonts embedded in it that it might try to use the reader's available fonts via the res:///Data/ path (similar to Acrobat's Use System Fonts switch). Of course, had this worked it would then bring up the question of how to get users to install the correct fonts on their readers...But whatever, it didn't work.

I'll fiddle around with it a bit more and let you know if I see anything else.

ahi
09-01-2009, 04:46 PM
How about trying to create a single-chapter document with a clean, lean, LaTeX-generated PDF file? Use this (http://www.paxlibrorum.com/res/downloads/taowde_6in_10pt.pdf), if nothing better is available.

Jellby
09-02-2009, 05:10 AM
A cursory look at the guts of the file shows that each PDF has about 30 fonts subsetted, which likely explains the relatively large size of each file. Unfortunately, it doesn't look like any fonts can be unembedded using Acrobat Pro. I was hoping that if the PDF didn't have any fonts embedded in it that it might try to use the reader's available fonts via the res:///Data/ path (similar to Acrobat's Use System Fonts switch). Of course, had this worked it would then bring up the question of how to get users to install the correct fonts on their readers...But whatever, it didn't work.

Yes, I forgot about it... That's due to the microtype LaTeX package. Text lines are slightly expanded or compressed to make the text/whitespace ratio look more uniform, and that's achieved by using expanded or compressed versions of the same font. I could disable this feature, but then the typographic quality wouldn't be as good, and the reason for using PDF wouldn't be so strong ;)

But I don't think that should prevent the Sony reader from opening the PDF; this PDF (http://www.mobileread.com/forums/showthread.php?t=48983) uses the same "trick", and as far as I know opens fine everywhere.

Jellby
09-02-2009, 05:24 AM
Alright. Here's my brief report on this. The file opens fine in both ADE and Sony eBook Library. However, the PDF pages display at about 1/4 size and in the upper left corner of the window.

I have just tried it now in ADE with Wine, and there's not only the size problem, but it automatically trims the margins, as can be seen in pages 29-32. But at least text search and selection seem to work.

Abecedary
09-02-2009, 06:17 AM
I have just tried it now in ADE with Wine, and there's not only the size problem, but it automatically trims the margins, as can be seen in pages 29-32. But at least text search and selection seem to work.

I meant to mention that, too. IIRC, it wasn't doing that in the Sony desktop software.

Jellby
09-02-2009, 12:59 PM
However, the PDF pages display at about 1/4 size and in the upper left corner of the window.

I wonder if that's due to the absolute page size of the PDF. It's generated for a 9×12 cm page; would it be displayed larger if the page were 18×24 cm? :chinscratch:

Abecedary
09-02-2009, 01:45 PM
I wonder if that's due to the absolute page size of the PDF. It's generated for a 9×12 cm page; would it be displayed larger if the page were 18×24 cm? :chinscratch:
No idea, but a similar thought went through my head when I was looking over the PDFs yesterday. Only one way to find out.

ahi
09-02-2009, 01:57 PM
No idea, but a similar thought went through my head when I was looking over the PDFs yesterday. Only one way to find out.

We're going to build a giant wooden horse? :D

I'm saddened that it doesn't seem to work at all on the Sony PRS-505. :(

- Ahi

Jellby
09-02-2009, 03:28 PM
Did anyone try on the Opus?

Abecedary
09-02-2009, 03:33 PM
We're going to build a giant wooden horse? :D
Exactly! And the genius of my plan is that we're going to build it around us.

ahi
09-02-2009, 03:35 PM
Exactly! And the genius of my plan is that we're going to build it around us.

Sir! I have a doubt!

How will this giant wooden horse impact typographic quality?

- Ahi

Jellby
09-02-2009, 03:59 PM
How will this giant wooden horse impact typographic quality?

If everything goes as planned, there will be lots of orphans and widows :D

ahi
09-02-2009, 04:10 PM
If everything goes as planned, there will be lots of orphans and widows :D

It's hard to avoid them with such a small screen... :2thumbsup

HarryT
09-03-2009, 04:46 AM
Did anyone try on the Opus?

Just tried it. The title page, table of contents, etc, are all fine, but when it comes to the actual text, the page is displayed as a tiny block in the upper left corner of the screen. It can, in fact, be read, but not very easily :).

Jellby
09-03-2009, 07:23 AM
I made a quick test and yes, the size at which the PDFs inside an ePUB are displayed by ADE is affected by the page size of the PDF. With an A4-size PDF, the pages are just cropped to the window size, no scaling down, no scrolling... I guess it would be the same on the Opus or other devices. This makes this "trick" currently unusable.

HarryT
09-03-2009, 07:42 AM
Perhaps one might sum it up by concluding that "PDF is not an eBook format", do you think? :)

Interesting experiment, though. Thanks for trying it!

ahi
09-03-2009, 07:48 AM
Perhaps one might sum it up by concluding that "PDF is not an eBook format", do you think? :)

Interesting experiment, though. Thanks for trying it!

I'm sure most people here would--sign of the times and all.

- Ahi

Abecedary
09-03-2009, 07:51 AM
This makes this "trick" currently unusable.
I think there's a possibility this could still work with some minor manipulations to the PDFs. I should have a slow day at work today, so I'll get a chance to play around with it a little more.

Jellby
09-03-2009, 08:02 AM
I think there's a possibility this could still work with some minor manipulations to the PDFs. I should have a slow day at work today, so I'll get a chance to play around with it a little more.

Sure, you can make it work for a particular device (screen size), if that's what you mean, but it would be unusable for other sizes, or when you change orientation. Of course, PDF is designed for a particular screen size, but it should at least be resized for other sizes, not cropped!
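For what it's worth, "resized, not cropped" is only a few lines of arithmetic. The sketch below is not how ADE or any device actually renders; it just assumes a page size in millimetres, a screen size in pixels (600×800 is an assumed example), and a made-up function name fit_page.

```python
def fit_page(page_w_mm, page_h_mm, screen_w_px, screen_h_px):
    """Scale a page uniformly so it fits entirely inside the screen.

    Returns (scale in pixels per mm, (width_px, height_px)).
    The smaller of the two ratios is used so nothing gets cropped.
    """
    scale = min(screen_w_px / page_w_mm, screen_h_px / page_h_mm)
    size = (round(page_w_mm * scale), round(page_h_mm * scale))
    return scale, size


# 9x12 cm page on an assumed 600x800 screen: same aspect ratio, fills it.
small_scale, small_size = fit_page(90, 120, 600, 800)

# A4 page (210x297 mm) on the same screen: limited by the height.
a4_scale, a4_size = fit_page(210, 297, 600, 800)

# Same 9x12 cm page after rotating the screen to landscape.
rot_scale, rot_size = fit_page(90, 120, 800, 600)
```

The same function covers a change of orientation: only the screen dimensions swap, and the page is scaled down further instead of being cut off.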

ahi
09-03-2009, 08:45 AM
I think there's a possibility this could still work with some minor manipulations to the PDFs. I should have a slow day at work today, so I'll get a chance to play around with it a little more.

I guess, unsurprisingly like with HTML, it seems to be all about hoping that various commercial entities deign to support a large enough subset of the specification.

- Ahi

Jellby
09-03-2009, 09:07 AM
I guess, unsurprisingly like with HTML, it seems to be all about hoping that various commercial entities deign to support a large enough subset of the specification.

... and like PDF. Not every PDF viewer supports line-art smoothing, font hinting, transparency, bookmarks and hyperlinks, annotations, JavaScript... ;)

ahi
09-03-2009, 09:36 AM
... and like PDF. Not every PDF viewer supports line-art smoothing, font hinting, transparency, bookmarks and hyperlinks, annotations, JavaScript... ;)

Doubtless that is true. Though I'm yet to encounter a problem as such... are you yourself often plagued by them?

My inconveniences from shoddy and selective HTML implementation are rather more tangible.

- Ahi

Jellby
09-03-2009, 10:17 AM
Doubtless that is true. Though I'm yet to encounter a problem as such... are you yourself often plagued by them?

Not often in "normal" ebooks, that's true, but I use Adobe Reader when possible, which is the implementation that best supports most of those things. Nevertheless, I miss some features like support for Compact PDF (http://multivalent.sourceforge.net/Research/CompactPDF.html) (off-spec, I'm afraid), which could be useful for the case of this thread.

EDIT: I just tried, and with Compact PDF the PDF files went down from 14 MB to 2.6 MB...