View Full Version : One last oeb2mobi test...


llasram
01-13-2009, 11:12 PM
Hi,

I want to give oeb2mobi output ooooone last test on real devices before I start trying to convince Kovid that it's ready to release on the world. If you have the time, interest, and a Mobi-supporting device, please grab the attached files and give them a try. Special notes and items of interest:


Because the 64k image size limit is a Palm-only limitation, all these files have unlimited image file sizes.
I've reduced the resolution of "full-page" images to something that would fit on the Sony Reader -- hopefully the Hanlin v3 based devices have similar margins? But the display size used for rasterization etc is tunable.
Only The Three Musketeers is compressed. Compression takes a reeeeeally long time (like 6 minutes), so I don't see people doing it for books they convert for their own use.
The Mobipocket version of the calibre 'feeds2epub' "The Economist" has an HTML TOC generated from the calibre-generated NCX TOC. It's basic, but I hope sufficiently functional. I've attached the EPUB edition for comparison.
Mobipocket version of Jellby's edition of The Prince and the Pauper mostly to show off my handling of small-caps.


Fingers crossed! :-)

wallcraft
01-14-2009, 12:29 AM
I tested these using the Java MobiPocket Reader on Kindle, EZ Reader and iLiad. All these worked ok, although only the EZ Reader picked up all the extra guide items in The Prince and the Pauper (they all got the TOC, and all the TOCs were navigable).

Using FBReader on the iLiad and on the EZ Reader (OpenInkPot), the only problem was that FBReader does not detect the encoding and assumes it is Latin (Windows-1252). This causes some characters to display incorrectly, but this is a bug in FBReader which is fixed in the latest version.
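A minimal illustration of that symptom, assuming the book text is stored as UTF-8 and the reader decodes it with the Windows-1252 codec. This is plain Python for demonstration only, not FBReader's code:

```python
# Illustration only: UTF-8 bytes decoded with the wrong (Windows-1252) codec.
text = "café"
raw = text.encode("utf-8")      # the accented letter becomes two bytes
wrong = raw.decode("cp1252")    # each of those bytes is shown as its own character
right = raw.decode("utf-8")     # decoding with the correct codec restores the text

print(wrong)
print(right)
```

Plain ASCII text survives the wrong guess, which is why only some characters display incorrectly.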

Using the latest FBReader 0.10.0 under Windows, The Three Musketeers worked and correctly detected the Unicode encoding. The other two crashed FBReader (although they worked on the iLiad and OpenInkPot with earlier versions of FBReader). Perhaps this was because of the lack of compression?

pdurrant
01-14-2009, 04:43 AM
I've tested with my CyBook Gen3 64MB firmware 1.2 build 796

The Prince and the Pauper looks great. I like the cover image size (600x800) which fills the screen. The initial caps/smallcaps at the start of chapters works nicely. (Only at the very smallest display font size do the small caps display larger than they should)

The superscript on the footnote links causes the line they are in to have extra leading. I took a look at your HTML, and I can't see a way around that.

The "Goto" list is a bit odd: First Page, Start Reading, Begin Reading, Notes, Illustrations, Frontispiece, Contents, Title Page, Last Page

The cover image isn't found by Stanza Desktop, but this might be because it's bigger than 63KiB, or it might just be a bug in Stanza.


The Economist is very good. Certainly working OK and readable. The captions sometimes overlay the next line of text - this seems to be because of the tables used for the layout on the website. The first line indent of 3em seems excessive.
The cover image is only 590x750, but I guess that's Calibre.

The Three Musketeers also looks very good. The small caps works. The contents is good. Apart from the cover size (not your doing) I can't fault it.

Excellent work!

Jellby
01-14-2009, 04:52 AM
Mobipocket version of Jellby's edition of The Prince and the Pauper mostly to show off my handling of small-caps.

I'm not at home now and cannot check it, but did you compare it with my mobipocket version? I could give the source HTML of it.

The superscript on the footnote links causes the line they are in to have extra leading. I took a look at your HTML, and I can't see a way around that.

Strange... I've seen that in browsers, but when I use superscripts (just with <sup></sup>) in mobipocket the leading does not seem to be affected in the Cybook.

tompe
01-14-2009, 08:25 AM
The Three Musketeers worked OK on my S60 phone. Jumping to external links works.

I think a title page is missing in the book. Goto first page will jump to the first text page and I think you should be able to see the title and author there.

llasram
01-14-2009, 11:04 AM
I tested these using the Java MobiPocket Reader on Kindle, EZ Reader and iLiad. All these worked ok, although only the EZ Reader picked up all the extra guide items in The Prince and the Pauper (they all got the TOC, and all the TOCs were navigable).

My code only ensures / generates if necessary a TOC entry, but good to know.

Using the latest FBReader 0.10.0 under Windows, The Three Musketeers worked and correctly detected the Unicode encoding. The other two crashed FBReader (although they worked on the iLiad and OpenInkPot with earlier versions of FBReader). Perhaps this was because of the lack of compression?

Weird. I'll see if I can figure out the problem this evening.

The superscript on the footnote links causes the line they are in to have extra leading. I took a look at your HTML, and I can't see a way around that.

If as Jellby says this doesn't usually happen on the Cybook it may be because I'm generating both <sup/> and <font/> tags. As it makes more sense anyway to do just <sup/> (as <font/> within it is ignored) I'll change my code before the merge.

The "Goto" list is a bit odd: First Page, Start Reading, Begin Reading, Notes, Illustrations, Frontispiece, Contents, Title Page, Last Page

The list itself or the order? I'm not imposing an order on the <guide/> entries, but it appears that the Mobipocket Readers aren't either, so it's whatever order Python puts them in. I'll modify them to go in the order they appear in the OPF spec. The "Start Reading" link is kind of weird -- I don't think that one is in the file itself, nor did I see it in Mobipocket Desktop. Where does it link to?
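A rough sketch of imposing a fixed order on the guide entries instead of relying on whatever order Python's containers yield. The GUIDE_ORDER list and the (type, title, href) tuple layout are assumptions made for illustration; the list should be checked against the OPF spec, and this is not the actual oeb2mobi code:

```python
# Hypothetical sketch: sort guide references by a fixed list of types.
# GUIDE_ORDER is an assumption; verify it against the OPF specification.
GUIDE_ORDER = [
    "cover", "title-page", "toc", "index", "glossary", "acknowledgements",
    "bibliography", "colophon", "copyright-page", "dedication", "epigraph",
    "foreword", "loi", "lot", "notes", "preface", "text",
]


def sort_guide(refs):
    """Return refs, a list of (type, title, href) tuples, in GUIDE_ORDER.

    Unknown types (for example "other."-prefixed ones) go last and keep
    their original relative order, because sorted() is stable.
    """
    rank = {type_: i for i, type_ in enumerate(GUIDE_ORDER)}
    return sorted(refs, key=lambda ref: rank.get(ref[0], len(GUIDE_ORDER)))


refs = [
    ("text", "Begin Reading", "main.html"),
    ("other.x", "Extra", "extra.html"),
    ("toc", "Contents", "toc.html"),
    ("cover", "Cover", "cover.html"),
]
print([ref[0] for ref in sort_guide(refs)])
```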

The Economist is very good. Certainly working OK and readable. The captions sometimes overlay the next line of text - this seems to be because of the tables used for the layout on the website. The first line indent of 3em seems excessive.

Yeah... The 3em thing is kind of weird. Calibre generates CSS with a 2em indent (which is still kind of big IMHO), but then the any2epub font-rescaling code seems to leave the markup with a large font-size on all the block-level elements and all the in-block text within <span/>s which reduce the font size. Definitely a fix there to be had, but rest assured that the Mobi conversion is working properly.

The cover image is only 590x750, but I guess that's Calibre.

The Three Musketeers also looks very good. The small caps works. The contents is good. Apart from the cover size (not your doing) I can't fault it.

Actually the cover size is me, I think. I reduced the default screen size used for "full-page" SVG rasterization, which affected all the cover images. It is tunable (through a "renderer profile" option), and as less than 600x800 degrades on the CybookG3 (the only Mobi-native device Calibre has device support for) I'll switch the default back to that. Hanlin users can set their render profile.

I think a title page is missing in the book. Goto first page will jump to the first text page and I think you should be able to see the title and author there.

Ah, that's a "book bug" rather than a converter bug -- my edition doesn't have a separate title page, which I agree wouldn't be a bad idea.

Anyway, thanks for the help, and hopefully this will be in your hands soon! :)

-Marshall

llasram
01-14-2009, 11:18 AM
I'm not at home now and cannot check it, but did you compare it with my mobipocket version? I could give the source HTML of it.

I did, and meant to mention that. They're very close. The only differences I noticed:


oeb2mobi is converting the small-caps to faux small-caps, so those are preserved (formatting-wise) in the Mobi version.
Some of the vertical spacing is a bit different. I meant to check your in-Mobi HTML -- are you using the page-size-relative '%' units? Right now oeb2mobi is translating everything into 'em's based on a particular renderer profile. Fixing that will be a fairly large change to how I do basic CSS value handling, so I'm holding off on it for a later release.
oeb2mobi places OPF //spine/itemref[@linear="no"] items at the end of the book and separated by "uncrossable" boundaries. So some things like the TOC and list of illustrations are removed from the linear reading order of the main text.
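The last point can be sketched roughly as follows. This is a minimal illustration using only the standard library, assuming an OPF 2.0 package document; the function name reading_order and the sample document are hypothetical, and the "uncrossable" boundaries themselves are not modelled:

```python
import xml.etree.ElementTree as ET

OPF_NS = "{http://www.idpf.org/2007/opf}"

# Hypothetical sample package document, trimmed to the spine only.
SAMPLE = """<package xmlns="http://www.idpf.org/2007/opf" version="2.0">
  <spine>
    <itemref idref="toc" linear="no"/>
    <itemref idref="ch1"/>
    <itemref idref="loi" linear="no"/>
    <itemref idref="ch2"/>
  </spine>
</package>"""


def reading_order(opf_xml):
    """Return spine idrefs with linear="no" items moved to the end."""
    root = ET.fromstring(opf_xml)
    spine = root.find(OPF_NS + "spine")
    linear, nonlinear = [], []
    for item in spine.findall(OPF_NS + "itemref"):
        # An itemref is linear unless it explicitly says linear="no".
        target = nonlinear if item.get("linear") == "no" else linear
        target.append(item.get("idref"))
    return linear + nonlinear


print(reading_order(SAMPLE))
```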


Other than that, the books look very much alike. The main difference is under the hood in the markup content itself. Because Mobipocket markup is so limited, I simplified the task of conversion by treating it as a formatting-only language. No markup semantics or block nesting are preserved -- everything is basically converted into a sequence of <p/> tags containing <font/> tags.

Jellby
01-14-2009, 11:43 AM
I did, and meant to mention that. They're very close. The only differences I noticed: ...

Well, I should mention that I did not create the two versions from the same source. Actually, I first created the mobi version with the supported mobi markup, and then converted this to XHTML, adding the appropriate tags and CSS, so there are surely differences in spacing, etc. In the mobi version I think I specify all spacings in ems.

The "Start Reading" link is kind of weird -- I don't think that one is in the file itself, nor did I see it in Mobipocket Desktop. Where does it link to?

In my tests with the Cybook, I think I discovered how it works. The Cybook adds a guide item called "Start Reading" (which is translated when you change the UI language, just like "Last Page", etc.), this item seems to point to the target of the first internal link. Normally, you'd have a TOC at the beginning, and "Start Reading" will point to the first entry in the TOC. But if you want it to point somewhere else, it is possible to add an empty link before the first entry thus: "<A HREF="#start_reading_target"></A>". The Cybook does not render this link, and when pressing "down" it does not cycle through this link, but it makes "Start Reading" point there.

And of course, if you add something called "Start Reading" or "Beginning" or whatever to the guide, it will be duplicated, there will still be the special "Start Reading" link (and the one added by you won't be translated to different languages).
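A small sketch of the empty-link trick, assuming the book is a single HTML string with a <body> tag and that the Cybook behaves as observed above. The helper name add_start_reading_link is hypothetical, and a real converter would likely work on a parsed tree rather than with a regular expression:

```python
import re


def add_start_reading_link(html, target_id):
    """Insert an empty internal link right after the opening <body> tag.

    The link has no content, so nothing is rendered, but it becomes the
    first internal link in the book.
    """
    link = '<a href="#%s"></a>' % target_id
    new_html, count = re.subn(
        r"(<body\b[^>]*>)",
        lambda match: match.group(1) + link,
        html,
        count=1,
        flags=re.IGNORECASE,
    )
    if count == 0:
        raise ValueError("no <body> tag found")
    return new_html


book = '<html><body class="x"><p>Hi</p></body></html>'
print(add_start_reading_link(book, "start_reading_target"))
```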

tompe
01-14-2009, 12:57 PM
Ah, that's a "book bug" rather than a converter bug -- my edition doesn't have a separate title page, which I agree wouldn't be a bad idea.


My idea was that you could always generate a title page from the metadata. It is better to have two "title" pages than none.

llasram
01-14-2009, 07:02 PM
In my tests with the Cybook, I think I discovered how it works. The Cybook adds a guide item called "Start Reading" (which is translated when you change the UI language, just like "Last Page", etc.), this item seems to point to the target of the first internal link. Normally, you'd have a TOC at the beginning, and "Start Reading" will point to the first entry in the TOC. But if you want it to point somewhere else, it is possible to add an empty link before the first entry thus: "<A HREF="#start_reading_target"></A>". The Cybook does not render this link, and when pressing "down" it does not cycle through this link, but it makes "Start Reading" point there.

That is... Weird. Thanks for the info, even though I'm not sure if there's anything useful oeb2mobi can do with that.

llasram
01-14-2009, 07:35 PM
Using the latest FBReader 0.10.0 under Windows, The Three Musketeers worked and correctly detected the Unicode encoding. The other two crashed FBReader (although they worked on the iLiad and OpenInkPot with earlier versions of FBReader). Perhaps this was because of the lack of compression?

They work fine under the Linux version of FBReader 0.10.0, so no ideas, but also probably not a big problem.

Jellby
01-15-2009, 04:10 AM
That is... Weird. Thanks for the info, even though I'm not sure if there's anything useful oeb2mobi can do with that.

It could:

a) Not add an additional "Begin Reading" guide item.

b) Put an empty link to the desired "Start Reading" target at the beginning of the book.

I also put a guide item with type="other.ms-firstpage", but no title. It does not appear in the Cybook's "Go to" menu, but I'm afraid it won't appear in other readers either...

llasram
01-15-2009, 11:08 AM
It could:

a) Not add an additional "Begin Reading" guide item.

b) Put an empty link to the desired "Start Reading" target at the beginning of the book.

I also put a guide item with type="other.ms-firstpage", but no title. It does not appear in the Cybook's "Go to" menu, but I'm afraid it won't appear in other readers either...

The "Begin Reading" //guide/reference is actually the [@type="text"] one you have in your OPF -- "Begin Reading" is just the default title I have for it. But you are right -- I can just rename that title to something less confusing like "Main Text" and create a link which causes the Cybook to pick up "Start Reading" as a duplicate of the //reference[@type="text"].

The "other."-prefixed references without titles I'm actually dropping in the conversion code. I didn't think of just including them without titles -- if Mobipocket Desktop just ignores them then that's probably the most sensible thing to do.
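Roughly, the two behaviours discussed here could look like the sketch below. The DEFAULT_TITLES table, the clean_guide name, the keep_untitled_other switch, and the (type, title, href) tuple layout are all hypothetical, chosen only to illustrate the idea; this is not the actual conversion code:

```python
# Hypothetical default titles for guide references that have none.
DEFAULT_TITLES = {"text": "Main Text"}


def clean_guide(refs, keep_untitled_other=False):
    """Fill in default titles and handle untitled "other." references.

    refs is a list of (type, title, href) tuples. Untitled "other."
    references are dropped unless keep_untitled_other is True, in which
    case they pass through without a title.
    """
    cleaned = []
    for type_, title, href in refs:
        if not title:
            if type_.startswith("other."):
                if not keep_untitled_other:
                    continue
            else:
                title = DEFAULT_TITLES.get(type_, title)
        cleaned.append((type_, title, href))
    return cleaned


refs = [
    ("toc", "Contents", "toc.html"),
    ("text", "", "main.html"),
    ("other.ms-firstpage", "", "main.html"),
]
print(clean_guide(refs))
print(clean_guide(refs, keep_untitled_other=True))
```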

Jellby
01-15-2009, 11:20 AM
The "Begin Reading" //guide/reference is actually the [@type="text"] one you have in your OPF -- "Begin Reading" is just the default title I have for it. But you are right -- I can just rename that title to something less confusing like "Main Text" and create a link which causes the Cybook to pick up "Start Reading" as a duplicate of the //reference[@type="text"].

The "other."-prefixed references without titles I'm actually dropping in the conversion code. I didn't think of just including them without titles -- if Mobipocket Desktop just ignores them then that's probably the most sensible thing to do.

Ah, OK... I forgot there was a "guide" in ePUB too :D (I was talking about the guide in the mobi version). Since I don't have a real ePUB reader (and I don't know if there is any one that supports the guide), there are some things I just added without being quite sure how they are supposed to work.

As for the "text" item in the ePUB guide (and "other.ms-firstpage" in the mobi), I added them without titles in the hope that some reading software would use them in a way similar to the Cybook's "Start Reading", but I didn't really want to have some link with a fixed name there. You can of course do as you like with them, this is just an explanation of why I do that :)