No more 300k limit per flow on latest DE SDK?


Hadrien
07-15-2009, 10:47 AM
I've seen several reports that the new SDK doesn't force a 300k limit per flow anymore. Currently on the homepage, the review of the Cool-er states that the limit doesn't seem to apply anymore; I've seen a tweet saying that new PRS-505 devices didn't have this limit; and when I tested the Cybook Opus, Michael from Bookeen said that, as far as he could remember, such a limit doesn't exist anymore.

Is there any official confirmation about this from Adobe (Peter?) or someone working with their SDK?

What about the older PRS-505 and PRS-700? Will Sony provide an upgrade to disable the limit and enable justified text? It would be a very bad thing to keep the market fragmented: some books (such as Ulysses) need flows that are longer than 300k, and while we should continue to divide e-books into multiple flows as much as possible, it would be a good thing to get rid of this limit.

HarryT
07-15-2009, 10:55 AM
A related question: if you use a tool such as Calibre to convert, say, a MobiPocket book to ePub, will it automatically split the file into 300k "flows"? The reason I ask is that I create all my books as one single HTML file, and it would be a great deal of work to have to manually split them up.

Hadrien
07-15-2009, 10:59 AM
A related question: if you use a tool such as Calibre to convert, say, a MobiPocket book to ePub, will it automatically split the file into 300k "flows"? The reason I ask is that I create all my books as one single HTML file, and it would be a great deal of work to have to manually split them up.

It will, and Calibre will try to use the h1/h2/h3 tags as much as it can to do so.
For Feedbooks, our script is based on XPath expressions to divide a source HTML file into multiple flows (parts, chapters, sections and text flows). They're both quite similar from a user perspective, except that we don't force flows to fit within 300k.
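
For illustration, here is a minimal sketch of that kind of XPath-driven splitting (hypothetical code, not the actual Feedbooks script; the file names and the h1/h2/h3 break expression are assumptions):


# Split a single-file HTML book into flows at heading elements.
# "book.html" and the h1/h2/h3 break expression are assumed, not real.
from lxml import html

tree = html.parse("book.html")
body = tree.getroot().body
breaks = set(body.xpath("./h1 | ./h2 | ./h3"))  # XPath picks the split points

flows, current = [], []
for node in body:  # walk the top-level children in document order
    if node in breaks and current:
        flows.append(current)  # a new heading starts the next flow
        current = []
    current.append(node)
if current:
    flows.append(current)

for i, nodes in enumerate(flows):
    with open("flow_%03d.xhtml" % i, "wb") as out:
        out.write(b"<html><body>")
        for node in nodes:
            out.write(html.tostring(node))  # tostring keeps tail text
        out.write(b"</body></html>")


A real splitter would also copy the head element, fix up internal links, and still cap a flow that remains too large after splitting.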

HarryT
07-15-2009, 11:03 AM
Thank you, Hadrien, that's good news.

If it is indeed true that the 300k flow limit has gone from devices which use the ADE SDK, the Sony devices could very soon find themselves the "orphans" of the ePub world, unable to read books which other ePub devices are able to handle.

Hadrien
07-15-2009, 11:04 AM
If it is indeed true that the 300k flow limit has gone from devices which use the ADE SDK, the Sony devices could very soon find themselves the "orphans" of the ePub world, unable to read books which other ePub devices are able to handle.

And justified text.
That's why Sony and Adobe should absolutely provide an update to avoid this whole mess.

kovidgoyal
07-15-2009, 12:22 PM
What needs to be checked is whether the new SDK, when it encounters a large flow, uses the full HTML+CSS renderer or a simplified renderer for speed. I doubt the Cool-er has a much faster processor than the 700.

mtravellerh
07-15-2009, 12:50 PM
What needs to be checked is whether the new SDK, when it encounters a large flow, uses the full HTML+CSS renderer or a simplified renderer for speed. I doubt the Cool-er has a much faster processor than the 700.

I think it was a memory problem more than a speed problem, honestly. All of the Gutenberg ePub files I have tried so far have worked on the Cool-ER (a bit yucky to go through them, but that's the price of scientific research for you!). No speed issues, either! Please throw some "problematic" ePubs my way and I will test them!

kovidgoyal
07-15-2009, 01:01 PM
According to Peter, it was a speed problem. Since he helped write the software, I'm inclined to take his word for it :)

If it was a memory problem, then there should still be a limit, it will just be larger.

It should be easy to generate a test epub using a simple script that outputs something like


<p>
line 1<br/>
line 2<br/>
...
</p>


Then just convert it to epub using calibre (with splitting turned off: --profile none).
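
Something like this would do it (a hypothetical sketch; the file name and line count are arbitrary, picked to land well past 300k):


# Generate one huge single-flow HTML file for testing the 300k limit.
N_LINES = 50000  # at roughly 14 bytes per line this is about 700k of markup

with open("bigflow.html", "w") as f:
    f.write("<html><body><p>\n")
    for i in range(1, N_LINES + 1):
        f.write("line %d<br/>\n" % i)
    f.write("</p></body></html>\n")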

mtravellerh
07-15-2009, 01:30 PM
The Cool-ER has a Samsung S3C2440 ARM 400 MHz processor and 128 MB of RAM. I think the Sony 700 is about the same (400 MHz Freescale and 128 MB, if I remember right), so that puts them on a par hardware-wise.

mtravellerh
07-15-2009, 01:33 PM
According to Peter, it was a speed problem. Since he helped write the software, I'm inclined to take his word for it :)

Yes, I believe that. But I guess the problem should have been the 505, then! Isn't the ADE engine on the Sonys pretty much identical? (The 505 features a staggering 200 MHz processor and 64 MB of RAM.)

kovidgoyal
07-15-2009, 01:53 PM
The Cool-ER has a Samsung S3C2440 ARM 400 MHz processor and 128 MB of RAM. I think the Sony 700 is about the same (400 MHz Freescale and 128 MB, if I remember right), so that puts them on a par hardware-wise.

That would lead me to believe the SDK is swapping in a simpler renderer for large flows.

Peter Sorotokin
07-15-2009, 02:15 PM
I've seen several reports that the new SDK doesn't force a 300k limit per flow anymore. Currently on the homepage, the review of the Cool-er states that the limit doesn't seem to apply anymore; I've seen a tweet saying that new PRS-505 devices didn't have this limit; and when I tested the Cybook Opus, Michael from Bookeen said that, as far as he could remember, such a limit doesn't exist anymore.

This limit is configured by individual manufacturers. It depends on the CPU speed and memory. Software improvements will also allow us to push it higher - but those have not filtered down into devices yet. Going forward, I expect this limit won't be of much practical importance.

However, breaking content into chapters when the content has logical breaking points is always going to work better - no matter how high the limit is.

Peter

Peter Sorotokin
07-15-2009, 02:21 PM
What needs to be checked is if the new SDK encounters a large flow does it use the full HTML+CSS rendereer or a simplefied renderer for speed. I doubt the Cooler has a much faster processor than the 700.

No, that stuff is not out yet. However, the strategy is going to be not a simpler engine, but insertion of forced page breaks.

kovidgoyal
07-15-2009, 02:43 PM
No, that stuff is not out yet. However, the strategy is going to be not a simpler engine, but insertion of forced page breaks.

Interesting, so rendering happens in page blocks?

Hadrien
07-15-2009, 02:50 PM
Interesting, so rendering happens in page blocks?

I think that what Peter meant is that when they reach a certain limit, they stop parsing the file and treat it as finished (closing the open XML tags), and then start parsing the rest of it like a new flow.
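
As a rough sketch of that idea (illustrative only - this is not ADE code, and the 300k figure and file name are assumptions):


import xml.sax
from xml.sax.saxutils import escape

LIMIT = 300 * 1024  # assumed per-flow byte budget

class Splitter(xml.sax.ContentHandler):
    """Cut a flow into chunks, closing the open tags at each cut and
    reopening them at the start of the next chunk (attributes are
    dropped for brevity)."""
    def __init__(self):
        self.stack = []      # tags currently open
        self.chunks = [[]]   # one output buffer per flow
        self.size = 0

    def _emit(self, text):
        self.chunks[-1].append(text)
        self.size += len(text)

    def startElement(self, name, attrs):
        self.stack.append(name)
        self._emit("<%s>" % name)

    def characters(self, content):
        if self.size > LIMIT:
            # close every open tag, then reopen them in a fresh flow
            self.chunks[-1].extend("</%s>" % t for t in reversed(self.stack))
            self.chunks.append(["<%s>" % t for t in self.stack])
            self.size = 0
        self._emit(escape(content))

    def endElement(self, name):
        self.stack.pop()
        self._emit("</%s>" % name)

handler = Splitter()
xml.sax.parse("chapter.xhtml", handler)  # assumed input file
flows = ["".join(chunk) for chunk in handler.chunks]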

Hadrien
07-15-2009, 02:53 PM
This limit is configured by individual manufacturers. It depends on the CPU speed and memory. Software improvements will also allow us to push it higher - but those have not filtered down into devices yet. Going forward, I expect this limit won't be of much practical importance.

However, breaking content into chapters when the content has logical breaking points is always going to work better - no matter how high the limit is.

Peter

Sure, but the problem is when you don't have logical breaking points (like with Proust).

So basically, what you mean is that the other manufacturers decided to remove the limit or opted for a much higher limit?
What I'd really like to know is whether support for either a higher limit or no limit at all will make it onto the PRS-505/700. Otherwise it'll force content providers to use 300k flows for legacy support.

Peter Sorotokin
07-15-2009, 05:32 PM
Interesting, so rendering happens in page blocks?

The limit is there because we cannot paginate the content starting from the beginning of the chapter fast enough if someone navigates to the end of the chapter (e.g. through the TOC). But if we have some known page breaks in the chapter, we can start from those. If the chapter is too long, we just insert them heuristically (in the future we may start paying attention to page-break CSS properties). In some sense it is the same strategy that converters have to use, but done on the device itself - and the artifacts are the same (artificial page breaks). Still, it's better than dropping CSS altogether, IMHO.

An important consideration here is that XML parsing and the CSS cascade can be done fast enough even for very large chapters (although we have not squeezed everything there by any means). Layout and rendering are the much slower parts. However, rendering only needs to be done for a single page, so it is almost always layout that causes the most problems when navigating to the middle of a chapter. (Sequential reading performance considerations are different.)
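
To make the cost argument concrete, here is a toy model of why known break points bound the layout work (hypothetical numbers, not the actual SDK):


import bisect

def nearest_break(breaks, target):
    """Largest known page-break offset at or before the target."""
    i = bisect.bisect_right(breaks, target)
    return breaks[i - 1] if i else 0

# Heuristic forced breaks every ~300k through a 2 MB chapter.
breaks = list(range(0, 2 * 1024 * 1024, 300 * 1024))

# Jumping to byte 1,500,000: layout restarts at the nearest earlier
# break, so at most ~300k must be laid out instead of ~1.5 MB.
print(nearest_break(breaks, 1500000))  # 1228800


Without any breaks, the only safe starting point is offset 0, which is exactly the slow case described above.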

Peter Sorotokin
07-15-2009, 05:48 PM
Sure, but the problem is when you don't have logical breaking points (like with Proust).

I only keep repeating it so no one gets false impressions.

BTW, do you have those Proust EPUB files somewhere? It seems that everyone obeys the 300k limit now and it is getting hard to find content that does not.

So basically, what you mean is that the other manufacturers decided to remove the limit or opted for a much higher limit?
What I'd really like to know is whether support for either a higher limit or no limit at all will make it onto the PRS-505/700. Otherwise it'll force content providers to use 300k flows for legacy support.

I would check what they did by trying it. For instance, they may have removed it only to make such content unusable in some other way (e.g. the device freezing or crashing).

kovidgoyal
07-15-2009, 05:58 PM
The limit is there because we cannot paginate the content starting from the beginning of the chapter fast enough if someone navigates to the end of the chapter (e.g. through the TOC). But if we have some known page breaks in the chapter, we can start from those. If the chapter is too long, we just insert them heuristically (in the future we may start paying attention to page-break CSS properties). In some sense it is the same strategy that converters have to use, but done on the device itself - and the artifacts are the same (artificial page breaks). Still, it's better than dropping CSS altogether, IMHO.

An important consideration here is that XML parsing and the CSS cascade can be done fast enough even for very large chapters (although we have not squeezed everything there by any means). Layout and rendering are the much slower parts. However, rendering only needs to be done for a single page, so it is almost always layout that causes the most problems when navigating to the middle of a chapter. (Sequential reading performance considerations are different.)

Ah, makes sense, thanks.

Hadrien
07-15-2009, 06:05 PM
I only keep repeating it so no one gets false impressions.

BTW, do you have those Proust EPUB files somewhere? It seems that everyone obeys the 300k limit now and it is getting hard to find content that does not.


Try Ulysses: http://www.feedbooks.com/book/1232.epub

Jellby
07-16-2009, 08:18 AM
The limit is there because we cannot paginate the content starting from the beginning of the chapter fast enough if someone navigates to the end of the chapter (e.g. through the TOC).

But in normal reading, one starts from the beginning and reads page by page until the end. Could that be done without the spurious page breaks? If the book is closed and then opened again to resume reading in the middle of a long "chapter", could it be done so that whatever spurious page breaks are needed appear only in the pages before the current position? The point is to provide a smooth reading experience, even if small inconsistencies are introduced when jumping back and forward.

Peter Sorotokin
07-16-2009, 09:52 PM
Try Ulysses: http://www.feedbooks.com/book/1232.epub

Aha, thanks. This is the type of file that I was looking for. BTW, were extra line breaks inserted in the last "paragraph"?

Peter Sorotokin
07-16-2009, 09:58 PM
But in normal reading, one starts from the beginning and reads page by page until the end. Could that be done without the spurious page breaks? If the book is closed and then opened again to resume reading in the middle of a long "chapter", could it be done so that whatever spurious page breaks are needed appear only in the pages before the current position? The point is to provide a smooth reading experience, even if small inconsistencies are introduced when jumping back and forward.

Certainly it is possible to do it this way, but the logistics are getting just too complex for such an obscure feature. We need to break the 300k limit, but there is no reason to do it in a fancy way. I'd rather do things like text/layout quality and hyphenation first.

Peter

Valloric
07-18-2009, 09:44 PM
I'd rather do things like text/layout quality and hyphenation first.

By all means, give those a higher priority.