#31 | |
Fanatic
Posts: 554
Karma: 2928497
Join Date: Mar 2008
Device: Clara 2E & Sage
|
Quote:
From reading your comments (including the ones later in the thread), it certainly looks like you know what you're talking about. Perhaps you should work at Adobe. |
|
#32 | |
sleepless reader
Posts: 4,763
Karma: 615547
Join Date: Jan 2008
Location: Germany, near Stuttgart
Device: Sony PRS-505, PB 360° & 302, nook wi-fi, Kindle 3
|
Quote:
Working for Adobe would be a no-go. kovid and Harry could then blame me for all the little issues with Adobe Digital Editions. On the other hand, Adobe's software architects and coders are usually really good. I don't know why ADE was implemented so poorly. |
|
#33 |
Zealot
Posts: 114
Karma: 325
Join Date: May 2009
Device: Cool-ER
|
As has been pointed out, the problem with the initial article is that it confuses issues with the respective formats with issues with the readers for those formats.
Netseeker is completely right that you don't need the entire stream in memory to render something like an epub file. In fact, you can go further than building the parse tree in memory: you can store it in an index file that lives alongside the document. The document need only be indexed once, when it is first opened, and that could happen in the background while you read the first few pages. If you have a limited number of font choices (which is true for most e-readers), you could index every page in the document, complete with the relevant style hints, so that jumping to an arbitrary point would always happen instantaneously.
The penalty for such behaviour is a more complex parser and some storage overhead (hardly an issue when 2 GB flash cards cost only a few dollars). Processor overhead really shouldn't be an issue, even on the oldest devices, but it does require some understanding of real-time systems to implement. Where files are transferred to the e-reader through a library application on the user's PC, the index files could even be generated at the same time, leaving the e-reader to do the bare minimum of work to display any arbitrary page.
The issue here is that epub in particular (and anything XML-y in general) lends itself to 'lazy' implementations. On modern PCs there is very little penalty for just hacking at a file, so the workarounds for dealing with large datasets just aren't common knowledge. I've worked for clients who have managed to produce 500MB data files and only then wondered why it can take a while to process them.
In general, a format like epub lends itself to transformation, so it could be regarded as a 'transfer' format, which might be translated to a device-specific variant that enables efficient rendering, storage and retrieval.
Whilst there are pathological cases that can make parsing more complex, these can usually be transformed to simpler parse trees - and publishers should recognise that over-complex formatting benefits no-one. |
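To make the indexing idea above concrete, here is a minimal sketch in Python. It assumes a deliberately naive layout model (fixed-width characters, greedy word wrap, plain text instead of XHTML), and the names `build_page_index`, `page_text` and `layout_key` are hypothetical, not taken from any real reader firmware. The point is only that one pass over the text yields a list of page-start offsets that can be saved next to the book and reused for instant page jumps.

```python
import json
import re


def layout_key(chars_per_line: int, lines_per_page: int) -> str:
    """Hypothetical key identifying one font/margin combination."""
    return f"{chars_per_line}x{lines_per_page}"


def build_page_index(text: str, chars_per_line: int, lines_per_page: int) -> list:
    """Return the character offset at which each page starts.

    Assumption: fixed-width glyphs and greedy word wrap. A real renderer
    would measure glyphs and honour styles, but the index has the same shape.
    """
    offsets = [0]
    line_len = 0    # characters already placed on the current line
    lines_done = 0  # completed lines on the current page
    for match in re.finditer(r"\S+", text):
        word_len = len(match.group())
        needed = word_len if line_len == 0 else line_len + 1 + word_len
        if needed > chars_per_line and line_len > 0:
            # The word does not fit, so it starts a new line.
            lines_done += 1
            line_len = word_len
            if lines_done == lines_per_page:
                # The page is full, so this word starts the next page.
                offsets.append(match.start())
                lines_done = 0
        else:
            line_len = needed
    return offsets


def page_text(text: str, offsets: list, page: int) -> str:
    """Fetch one page using only the index, with no re-layout."""
    start = offsets[page]
    end = offsets[page + 1] if page + 1 < len(offsets) else len(text)
    return text[start:end].strip()


if __name__ == "__main__":
    book = "aaa bbb ccc ddd eee fff"
    index = {layout_key(7, 1): build_page_index(book, 7, 1)}
    # The sidecar index is plain JSON that could be stored beside the book.
    print(json.dumps(index))
    print(page_text(book, index[layout_key(7, 1)], 1))
```

In this sketch the sidecar holds one offset list per layout, so a desktop library application could generate it during transfer and the device would only need to seek.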
#34 |
Fanatic
Posts: 554
Karma: 2928497
Join Date: Mar 2008
Device: Clara 2E & Sage
|
I don't know the internal workings, but MS Reader appears to do something along these lines. Notice when you open a large LIT ebook that you can start reading right away, but the page counter at the bottom is still churning away.
|
#35 |
creator of calibre
Posts: 45,377
Karma: 27230406
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
You certainly need to store the entire XML tag structure in memory to support CSS 2.1. You don't need to store the text nodes, and that will reduce memory consumption for files that have a low tag/text ratio. However, you still have to read and parse the entire tag tree, whether you do it in a streamed fashion or not. The reason you have to read and parse the entire tree is to support CSS selectors. I suggest you read the following to understand just why it is necessary: http://www.w3.org/TR/CSS2/selector.h...dant-selectors
It's certainly true that you don't have to store all the text content in memory, and that means the size limit of 300KB can probably be increased. But frankly, the increased programming complexity and consequent rendering fragility are not worth it. I think 300KB is a perfectly reasonable limit. EPUB creators simply have to keep it in mind. As for pre-parsing and storing rendered versions of the file, I think that is an extremely inelegant solution, and it imposes an absolute restriction on allowing display modification by the user. Say goodbye to free font resizing, line spacing and margin adjustments. |
#36 |
Zealot
Posts: 114
Karma: 325
Join Date: May 2009
Device: Cool-ER
|
I'm fully aware of CSS selectors, thank you :-) I've been involved with web browsers, at the software level, for just shy of 15 years now. There's no reason why you can't stream through an XML document and flatten the selector space as you index it, and as the range of styles in a document is rarely that extensive, the penalty for doing so is minimal.
Rendering fragility is a poor excuse. Navigating the parse tree is a task that can be modularised and can act as both a driver and restorer of index and style information. It's really not that difficult once you've got the framework in place. Any codebase that assumes or imposes an arbitrary size restriction on parsing a dataset that is itself unrestricted and shared with unknown third-party software is bound to fail, however generous those restrictions may appear to be.
Inelegant solutions are necessary where you are trying to provide the ideal user experience. Agreed that completely free font sizing, line spacing and margin adjustments present an insurmountable task. However, it's worth noticing that most devices don't offer unlimited options, and most users will only ever select a small subset of those. Indexing (not storing rendered versions - ugh!) need only be performed on the most recent and most preferred choices. It's only when the user is determined to cycle through every option whilst reading the last page in a document that the interface need degrade to worst-case re-parsing. Notice that even then, the experience is no worse than current 'all in memory' solutions. With indexing for rendering being a multi-level process, even that worst case can be hurried along by removing the need for dealing with overly complex parse trees. |
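As a rough illustration of streaming through the document while flattening styles, here is a Python sketch built on the standard `xml.sax` module. It only handles type selectors joined by the descendant combinator, and it treats every property as inherited, which real CSS does not; `StyleFlattener`, `selector_matches` and `flatten` are hypothetical names for this example. The only state kept while parsing is the stack of currently open elements, so memory grows with nesting depth rather than document size.

```python
import xml.sax


def selector_matches(selector, open_tags):
    """True if a descendant selector such as ("div", "p") matches the
    innermost open element, given the stack of open tag names."""
    if not open_tags or selector[-1] != open_tags[-1]:
        return False
    # The remaining selector parts must appear, in order, among the ancestors.
    ancestors = iter(open_tags[:-1])
    return all(name in ancestors for name in selector[:-1])


class StyleFlattener(xml.sax.ContentHandler):
    """Resolves a flat style for each text run during a single streamed pass."""

    def __init__(self, rules):
        super().__init__()
        self.rules = rules    # list of (selector tuple, declarations dict)
        self.open_tags = []   # names of the currently open elements
        self.styles = [{}]    # resolved style for each open element
        self.runs = []        # (text, style) pairs; a reader would index these

    def startElement(self, name, attrs):
        self.open_tags.append(name)
        style = dict(self.styles[-1])  # simplification: inherit everything
        for selector, declarations in self.rules:
            if selector_matches(selector, self.open_tags):
                style.update(declarations)
        self.styles.append(style)

    def endElement(self, name):
        self.open_tags.pop()
        self.styles.pop()

    def characters(self, content):
        if content.strip():
            self.runs.append((content.strip(), self.styles[-1]))


def flatten(xhtml: bytes, rules):
    handler = StyleFlattener(rules)
    xml.sax.parseString(xhtml, handler)
    return handler.runs


if __name__ == "__main__":
    rules = [(("div", "p"), {"color": "red"}),
             (("em",), {"font-style": "italic"})]
    doc = b"<body><div><p>one <em>two</em></p></div><p>three</p></body>"
    for text, style in flatten(doc, rules):
        print(text, style)
```

This says nothing about specificity, the cascade, or the other CSS 2.1 selector types; it only shows that descendant matching needs the ancestor chain, not the whole tree.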
#37 |
frumious Bandersnatch
Posts: 7,550
Karma: 19500001
Join Date: Jan 2008
Location: Spaniard in Sweden
Device: Cybook Orizon, Kobo Aura
|
True. But it is also true that it's not sensible to expect unlimited size support in all applications. The ePUB spec should probably have stated some minima for file sizes (and maybe nesting levels or number of styles) that rendering software must support, so that a file can be guaranteed to work in conformant readers. Readers with more resources available could support larger limits, but there would be some minimum we could rely on.
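A check against such guaranteed minima could look something like the following Python sketch. The 300KB figure is the limit discussed earlier in this thread; the nesting depth of 32 is a purely hypothetical number chosen for illustration, as are the names `GUARANTEED_MINIMA`, `DepthCounter` and `within_minima`. Nothing here comes from the ePUB spec itself.

```python
import xml.sax

# Assumed limits: 300KB is the figure from this thread, depth 32 is made up.
GUARANTEED_MINIMA = {"bytes": 300 * 1024, "depth": 32}


class DepthCounter(xml.sax.ContentHandler):
    """Records the deepest element nesting seen in a streamed parse."""

    def __init__(self):
        super().__init__()
        self.depth = 0
        self.max_depth = 0

    def startElement(self, name, attrs):
        self.depth += 1
        self.max_depth = max(self.max_depth, self.depth)

    def endElement(self, name):
        self.depth -= 1


def within_minima(xhtml: bytes, minima=GUARANTEED_MINIMA) -> bool:
    """True if every conformant reader would be required to handle this file."""
    counter = DepthCounter()
    xml.sax.parseString(xhtml, counter)
    return (len(xhtml) <= minima["bytes"]
            and counter.max_depth <= minima["depth"])


if __name__ == "__main__":
    print(within_minima(b"<a><b><c/></b></a>"))
```

An EPUB creation tool could run a check like this on each content file and warn the author before the book ever reaches a device.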
|
#38 | |||
sleepless reader
Posts: 4,763
Karma: 615547
Join Date: Jan 2008
Location: Germany, near Stuttgart
Device: Sony PRS-505, PB 360° & 302, nook wi-fi, Kindle 3
|
Quote:
Quote:
Quote:
|
|||
#39 | ||
Zealot
Posts: 114
Karma: 325
Join Date: May 2009
Device: Cool-ER
|
Quote:
Quote:
Certainly epub offers the developer many choices as to how they render the document. Nothing in the spec requires that any given section of the document has to be stored in memory in one go in order to be rendered. The only 'reasonable' restriction might be to say that a device with X megabytes of storage space should be able to render the largest single book that can fit on its storage. How they go about doing that is up to the firmware developer. Again though, it's really no business of the standards body.
Last edited by Tuna; 06-21-2009 at 11:57 AM. |
||
#40 | |
Zealot
Posts: 114
Karma: 325
Join Date: May 2009
Device: Cool-ER
|
Quote:
Page indexing can be done separately and needn't be expensive. Consider that you're talking about a few hundred indices for most books, so even if you allow for (say) a couple of dozen of the most likely combinations of font sizes, line spacing and margin settings, your document index need not be more than a few kilobytes in size. That's hardly a high price in return for instant page turns and accurate next/previous page behaviour. |
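For a feel of the numbers, here is the arithmetic as a small Python sketch. The figures are assumptions picked to match the wording above: 300 pages for "a few hundred", 24 layouts for "a couple of dozen", and one 4-byte offset per page. `index_size_bytes` is a hypothetical helper.

```python
import struct

PAGES_PER_BOOK = 300       # assumed value for "a few hundred" indices
LAYOUT_COMBINATIONS = 24   # assumed value for "a couple of dozen" settings
OFFSET_FORMAT = "<I"       # assumed: one 4-byte unsigned offset per page


def index_size_bytes(pages: int, layouts: int, fmt: str = OFFSET_FORMAT) -> int:
    """Size of an uncompressed page index holding one offset per page per layout."""
    return pages * layouts * struct.calcsize(fmt)


if __name__ == "__main__":
    print(index_size_bytes(PAGES_PER_BOOK, 1))
    print(index_size_bytes(PAGES_PER_BOOK, LAYOUT_COMBINATIONS))
```

Under these assumptions a single layout costs about 1.2 KB and all two dozen together about 28 KB before any compression, which is still tiny next to the book itself.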
|
#41 | ||
creator of calibre
Posts: 45,377
Karma: 27230406
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
Quote:
Quote:
Umm, imposing a size restriction is an inelegant solution, but one that works for end users, EPUB creators (provided they just keep it in mind) and the creators of EPUB document viewers. Instead you propose a system that offers a very slight benefit to EPUB document creators and imposes a very large overhead on the creators of EPUB rendering software, for no benefit to EPUB end users. If people had to guarantee that EPUB renderers could render any size/complexity of XHTML on any device, the format would never have gotten off the ground. |
||
#42 | |
creator of calibre
Posts: 45,377
Karma: 27230406
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
Quote:
|
|
#43 | |
creator of calibre
Posts: 45,377
Karma: 27230406
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
Quote:
|
|
#44 |
Guru
Posts: 610
Karma: 4150
Join Date: Mar 2008
Device: Sony Reader PRS-T3, Kobo Libra H2O
|
|
#45 |
creator of calibre
Posts: 45,377
Karma: 27230406
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
Shouldn't just the parent tree be enough for first child and +? Where parent tree actually means not just parents but all siblings that occur before a given element in document order as well. Perhaps pre-tree would be a better term.
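Here is a small Python sketch of what matching `:first-child` and `+` needs during a streamed parse, under the assumption that only element siblings count (text nodes are ignored, as in CSS 2.1). `SiblingTracker`, `sibling_facts` and `matches_adjacent` are hypothetical names. In this sketch each open element only remembers how many element children it has seen and the name of the most recent one, which is a small slice of the "pre-tree".

```python
import xml.sax


class SiblingTracker(xml.sax.ContentHandler):
    """Collects, for each element, whether it is a first child and which
    element immediately precedes it among its siblings."""

    def __init__(self):
        super().__init__()
        # One frame per open element: [child_count, last_child_name]
        self.frames = []
        self.seen = []  # (name, is_first_child, previous_sibling)

    def startElement(self, name, attrs):
        if self.frames:
            parent = self.frames[-1]
            is_first_child = parent[0] == 0
            previous_sibling = parent[1]
            parent[0] += 1
            parent[1] = name
        else:
            # The root element has no parent, so it is not a first child.
            is_first_child, previous_sibling = False, None
        self.seen.append((name, is_first_child, previous_sibling))
        self.frames.append([0, None])

    def endElement(self, name):
        self.frames.pop()


def sibling_facts(xhtml: bytes):
    handler = SiblingTracker()
    xml.sax.parseString(xhtml, handler)
    return handler.seen


def matches_adjacent(fact, before: str, after: str) -> bool:
    """True if the element matches the selector `before + after`."""
    name, _, previous_sibling = fact
    return name == after and previous_sibling == before


if __name__ == "__main__":
    doc = b"<body><h1>T</h1><p>a</p><p>b</p></body>"
    for fact in sibling_facts(doc):
        print(fact, matches_adjacent(fact, "h1", "p"))
```

Both facts are known at the moment the start tag arrives, so the style decision never has to wait for content that comes later in the document.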
|