02-18-2010, 10:37 AM   #73
cartesius
Enthusiast
Posts: 30
Karma: 80
Join Date: Jan 2009
Device: Iliad
Quote:
Originally Posted by Xenophon
Reflow fails for PDFs (even fiction!) where the first letter of each chapter is a fancy drop cap. You get all of the text of the chapter except the first letter, followed by the drop-cap at the end.

Reflow as I have seen it on my Sony Reader, at least, insists on breaking at each page-end in the pdf. So what I get is 1.3 pages, .7 of a blank page, etc. This wastes 1/3 of my battery while I'm reading. And I read a lot, so this means recharging days sooner than if it didn't do this.

Reflow fails utterly for any non-fiction that involves tables, maps, images, etc. Think history, economics and public policy, popular science books, cookbooks(!!) and many other examples. All non-professional reading, too.

Please, at a bare minimum, give us at least automatic zoom-away-the-white-space-margins.

Xenophon

You're unhappy with your reader, but before you smash it against the wall... read this.

Yes, reflowing a PDF page fails in many cases. But you must remember that the PDF format was never designed for that; it is almost as difficult a task as reflowing the text on a piece of paper (minus the character recognition). One has to identify each character's proper place, then figure out which word it belongs to, and only THEN proceed to reflowing.
Can we build a system that will perform great (drawings, tables, columns, reflow around a picture)? Yes! We'd employ self-learning, adaptive software. Will anything short of a 3 GHz system with 4 GB of RAM perform the operation in a reasonable amount of time? No...
Software that runs on a less powerful CPU identifies text placement based on guesses made by the software designers (we like to call those heuristics); the better the guess, the better the result, but it is always guessing, and it will fail in some particular cases.
Your e-reader's PDF reflow is based on the Adobe kit, which I'd expect to behave much like Adobe Reader.
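
To make the "guessing" concrete, here is a minimal sketch of the steps described above: take characters with known positions, guess the word breaks from the gaps between them, then re-wrap for a narrower screen. Everything in it is an assumption for illustration (the `Glyph` record, the `gap_ratio` threshold of 0.3, the single-line input); it is not how the Adobe kit or any particular reader does it.

```python
from dataclasses import dataclass
import textwrap


@dataclass
class Glyph:
    char: str
    x: float      # left edge of the character on the page
    width: float  # advance width of the character


def group_into_words(glyphs, gap_ratio=0.3):
    """Split one line of glyphs (already sorted left to right) into words.

    Heuristic: a horizontal gap wider than gap_ratio times the average
    glyph width is guessed to be a word break. No space characters are
    required in the input.
    """
    if not glyphs:
        return []
    avg_width = sum(g.width for g in glyphs) / len(glyphs)
    words = [glyphs[0].char]
    for prev, cur in zip(glyphs, glyphs[1:]):
        gap = cur.x - (prev.x + prev.width)
        if gap > gap_ratio * avg_width:
            words.append(cur.char)   # gap is large: start a new word
        else:
            words[-1] += cur.char    # gap is small: same word
    return words


def reflow(words, columns):
    """Re-wrap the recovered words to a screen that is `columns` characters wide."""
    return textwrap.wrap(" ".join(words), width=columns)
```

The threshold is exactly the kind of designer's guess that works for ordinary body text and breaks on tightly kerned or widely letter-spaced text.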

Your drop-cap letter: I doubt it goes to the end of the chapter; most likely it lands at the end of the page. The same will happen if the first word of the page has a larger font than the rest, or if there's a drop cap somewhere else. Why? Because the order of the characters in the file is not necessarily the order you see on the display: plenty of PDF generators will place that drop-cap letter at the end of the page (why? because this way you get to change the font less often, and changing fonts is a very time-consuming operation). To circumvent this, one must first extract the text, then sort it according to its original position on the page. Why Adobe hasn't done this properly is beyond me, since it's easy enough (pViewer does it). Relying on character order within the file to determine position is an enormously bad idea. (They also fail when there is no space character between words; again, betting that a space is there is a bad idea.)
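
The "extract, then sort by position" step can be sketched roughly as follows. This is an illustration only: the `Glyph` record, the `line_tolerance` value, and the assumption that the top of the drop cap lines up with the top of the first text line are all made up for the example, and real pages (columns, rotated text, superscripts) need more care.

```python
from dataclasses import dataclass


@dataclass
class Glyph:
    char: str
    x: float    # left edge of the character
    top: float  # distance from the top of the page to the character's top edge


def reading_order(glyphs, line_tolerance=2.0):
    """Return glyphs ordered top to bottom, then left to right.

    Glyphs whose top edge lies within line_tolerance of the first glyph
    of the current line are treated as part of that line.
    """
    lines = []
    for g in sorted(glyphs, key=lambda g: g.top):
        if lines and abs(g.top - lines[-1][0].top) <= line_tolerance:
            lines[-1].append(g)
        else:
            lines.append([g])
    ordered = []
    for line in lines:
        ordered.extend(sorted(line, key=lambda g: g.x))
    return ordered


# File order as a generator might emit it: body text first, drop cap last.
file_order = [
    Glyph("n", 30, 100), Glyph("c", 36, 100), Glyph("e", 42, 100),
    Glyph("u", 30, 112), Glyph("p", 36, 112), Glyph("o", 42, 112), Glyph("n", 48, 112),
    Glyph("O", 10, 99),  # the drop cap, stored at the end of the page
]
print("".join(g.char for g in file_order))                 # nceuponO
print("".join(g.char for g in reading_order(file_order)))  # Onceupon
```

Reading the glyphs in file order reproduces the "drop cap at the end" symptom; sorting by position first puts the letter back in front of its word.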

The page break issue: Adobe probably didn't care much about devices with a fixed viewport size, such as e-book readers. Their reader simply uses the whole window to display as much as it can, and since there is no need to 'flip' the page, one doesn't notice that it's page-based. Should you have to care about this? No. You have every right to ask for better. But please have some patience; this feature is still in its infancy. There are also some usability challenges posed by this need to avoid 'page waste', but that's a different story.
Also: I don't believe you waste 1/3 of the battery life. Think of it this way: without reflow you couldn't have read that page on the reader anyway.
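
To put rough numbers on the 'page waste', here is a small sketch that counts screens (page turns) under the two policies: breaking at every source page end versus letting the text run on across page ends. The function names and the figures (26 reflowed lines per source page, 20 lines per screen) are hypothetical, picked to match the "1.3 pages" in the quoted post.

```python
import math


def screens_with_page_breaks(lines_per_source_page, lines_per_screen):
    """Each source page starts on a fresh screen, so its last screen may be partly blank."""
    return sum(math.ceil(n / lines_per_screen) for n in lines_per_source_page)


def screens_continuous(lines_per_source_page, lines_per_screen):
    """Text flows across source page ends; only the very last screen may be partly blank."""
    return math.ceil(sum(lines_per_source_page) / lines_per_screen)


# Ten source pages, each reflowing to 26 lines, on a screen that holds 20 lines.
pages = [26] * 10
print(screens_with_page_breaks(pages, 20))  # 20
print(screens_continuous(pages, 20))        # 13
```

Under these assumed numbers, page-bound reflow needs 20 screens where continuous reflow needs 13, which is the gap being complained about.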

To my knowledge pictures are reflow-friendly: they're simply zoomed. Tables are tough. Take this as homework: you have a table that takes up the whole width of the original page; please describe how you would like to see it reflowed.
Vector graphics are even tougher: shape recognition would be needed, and we're back to the 3 GHz, 4 GB RAM system.

Now that you know more, I hope your reader will survive.

That was my 2 cents,
c.