09-08-2009, 12:25 PM | #556
Wizard
Posts: 1,213
Karma: 12890
Join Date: Feb 2009
Location: Amherst, Massachusetts, USA
Device: Sony PRS-505
Once you have that, you have a program that can "find" all the possible patterns for a paragraph. The next, bigger, hurdle is choosing between these patterns. If you mean something that can find the "ideal" hyphenation patterns for an entire paragraph, I think LaTeX does that reasonably well when it has a decent line length to work with. As good as a human? I know of no reason to think any but the very best human typographers can do better. Getting something that will do as well as the very best typographers may be a long way off, but I haven't really heard--at least for English--any argument to the effect that it isn't practically possible, even within a few years.
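(As an aside, TeX's paragraph-at-a-time line breaker can also be nudged for the narrow measures typical of reader screens; a minimal sketch, where the specific values are illustrative assumptions rather than recommended settings:)

```latex
% Loosening TeX's line breaker for a short measure; values illustrative only.
\documentclass{article}
\begin{document}
\hyphenpenalty=50     % more willing to hyphenate
\tolerance=2000       % accept looser interword spacing before giving up
\emergencystretch=1em % last-resort stretch that avoids overfull boxes
Some paragraph text long enough to wrap across several lines\ldots
\end{document}
```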
Last edited by frabjous; 09-08-2009 at 12:33 PM.
09-08-2009, 03:54 PM | #557
Wizard
Posts: 1,790
Karma: 507333
Join Date: May 2009
Device: none
1. Those that LaTeX knows how to hyphenate correctly.
2. (ERROR) Those that LaTeX thinks it knows how to hyphenate but--the word being an exception to whatever hyphenation pattern matches it--in fact hyphenates incorrectly.
3. Those that LaTeX has no hyphenation patterns for, and rightly so, because the word should not be hyphenated.
4. (ERROR) Those that LaTeX has no hyphenation patterns for, but which should be hyphenated.

The "traditional" approach is to not worry about all this nonsense, and instead to proofread the book to catch #2's and manually fix badboxes to catch #4's. This is particularly sensible, since LaTeX has no way whatsoever to autodetect #2-type hyphenation errors, and no guaranteed-correct way of separating #4's from #3's. Not to mention that the number of words needing to be fixed is likely to be smaller than an exhaustive list of hyphenation errors and unhyphenatable words.

Basically, I have yet to be convinced that any alternative way of handling hyphenation in LaTeX beats the traditional way without needlessly compromising quality or actually increasing the necessary manual work.
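(The "traditional" fixes described above amount to a few lines of LaTeX; the \hyphenation entries below are hypothetical examples, not a real exception list:)

```latex
% Sketch of the traditional approach: a global exception list for words
% the patterns get wrong (cases #2 and #4), plus a one-off fix in place.
\documentclass{article}
% Hypothetical entries; the hyphens mark the permitted break points.
\hyphenation{data-base ma-nu-script}
\begin{document}
A single occurrence can also be patched with a discretionary hyphen:
anti\-dis\-establishment\-arianism.
\end{document}
```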
As for the unsolved problem... it's unsolved, but not a problem. People who are fine with reflow formats do not complain about poor hyphenation.
Such a renderer, unless you only care about English-language books, would have to be several orders of magnitude more complex than the most sophisticated typesetting systems that exist today.

I can, with remarkable ease, use LaTeX to create typographically correct documents in English, French, Hungarian, et cetera. Hanzi documents, set either horizontally or vertically, that respect the rules of Chinese typography. Documents with Thai, Georgian, Korean, or Ethiopian text that respect those languages' typographic and hyphenation rules (or lack thereof). Etruscan and Old Hungarian runic texts running left-to-right, right-to-left, or even in boustrophedon. Documents that contain a mixture of Greek, Hebrew, Arabic, and Syriac text. Or even Klingon, Tengwar, or Shavian.

The fact that I can do all these things, and infinitely more, is what makes the LaTeX/PDF/fixed-layout option basically (given some small improvements in resolution and contrast) as good as paper... and anything that cannot at least offer the same all-but-limitless "functionality" of paper (independent of whatever else it may be capable of) is not a viable replacement for paper (or paper books). Unless of course one holds that in eBooks function should follow form, instead of the other way around.

- Ahi

Last edited by ahi; 09-08-2009 at 03:58 PM.
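(For readers wanting to try the multilingual setup described above, a minimal XeLaTeX sketch using the polyglossia package; the Greek font name is an assumption--substitute any Greek-capable font installed locally:)

```latex
% Each language gets its own hyphenation patterns and typographic
% conventions; compile with XeLaTeX.
\documentclass{article}
\usepackage{polyglossia}
\setmainlanguage{english}
\setotherlanguage{french}
\setotherlanguage{greek}
% Assumed font; any font with Greek coverage will do.
\newfontfamily\greekfont[Script=Greek]{GFS Didot}
\begin{document}
English text, then \textfrench{un peu de français}, then
\textgreek{λίγα ελληνικά}, each broken by its own language's rules.
\end{document}
```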
09-08-2009, 04:40 PM | #558
Wizard
Posts: 1,213
Karma: 12890
Join Date: Feb 2009
Location: Amherst, Massachusetts, USA
Device: Sony PRS-505
I don't know if a tool presently exists to parse a LaTeX document and return a list of words that it doesn't know how to hyphenate, but if it doesn't, I cannot imagine that such a tool would be at all difficult to create, even if it meant digging into (La)TeX's source code a little bit. My thought is that at book creation this would be run once to generate a list, and the person writing the TeX code would then use a \hyphenation command to deal with all of them.

But you raise a good point as to how easy it would be to get that algorithm to distinguish between your cases #3 and #4. I'll admit I don't know enough about LaTeX's hyphenation algorithm to know how easy this would be, but even if it does pattern matching rather than word matching (--actually my own experience makes me think that LaTeX stores its hyphenation rules at the word level rather than the pattern level, but I'm not sure--) I don't think it would be that hard. Most unhyphenatable words would be common one-syllable words, and a list of such words to check against does not seem like it would be difficult to generate. (And if a few got through during this process it wouldn't be a problem... the book creator would just specify that they can't be hyphenated...)

(Again, I'm restricting my comments to English and similar languages. The market for English is big enough to make this worthwhile...)

And LaTeX is not the only software out there that does hyphenation... there's also Scribus, InDesign (though I think their algorithm is based on TeX's), etc. Surely, this is not such an unreachable goal.

I'd be very surprised if more than a few paragraphs per book on average are "hand-hyphenated" now, even with good presses, to be honest, though I don't have any first-hand knowledge of such things.
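(For what it's worth, LaTeX already ships a primitive that could serve as the core of such an audit tool: \showhyphens prints the break points it would use into the .log file. A minimal sketch:)

```latex
% \showhyphens writes each word's hyphenation points to the .log file.
% A word that comes back with no hyphens is either genuinely
% unhyphenatable (case #3) or unknown to the patterns (case #4);
% as discussed above, telling those apart still takes a human
% (or a list of one-syllable words to check against).
\documentclass{article}
\begin{document}
\showhyphens{manuscript typography strengths}
\end{document}
```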
Geez, now you have me wondering whether Tengwar and Klingon, etc., have hyphenation rules... Anyway, I admit that there are some assumptions I'm making here that may be wrong. I just haven't seen what I would consider compelling evidence against the possibility of such things.

Last edited by frabjous; 09-08-2009 at 04:46 PM.
09-08-2009, 08:30 PM | #559
Wizard
Posts: 1,790
Karma: 507333
Join Date: May 2009
Device: none
And, personally, eBook-reading software or a format whose primary (and perhaps sole realistic) aim is to support the English/Western world for the foreseeable future is... well, of absolutely no interest/worth/value to me... or to the majority of humanity (most of whom are not eBook-device customers today, and, if their needs are rendered basically impossible to meet within the established industry standards, almost certainly never will be).

That is my view.

- Ahi
09-08-2009, 08:34 PM | #560
Banned
Posts: 2,094
Karma: 2682
Join Date: Aug 2009
Device: N/A
(And I'm sure most constructed languages do; one I'm familiar with--the ironically-named PlusThink, a Newspeak "derivative"--does.) Also, right, TeX would need new conventions... let me ask again, is anyone actually working on this?
09-08-2009, 09:00 PM | #561
Guru
Posts: 714
Karma: 2003751
Join Date: Oct 2008
Location: Ottawa, ON
Device: Kobo Glo HD
Fully justified text without hyphenation looks bad to me. I would rather stick with "jagged right edge" until hyphenation is solved.
09-08-2009, 10:18 PM | #562
Wizard
Posts: 1,213
Karma: 12890
Join Date: Feb 2009
Location: Amherst, Massachusetts, USA
Device: Sony PRS-505
There was a research talk about TeX as an eBook reader at the TUG conference this past summer, so apparently, yes. I don't have any first-hand knowledge of their work, though.
09-09-2009, 06:29 AM | #563
Liseuse Lover
Posts: 869
Karma: 1035404
Join Date: Jul 2008
Location: Netherlands
Device: PRS-505
Having played with LyX over the weekend all the way until today, I must revise my opinion of PDF as an ebook format. Here is my revised view:
Letter- or A4-sized PDFs, especially complex ones, tend to look like crap on the (smaller) reader screens. Specially-formatted PDFs look way better than anything else I have ever seen on the Sony, and are a faithful representation of the content as it was (intended to be) laid out. This includes the pretty full justification, hyphenation (both fantastic features of LyX/LaTeX, and I don't have any reason to complain about how it hyphenates, even in SF books with lots of made-up words), fonts, TOC, graphics, and whatnot.

The drawback is you lose some layout flexibility; press the zoom button and it all goes awry. Of course, formatting the PDF with a decent font and font size takes away the need to zoom to a great extent (though when my eyes are tired I like to do this). I don't see publishers putting out several PDFs of each book, at different font sizes formatted for each mainstream screen size, any time soon if ever, though.

So all in all, I am pretty happy with this rather combative thread; I have found LyX, which takes some getting used to, but once you have a decent profile/layout set up you can process books quickly and with minimal intervention (I am big on automation) and make them look great. I'd post some results here, but most of my conversions go against the MR rules (copyright), and they would probably make the professional typographer's eyes bleed. Muahahahaaa!

Last edited by acidzebra; 09-09-2009 at 06:31 AM.
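(A sketch of what "specially-formatted" can mean in practice: the page and margin values below are rough guesses for a ~6" screen such as the PRS-505, not measured numbers:)

```latex
% Reader-sized page: large type on a small page removes most need to zoom.
\documentclass[14pt]{extarticle} % extsizes class; allows sizes beyond 12pt
\usepackage[paperwidth=90mm,paperheight=120mm,margin=2mm]{geometry}
\usepackage{microtype} % protrusion/expansion; helps on narrow measures
\begin{document}
Body text at a size readable without zooming\ldots
\end{document}
```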
09-09-2009, 10:15 AM | #564
Wizard
Posts: 1,790
Karma: 507333
Join Date: May 2009
Device: none
Quote:
- Ahi
09-09-2009, 10:43 AM | #565
Wizard
Posts: 1,790
Karma: 507333
Join Date: May 2009
Device: none
As long as I have to order from Europe the same damn Fisher-Price toys being sold in the local Walmart, because I want the damn plush dog to count in Hungarian rather than English... instead of being able to (as you rightly suggest in the case of eBook reader devices) simply update/change the firmware... this seems unlikely to me. And, yes, the key is multilingualism. Multilingualism that is, by the way, the norm in humans, as there are more multilingual people in the world than monolingual ones.

It seems to me that your approach, separately from whatever other merits or failings it may have, firmly places (potential) readers into the hands/mercy of manufacturers that have little tangible motivation to even update firmware, never mind create endless variations thereof for a myriad different scripts and languages. And while doubtless your approach would not leave readers of Hanzi, whether vertical or horizontal, disenfranchised, what is the likelihood of the particulars of the Yi script being supported? Or of the Chuvash language? (Consider these rhetorical questions relevant more for their type than for their specific content... I do not know what complications there would be with either the Yi script or with Chuvash... but the world is full of languages that are probably too minor for a large multinational corporation to ever care to do much work to support.)

Not to mention stuff that starts heading toward the fanciful... how much will Sony work to professionally support all the peculiarities of ancient/Koine Greek, Hebrew, Latin, Arabic, and Syriac for biblical/koranic/classical scholarship? How much will they work to support either Etruscan or Runic Hungarian (or Greek or Latin, for that matter) in boustrophedon? Will I be able to implement a fanciful way of writing modern English with Old English runes?

Will there be full and professional support for all the different scripts that Albanian has been written with in recent memory (each of which doubtless has books originally written/prepared/published using it)? How high a priority will Inuktitut support be for them? It is a co-official language of one of Canada's territories... along with its related language/dialect Inuinnaqtun, which may have quirks of its own. And then there's the Klingon, Tengwar, et cetera tomfoolery. And will there only be firmware for writing Quenya with Tengwar? Or both Quenya and Sindarin? Or even English? Will Dutch written with Tengwar never be supported? And the Voynich manuscript?

Nothing that I mentioned is a problem with real paper or with PDF, assuming the author/typesetter knows what they want. However, it could very easily become all but impossible to do (or at least, to do well) using a system where most of the work is left to the display engine... which knows only what the proprietor put money into. Unless you see firmware going open source in a big way... but part of me thinks that even then, writing firmware is a good deal greater a barrier to entry than simply learning LaTeX and getting the right fonts. The latter a determined language revivalist could far more readily do than the former, unless they already have a software development background.

- Ahi

Last edited by ahi; 09-09-2009 at 03:11 PM.
09-09-2009, 10:50 AM | #566
Wizard
Posts: 1,790
Karma: 507333
Join Date: May 2009
Device: none
Regarding putting out multiple PDFs for different font sizes and different screen sizes... I maintain that it sounds more onerous than it really is. Not to mention that as long as devices support proper resizing of PDFs, there can be useful sharing of sorts... e.g., the 10pt/6" screen version could double as the 8" screen large-print version.

Anyways though... good luck with your future PDF endeavours!

Edit: You could take screenshots and post those!

- Ahi

Last edited by ahi; 09-09-2009 at 10:54 AM.
09-09-2009, 11:16 AM | #567
Wizard
Posts: 1,213
Karma: 12890
Join Date: Feb 2009
Location: Amherst, Massachusetts, USA
Device: Sony PRS-505
acidzebra -- I'm glad you're enjoying LyX. Do consider making the move to standard LaTeX editing, however. LyX is more of a stepping stone, in my mind, anyway.
I wasn't really suggesting an approach for how to handle other languages, but mainly just trying to limit my positive proposal for a better renderer to languages I actually know something about. I only know English and a smattering of French and German. I'm simply not qualified to make recommendations about what would be an ideal renderer for other languages. I don't know of a reason to think this is a matter of the format itself, however. Again, my arguments weren't really about whether the markup was LaTeX or (Math)(X)(HT)ML, but about pushing for a renderer that does a better job with what it's given. I don't see why the file format itself couldn't be suitably multinational--HTML and TeX are both so, as far as I know. LaTeX's, and even more so XeLaTeX's, support for other languages is extensive, and I certainly wouldn't be opposed to that being the standard.

Perhaps your point is that it would be much more difficult to implement a renderer that could reflow well on the fly for some other languages, and for those, having fixed formats to fall back on would therefore be all the more important. To repeat, I'm all for including the possibility of fixed-format fallbacks. You yourself (I think it was you... too lazy to check) pointed out already early in the thread that even with current renderers for ePub you can get the exact look you want with a series of PNGs, and no one is suggesting moving to a format that wouldn't allow the insertion of arbitrary (but obviously nonreflowable) images. But surely it's not some horrendous multinational tragedy if, for those languages in which a renderer that does decent typography and allows for arbitrary reflow at the same time IS possible, we actually use such a renderer.
I'm surprised at your skepticism about the firmware though: one possibility that is still live in my mind is using (pdf)LaTeX itself, or a tweaked derivative, as the renderer -- if it does the good job with these other languages that you say, why couldn't it do as good a job on our devices? (It would merely be a matter of preloading certain packages as default for different markets.) And even if there are legal barriers to this, to quote myself from earlier in the thread, the fact that the wheel has been invented once gives us all the more reason to think that reinventing it is not an impossibility. |
09-09-2009, 11:38 AM | #568
Wizard
Posts: 1,790
Karma: 507333
Join Date: May 2009
Device: none
The key, for me, is that LaTeX is great primarily because the people doing the typesetting with LaTeX know more than LaTeX does by itself. The moment you move all that work to render time, LaTeX (or whatever you use) suddenly needs to know a hell of a lot more... because there's no human to fix things before it gets into the reader's hands. It's this transfer of miscellaneous knowledge that isn't encoded into software automation (because it is vastly better handled by a human being) that I see as a practical stumbling block, even if it is not a downright computational/mathematical one.
And why? Because multimillion-dollar publishing houses should be spared the few-hour travail of generating 2-6 PDFs from a common LaTeX source (which they could [or might already] also be using for their printing)? Or because we don't want two to six 200 KB to 400 KB PDF files bundled along with the 100 KB to 200 KB HTML in an eBook, despite the cost of memory storage distinctly heading toward dirt-cheap, and people seemingly having no inclination to store more than 200 books at a time on their eBook reading device? This doesn't make sense to me.

Let my skepticism not stop work/research on automating more and more typographical/layout tasks... but a lot of it seems to me like a fool's errand that only exists because of a tacit disinterest in doing the work the right way and at the right stage.

- Ahi
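(One way those 2-6 PDFs could come out of a common source: a tiny wrapper file per target, each defining its parameters and inputting the shared body. The file and macro names here are hypothetical:)

```latex
% book-6in-10pt.tex (hypothetical wrapper): one such file per target.
\def\targetscreen{six}    % consumed by conditionals in the shared source
\def\targetfontsize{10pt}
\input{book-body}         % the common LaTeX source for the whole book
```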
09-09-2009, 02:07 PM | #569
Wizard
Posts: 1,213
Karma: 12890
Join Date: Feb 2009
Location: Amherst, Massachusetts, USA
Device: Sony PRS-505
Well, let me say too that I certainly hope readers will continue to support PDF for a long time to come... if for no other reason than that there are so many around now that even if a better alternative comes along, I'll want my current stash of PDFs not to become useless. If anything I wrote suggested otherwise, I'll happily take it back. (And with my Hebrew letter problem, as you know, of course, I do distribute things in PDF and there things are fine -- I just wanted to do a .mobi version too since Kindle has such a share of the market, and at least the first two generations don't support PDF.)
I think we may have different estimates of how much work it would take to change existing technologies to deliver high-quality typesetting that could be automatically reflowed (and you're of course right that LaTeX was designed with a different purpose in mind and would have to be changed, updated, or augmented in various ways) vs. how much work it would save and what the other benefits would be. But there's little point in trying to quantify such things; too much is unknown until it is seriously attempted.

I'll admit that part of the desire to create such technology is selfish. For one, I could use it for my own writing. I like what I write, even pre-publication, to look good. This is why I write in LaTeX, and distribute to colleagues for comments in PDF, whereas most others in my field write and distribute in Word, and wait until publication to get nice-looking documents. Most of the reading I do is pre-publication stuff: student papers, work by colleagues who want feedback, or submissions to a press that I'm refereeing. It would be nice if I could take the files they sent me, do at most one universal conversion without deciding in advance on a fixed layout and font size, put them on my reader, and get decent-looking results. I'll certainly settle in the meantime for TeX advocacy, hoping to be sent the source to create my own appropriately sized PDFs.

Last edited by frabjous; 09-09-2009 at 02:11 PM.
09-09-2009, 02:42 PM | #570
Wizard
Posts: 1,790
Karma: 507333
Join Date: May 2009
Device: none
Agreed, with most everything you wrote in your last message.
1) A tool that would create a TeX source bundle--a .zip, basically, containing all dependencies that aren't part of the TeX base and are actually referenced/used during the compilation. (I.e., a compilation under a valid TeX variant [unless there are necessary ties to XeLaTeX, of course] should never result in failure due to missing dependencies.)

2) The creation of a TeX distiller, if you will, that can take one of these TeX source bundles and, with some minimal run-time control (over display size and/or font size), generate a PDF out of it with little more than drag and drop by the user.

The combination of the two would basically mean that the typesetter could pre-customize the TeX source for specific-sized outputs via the ifthen package (your idea, I believe)... meaning that PDFs of whatever size could be readily distilled, but anticipated size/font-size combinations would look even better than merely auto-generated ones.

Even this, I do not think is ideal for run-time use... but it would make for an eBook distribution system (with a minor intermediary step that could even be integrated into the download workflow of the particular file type, once default parameters have been set) superior to basically anything else. Albeit without addressing the needs of people who demand different font sizes for different lighting conditions, instead of resigning themselves to having to read with adequate light.

Edit: Although, it might not be impossible to get LaTeX to output reasonably accurate HTML... not that I've seen any tool that does so tolerably. Usually the output is really quite ugly... though that might have more to do with the given approach's specific aims not simply being the generation of pretty (even if simplified/less than 100% accurate) HTML output from the LaTeX source.

- Ahi

Last edited by ahi; 09-09-2009 at 02:44 PM.
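(The ifthen-based pre-customization mentioned above might look like this in the preamble of the shared source; the dimensions and the \targetscreen macro are illustrative assumptions:)

```latex
% Branch on a target-screen macro, set either by a wrapper file or on
% the command line, e.g.:
%   pdflatex "\def\targetscreen{six}\input{book-body}"
\usepackage{ifthen}
\usepackage{geometry}
\providecommand{\targetscreen}{six} % default if nothing defined it
\ifthenelse{\equal{\targetscreen}{six}}%
  {\geometry{paperwidth=90mm,paperheight=120mm,margin=3mm}}%
  {\geometry{paperwidth=122mm,paperheight=163mm,margin=5mm}}
```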