Ugly formatting


Tanzaku
12-11-2007, 09:57 AM
Admittedly, I am new to the entire eBook thing, but I love the concept! I do not, however, love the aesthetics. But, I am a publisher, and perhaps my eye is just trained too well for the current state of the art.

What do I mean?

There are certain typesetting and layout aesthetic conventions in publishing that are routinely missing from the eBooks I have seen. For example . . .

Widows (a single line or word at the top of a page)
Straight quotes (rather than curled quotes)
Straight apostrophes
Simple words at the end of line (e.g., "I", "a", etc.)
Words too widely spaced to fit justification
Slanted fonts rather than true italics


. . . just to name a few of the most obvious. Here is an example:

http://www.brooksjensenarts.com/typesetting.jpg

The formatting of eBooks seems very crude and jarring to the eye -- and doesn't need to be! There is nothing in the creation of an eBook that is any different from the process of creating a pbook. It's the skill of the typesetter and book designer that is missing from the eBook world. I suspect quantity is trumping quality in this regard -- which is perhaps a virtue in its own right. For cranking out thousands of titles, tools like BookDesigner are great for what they do, but why can't we do a better job of making something that is pleasing to the eye as well as content rich?

Here is a draft I'm working on that addresses these issues. I'm using InDesign CS3 to create the layout design using Gutenberg text and then outputting as a PDF. (The Warden by Anthony Trollope.) The idea is to develop the InDesign template so it is easy to format books as needed for different screen readers/dimensions, as well as the easy change of fonts document wide via paragraph style definitions. (For those familiar with InDesign.)

I've uploaded both the PDF and the InDesign file to my website if anyone wants to look or play with this idea.

http://www.brooksjensenarts.com/warden.pdf
http://www.brooksjensenarts.com/warden.zip
(The zip file contains the InDesign file.)


The InDesign file uses the Minion Pro font, so if you don't have that, simply replace the paragraph definition with the font of your choice. If you don't like the font sizes, it is easy to change them in the InDesign original. Also, the page size is set up for my Sony Reader -- which can easily be changed to another page size or margin distance. Have a ball and create the PDF of your visual dreams! Consider InDesign in this regard as BookDesigner on serious steroids!

Which, BTW, brings me to the issue of PDF and readability on eBook readers. The problem is not in the PDF format, but rather the page layout and design. True, many PDFs are created for the "letter size" world, but they don't need to be. A PDF designed for an eBook reader will look fabulous! In fact, it will look far better, more polished, and more professional than anything possible in the LRF format. I know that creating high-quality PDFs is probably not something lots of folks will want to do, and the InDesign CS3 program is not inexpensive, but if you are so inclined, I'd encourage you to do so. I will be! Personally, if I am going to spend 20-40 hours reading a long novel, I'd prefer to have the visual experience be as good as it can be. :)

And, if anyone is interested, as I create my own eBooks, does anyone else want them? I'd be happy to upload them. What if we had a special upload area for these -- PDF books as well as the InDesign (or Quark, or Publisher, or PageMaker, etc.) files that created them? Just an idea! :)

DaleDe
12-11-2007, 10:23 AM
I too hate the fact that most eBooks do not have these features; however, it is not because the ability does not exist. Most dedicated creation programs have ways to deal with many of the issues you raised. There is an upload eBook capability on this site. It would be great if you could post some eBooks to the site.

I get frustrated by the quality of many books you download from the various sources and end up converting some of them myself. Only it does take some time.

edit: By the way page 5 of your book needs fixing. The line at the bottom is a widow.

JSWolf
12-11-2007, 10:40 AM
The problem is not the book itself, but the software used to read the ebooks.

1. No hyphenation
2. No widow support
3. No font families, just the main normal font
4. No curly quotes/curly apostrophes

I don't see how simple words are a problem. I just took a pbook, opened it to a random page and found simple words at the ends of lines. If you don't have the simple words, you can end up with larger spacing between the words in the line.

With libprs500 and Book Designer, you can control the word spacing. I tend to use a smaller than normal word spacing. But part of the issue with that is the lack of hyphenation. If we had that, we'd have fewer lines with wide word spacing.

As for #4, most of the time I do bother to fix them. They look better when they're not the straight kind.

As for 1-3, these can be easily fixed in software. But will they be? What we should do is let Sony know of these issues and hopefully when we get the next firmware update, they'll be fixed.

DaleDe
12-11-2007, 10:47 AM
I do not believe that it is the software that is used to read the book that is at fault. It is the software used to generate the book that has to work properly. Except in the case of MobiPocket and some other PDA software, the pages are generally prebuilt for most devices. So it really depends on the device as to where the problem lies.

Dale

igorsk
12-11-2007, 10:53 AM
Typesetters complain about crappy typesetting, font designers complain about crappy font rendering, and the rest of us just keep reading :)
Yes, those issues are important, but not important enough for me to go back to paper books. For others it could be. I believe we'll see improvements in typesetting and font rendering once eBooks get popular enough.

JSWolf
12-11-2007, 10:55 AM
I do not believe that it is the software that is used to read the book that is at fault. It is the software used to generate the book that has to work properly. Except in the case of MobiPocket and some other PDA software, the pages are generally prebuilt for most devices. So it really depends on the device as to where the problem lies.

Dale
1. No hyphenation
2. No widow support
3. No font families, just the main normal font

These issues can all be fixed in the software used to read the ebooks. The ebooks don't need any changing for this. The reading software has to do it. You change the font size and you change the layout. There is no way the ebook needs to do it, as the ebook layout would then have to be fixed (no reflowing).

igorsk
12-11-2007, 10:56 AM
I do not believe that it is the software that is used to read the book that is at fault. It is the software used to generate the book that has to work properly. Except in the case of MobiPocket and some other PDA software, the pages are generally prebuilt for most devices. So it really depends on the device as to where the problem lies.

Erm, sorry? Very few eBook formats are fixed-page like PDF. Most are basically text streams with some formatting, because one of the major pros of eBooks is that they are reflowable and pages can be reformatted on the fly.

Tanzaku
12-11-2007, 10:59 AM
edit: By the way page 5 of your book needs fixing. The line at the bottom is a widow.

There are some legitimate debates as to style issues -- one of which is the question of widows at the bottom of pages. It's a pretty universal consensus that widows at the top are taboo, but I fall into the camp that will accept a widow at the bottom of a page. For people who prefer no widows at the bottom, a quick paragraph style change would easily resolve this. Personal preferences do have a place in this discussion!:)

kacir
12-11-2007, 11:06 AM
:thumbsup: Thank you Tanzaku for bringing up this subject.
The book layout looks great on the computer screen. I can't wait to try it out on the reader.

I have been ranting and complaining about the quality of the ebooks formatted for the Reader for a long time. Nobody seems to share my views.

What jars my eye most is the way full justification works in default settings - like a typical lrf book. That is why I convert most of the stuff I read to left justification. I was unable to make such nice, evenly spaced paragraphs. And being unable to make such nicely spaced paragraphs, I prefer a jagged right margin to the uneven word spacing in a typical Connect book.

InDesign is simply great. Unfortunately I do not have access to a legally licensed version of InDesign anymore, so I use a sans serif font sized 12 and left justification on my Reader.

Please consider using a sans-serif font. Just try it. I know that serif fonts are much more pleasant to the reader's eye when they are printed at a high enough resolution. With a very low resolution e-ink display (and 170DPI *IS* a very low resolution when you talk about typography) that default heavily hinted 12 point sans serif font really does look better.
( see Hinting (http://en.wikipedia.org/wiki/Hinting) at Wikipedia)

I also think that your book might look even better if you set some spacing between paragraphs (something like 110% of the normal space between two lines)

By the way, ;)
- on the very first line of the very first paragraph you have a "hanging" a
- the same is the last line of the first page
- last line of the third page is an "orphan"
- fifth page has an "orphan" as well
I personally do not mind, but you seem to make a point of not having hanging single letters at the end of the line, widows, orphans ... ;)

slayda
12-11-2007, 11:10 AM
As I see it (as an engineer - not a writer, grammarian, or publisher), the main reasons for "some" of these failings are;

Limited fonts available
Mostly fully justified rather than left justified
Limited viewing space (exacerbated by reflowing text at different sizes).


Advances in software & CPU capacity can address the first. The three taken together are, I believe, the main things that make ebooks have a "not perfect" appearance. {Please note that pbooks also have a "not perfect" appearance even though they do not necessarily suffer from these disadvantages.}

In addition, I tend to prefer the straight quotes.:2thumbsup

JSWolf
12-11-2007, 11:20 AM
Just add in the other fonts needed to make a true font family. That problem will then be solved. Full justification would not be much of an issue if we had hyphenation support. And it would also fix the spacing with larger text sizes as well.

HarryT
12-11-2007, 11:33 AM
3. No font families, just the main normal font


The Gen3 supports font families. If you're reading in (say) Times New Roman, and the book has italics, then it uses the proper TNR Italic font (if it's on the machine).

DaleDe
12-11-2007, 11:41 AM
Erm, sorry? Very few eBook formats are fixed-page like PDF. Most are basically text streams with some formatting, because one of the major pros of eBooks is that they are reflowable and pages can be reformatted on the fly.

Yes, the problem is more complicated than I mentioned. My eBookwise reader does not reformat on the fly per se. It seems to pre-format the 2 font sizes when the file is built. Sony can reformat but it doesn't actually do it on the fly either. It has a full processing step in batch mode the first time you select a font size (Connect preformats all 3 sizes). As I think about it there are two problems. One is characters and pretty printing and the other is pagination. They may require different solutions.

The full justification issue requires kerning to be done correctly and I don't believe any current reader supports this. You need to adjust the spaces between the characters in words to make everything look good when the lines are short (as they are in readers with larger fonts) and even then there will always be a line that doesn't look right.

Hyphenation helps but, unless the book has a hyphenation dictionary built in, this often results in funny breaks in the text that detract from the reading experience. In my eBookwise books I sometimes code soft hyphens in the source to aid in fixing this. Having soft hyphens in the source can alleviate the need for a hyphen dictionary in the reader.
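
To make the soft-hyphen idea concrete, here is a minimal Python sketch of pre-seeding a source text with soft hyphens (U+00AD, written as &shy; in HTML). The word list and the function name are made up for illustration; a real conversion would need a much larger, hand-checked list.

```python
import re

SOFT_HYPHEN = "\u00ad"  # invisible unless the renderer breaks the line here

# Hypothetical hand-made break list: keys are lowercase words,
# values show the allowed break points with "-".
BREAKS = {
    "typesetting": "type-set-ting",
    "justification": "jus-ti-fi-ca-tion",
}

def add_soft_hyphens(text, breaks=BREAKS):
    """Insert soft hyphens into every word found in the break list."""
    def replace_word(match):
        word = match.group(0)
        pattern = breaks.get(word.lower())
        if pattern is None:
            return word  # unknown word: leave it untouched
        out, i = [], 0
        for ch in pattern:
            if ch == "-":
                out.append(SOFT_HYPHEN)
            else:
                out.append(word[i])  # keep the original capitalisation
                i += 1
        return "".join(out)

    return re.sub(r"[A-Za-z]+", replace_word, text)
```

Words that are not in the list pass through unchanged, so the reader can only break where the person preparing the book allowed it.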

There are actually many different topics to talk about in this thread but that is enough for now.

Dale

Tanzaku
12-11-2007, 12:18 PM
By the way, ;)
- on the very first line of the very first paragraph you have a "hanging" a
- the same is the last line of the first page
- last line of the third page is an "orphan"
- fifth page has an "orphan" as well
I personally do not mind, but you seem to make a point of not having hanging single letters at the end of the line, widows, orphans ... ;)

Yes, right-o. We are still working on the template and the GREP replacements -- hence I refer to this sample as a "draft." There are other issues, too, that are yet to be resolved. My purpose in the post was to simply bring up the issue and to see what others had to say.

We'll try the non-serif font. Not my favorite choice, but you might be right for basic text.

See my comments above about end-of-page widows.

Question: As to paragraph spacing, would you rather see a first-line indent without a space between paragraphs (as it is now), or a flush-left and 1-line space between paragraphs? We could also turn off the base line rules and do a 110% space between paragraphs as you suggest. Preferences?

Brooks

JSWolf
12-11-2007, 12:20 PM
I downloaded this PDF and put it on my 505. What I found was that the size of the text is too small. Can this be made larger so it looks good on a 6" eink screen?

HarryT
12-11-2007, 12:40 PM
I downloaded this PDF and put it on my 505. What I found was that the size of the text is too small. Can this be made larger so it looks good on a 6" eink screen?

That's one of the problems with PDF, of course - the fact that it can't generally re-flow and be zoomed. By using it, one is losing one of the main benefits of an eBook reader - the ability to read the book in whatever font and size one wishes. For me, that's a more important benefit than the "problem" of such things as widows and orphans which, speaking personally, bother me not one jot.

tompe
12-11-2007, 12:49 PM
On the Gen3 it seems that it will fall back to non-justification if the word spacing becomes too bad. And that I think was a good design decision since one line that does not line up with the right margin is not as disturbing as a line with too much space between words.

The hyphenation algorithm used in LaTeX/TeX should be possible to implement on these kinds of devices and it does not need large databases.
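
For anyone curious why the TeX approach needs no big word database, here is a rough Python sketch of the pattern-matching idea behind it (Liang's algorithm). The handful of patterns below is only an illustrative sample chosen to hyphenate one word; a real implementation loads a full pattern file, and the function names are my own.

```python
import re

# Illustrative sample only; a real pattern file has thousands of entries.
SAMPLE_PATTERNS = ["hy3ph", "he2n", "hena4", "hen5at",
                   "1na", "n2at", "1tio", "2io", "o2n"]

def parse_pattern(pattern):
    """Split 'hy3ph' into its letters 'hyph' and a list of gap scores."""
    letters = re.sub(r"\d", "", pattern)
    scores = [0] * (len(letters) + 1)
    position = 0
    for ch in pattern:
        if ch.isdigit():
            scores[position] = int(ch)  # score for the gap before the next letter
        else:
            position += 1
    return letters, scores

def hyphenate(word, patterns=SAMPLE_PATTERNS, left_min=2, right_min=3):
    """Return the word split at every permitted break point."""
    table = dict(parse_pattern(p) for p in patterns)
    work = "." + word.lower() + "."   # dots mark the word boundaries
    gaps = [0] * (len(work) + 1)      # gaps[j] is the gap before work[j]
    for start in range(len(work)):
        for end in range(start + 1, len(work) + 1):
            scores = table.get(work[start:end])
            if scores:
                for k, value in enumerate(scores):
                    gaps[start + k] = max(gaps[start + k], value)
    pieces, last = [], 0
    # An odd score allows a break; even scores forbid one.
    for i in range(left_min, len(word) - right_min + 1):
        if gaps[i + 1] % 2 == 1:
            pieces.append(word[last:i])
            last = i
    pieces.append(word[last:])
    return pieces
```

With the sample patterns, hyphenate("hyphenation") gives the pieces "hy", "phen" and "ation". The only stored data is the pattern list, which is why the method suits small devices.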

kovidgoyal
12-11-2007, 01:07 PM
For spacing between words, the problem is that SONY's reader software attempts to do full justification even when the line is too short. It's just badly designed.
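
The fallback behaviour described in the last two posts can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: widths are counted in characters as if the font were monospaced, and the max_gap threshold and function name are invented here, not taken from any reader's firmware.

```python
def justify_line(words, width, max_gap=3):
    """Fully justify one line, or leave it ragged if the gaps would get too wide."""
    if len(words) < 2:
        return " ".join(words)
    gap_count = len(words) - 1
    spare = width - sum(len(w) for w in words)  # room left for spaces
    base, extra = divmod(spare, gap_count)
    widest = base + (1 if extra else 0)
    if spare < gap_count or widest > max_gap:
        # Too tight or too loose: fall back to ragged right.
        return " ".join(words)
    out = []
    for i, word in enumerate(words[:-1]):
        # The first `extra` gaps get one additional space.
        out.append(word + " " * (base + (1 if i < extra else 0)))
    out.append(words[-1])
    return "".join(out)
```

A line such as ["to", "be"] in a 12-character measure comes back as plain "to be" instead of being stretched across the full width.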

JSWolf
12-11-2007, 03:15 PM
One problem I have seen that is so EASY to fix is word—word where the first word would fit on the line above, but because of the em dash it won't fit. If I am reading a book that I've converted and find the em dashes make too much of a mess, I sometimes go back and convert the em dash to an en dash like this word – word. When reading I can easily read it like it would have been had it been the first example with an em dash.

dash -
em dash —
en dash –
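
That substitution is a one-line regular expression. A minimal Python sketch (the function name is made up):

```python
import re

EM_DASH = "\u2014"
EN_DASH = "\u2013"

def loosen_em_dashes(text):
    """Replace each em dash with a spaced en dash so lines can break around it."""
    # Any whitespace already around the em dash is absorbed first.
    return re.sub(r"\s*" + EM_DASH + r"\s*", " " + EN_DASH + " ", text)
```

Because the en dash is surrounded by ordinary spaces, the first word can stay on the line above.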

kacir
12-11-2007, 03:35 PM
Yes, right-o. We are still working on the template and the GREP replacements -- hence I refer to this sample as a "draft."
My mother language has lots of one-letter and two-letter elements (like a, or, ...) and it is considered very inappropriate to leave those "hanging" at the end of the line. Even children are taught that. The problem is avoided when you replace all combinations of: Space A_letter Space with Space A_letter Non_breaking_space
My language version of Microsoft Word does that automatically.
I am very surprised that InDesign does not have a tool for automatically dealing with this.
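
The replacement described above can be sketched in Python as follows. The word list is an assumption and would be extended for the language in question; the function name is made up.

```python
import re

NBSP = "\u00a0"  # non-breaking space

# Hypothetical list of short words that must not end a line.
SHORT_WORDS = ("a", "A", "I")

def bind_short_words(text, words=SHORT_WORDS):
    """Glue each listed short word to the word that follows it."""
    alternatives = "|".join(re.escape(w) for w in words)
    # Short word standing alone, followed by one plain space and more text.
    pattern = r"(?<!\S)(" + alternatives + r") (?=\S)"
    return re.sub(pattern, lambda m: m.group(1) + NBSP, text)
```

The lookbehind keeps the pattern from firing on the last letter of a longer word such as "saw".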

We'll try the non-serif font. Not my favorite choice, but you might be right for basic text.

Try it. I am looking forward to seeing results.
Citation from Wikipedia:
http://en.wikipedia.org/wiki/Serif_font
----- quote -----
While in print serifed fonts are considered more readable, sans-serif is considered more legible on computer screens.[citation needed] For this reason the majority of web pages employ sans-serif type. Hinting information, anti-aliased and sub-pixel rendering technologies have partially mitigated the legibility problem of serif fonts, but the basic constraint of coarse screen resolution—typically 100 pixels per inch or less—continues to limit their readability on screen.
---- end of quote ----

Question: As to paragraph spacing, would you rather see a first-line indent without a space between paragraphs (as it is now), or a flush-left and 1-line space between paragraphs? We could also turn off the base line rules and do a 110% space between paragraphs as you suggest. Preferences?
My *personal* preference is to have a little first line indent (three spaces) plus 110% or 120% space between paragraphs. Definitely not a full empty line between paragraphs. Screen real estate is too valuable for that.

Hadrien
12-11-2007, 03:58 PM
On the Gen3 it seems that it will fall back to non-justification if the word spacing becomes too bad. And that I think was a good design decision since one line that does not line up with the right margin is not as disturbing as a line with too much space between words.

The hyphenation algorithm used in LaTeX/TeX should be possible to implement on these kinds of devices and it does not need large databases.

The hyphenation feature of FBReader actually uses the LaTeX/TeX algorithm.

Tanzaku
12-11-2007, 04:51 PM
I am very surprised that InDesign does not have a tool for automatically dealing with this.

Yes, you are right, it does. In fact, that's the exact pattern for GREP or basic TEXT replacements. The draft document we posted has already had the single word "I[space]" replaced with "I[nonbreaking space]" to illustrate the point. We simply haven't yet run the rest of the bad-boy offenders through their paces yet.;)

As to the serif versus non-serif issue, again you are right. The common wisdom is to use non-serif fonts for screen displays and serif fonts for the text of printed displays. So, which category does the eBook fall into? Printed or screen? Or somewhere in between -- in which case, it is debatable which camp this technology falls into. We do, indeed, live in interesting times.

BTW, I see that Hadrien has responded with a post in this thread. I am most impressed with Feedbooks' approach to this problem and their results are very good for the current state of the art. I have no doubt that as time goes on their algorithms will get better and better. At least I hope so! I love their approach to customizing the PDF creation with personalized fonts, margins, and headers/footers. Very user friendly. I can see a time when they will have customizable leading, hyphenation, paragraph styles, etc. that will easily approach the kind of professional layout possible with InDesign -- but customizable on the fly. They are actually not that far from it now and clearly demonstrate a keen knowledge of the PDF architecture. So, if you are not familiar with their site, may I give them a plug?

JSWolf
12-11-2007, 05:00 PM
Tanzaku, are you going to ignore the problem that your PDF is unusable because the font size is too small?

Jadon
12-11-2007, 05:50 PM
A plus of having a set font size or two like the eBookwise is that it can produce texts without widows or orphans, and my conversion CSS file does that. I'm of the ragged-right school, since adding soft hyphens to allow smooth justification is too much work for my personal conversions. I do have Tidy make straight quotemarks and apostrophes curly. I've never before read of any "no one-letter words at the end of lines" rule, and three random paperbacks I just looked at don't seem to follow one, either. But the hard-space method could let me cater to it if desired, just as more extensive work to soft-hyphenate long words could let me get smooth justification even at the large font size I use. So in theory an eBookwise could follow every rule cited.
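
For readers who want to try the quote-curling step without Tidy, here is a minimal Python sketch. It is not how Tidy does it; the rules below are a simple heuristic of my own (a quote at the start of the text or after whitespace or an opening bracket is treated as opening, everything else as closing), and it will get some edge cases wrong.

```python
import re

LDQUO, RDQUO = "\u201c", "\u201d"  # curly double quotes
LSQUO, RSQUO = "\u2018", "\u2019"  # curly single quotes / apostrophe

def curl_quotes(text):
    """Turn straight quotes and apostrophes into curly ones (heuristic)."""
    # Apostrophes inside words first: don't -> don’t
    text = re.sub(r"(?<=\w)'(?=\w)", RSQUO, text)
    # Double quotes: opening at text start or after whitespace or ( or [
    text = re.sub(r'(^|(?<=[\s(\[]))"', LDQUO, text)
    text = text.replace('"', RDQUO)  # whatever is left must be closing
    # Single quotes: same idea, also opening right after an opening double quote
    text = re.sub(r"(^|(?<=[\s(\[\u201c]))'", LSQUO, text)
    text = text.replace("'", RSQUO)
    return text
```

A leading apostrophe, as in 'Tis, is treated as an opening quote by this heuristic, which is exactly the sort of thing that still needs a manual pass.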

Tanzaku
12-11-2007, 05:57 PM
Tanzaku, are you going to ignore the problem that your PDF is unusable because the font size is too small?

No, we're still fussing with the template and will have some more samples with the various suggestions from this thread. What font size are you comfortable with? The draft PDF used a 9-point font for the basic text.

BTW, another option we are looking at is using a font with a larger x-height. (The x-height is essentially the size of the lower half of the letter -- for example the full height of the letter "e" but only the part of the letter "t" below the cross.) Fonts with a larger x-height are easier to read at the same font size. Examples: New Century Schoolbook is an example of a font with a large x-height. Palatino is an example of a font with a smaller x-height. Hence, New Century Schoolbook at 9-points is actually easier to read than Palatino at 10-points. One of the worst reading fonts is Times and Times New Roman -- unfortunately a very commonly used font for simple PDFs because it is one of the default fonts in the PDF reader. Too bad!

Anyway, thanks for prodding me about your question. :) Hope this helps!

Tanzaku
12-11-2007, 05:58 PM
So in theory an eBookwise could follow every rule cited.

Very cool! :cool:

DaleDe
12-11-2007, 06:23 PM
A plus of having a set font size or two like the eBookwise is that it can produce texts without widows or orphans, and my conversion CSS file does that. I'm of the ragged-right school, since adding soft hyphens to allow smooth justification is too much work for my personal conversions. I do have Tidy make straight quotemarks and apostrophes curly. I've never before read of any "no one-letter words at the end of lines" rule, and three random paperbacks I just looked at don't seem to follow one, either. But the hard-space method could let me cater to it if desired, just as more extensive work to soft-hyphenate long words could let me get smooth justification even at the large font size I use. So in theory an eBookwise could follow every rule cited.

I pretty much do what you do. I only add soft hyphens when I notice that the large font doesn't flow as well as I might like, particularly in a table (although sometimes I just force the small font for a table). Have you gotten everything automated?

Dale

DaleDe
12-11-2007, 06:24 PM
No, we're still fussing with the template and will have some more samples with the various suggestions from this thread. What font size are you comfortable with? The draft PDF used a 9-point font for the basic text.

BTW, another option we are looking at is using a font with a larger x-height. (The x-height is essentially the size of the lower half of the letter -- for example the full height of the letter "e" but only the part of the letter "t" below the cross.) Fonts with a larger x-height are easier to read at the same font size. Examples: New Century Schoolbook is an example of a font with a large x-height. Palatino is an example of a font with a smaller x-height. Hence, New Century Schoolbook at 9-points is actually easier to read than Palatino at 10-points. One of the worst reading fonts is Times and Times New Roman -- unfortunately a very commonly used font for simple PDFs because it is one of the default fonts in the PDF reader. Too bad!

Anyway, thanks for prodding me about your question. :) Hope this helps!

It is not the size of the font that needs changing. It is the page size of the PDF itself. PDF files on e-Book devices are shrunk to align the page with the 6" image thus his comment about the font size.

JSWolf
12-11-2007, 06:39 PM
In Book Designer, I like to use 11 point as the main font size. The size of the page should be 90mm x 120mm, which is 3.54 x 4.72 inches. 9 point is just way too small for the base font. The only time I use 9 point is sometimes for the ToC.
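
The page-size arithmetic, and the shrinking effect mentioned a couple of posts up, can be checked with a few lines of Python. The 120 mm page width used in the usage note is a made-up example, since the actual page size of the draft PDF is not stated in this thread; the function names are my own.

```python
MM_PER_INCH = 25.4
POINTS_PER_INCH = 72

def mm_to_inches(mm):
    return mm / MM_PER_INCH

def mm_to_points(mm):
    """Page dimensions in PDF and InDesign are commonly given in points."""
    return mm / MM_PER_INCH * POINTS_PER_INCH

def effective_font_size(font_pt, page_width_mm, screen_width_mm=90):
    """Apparent type size after a page is shrunk to fit the screen width.

    Assumes the reader scales the whole page uniformly and never enlarges it.
    """
    scale = min(1.0, screen_width_mm / page_width_mm)
    return font_pt * scale
```

For example, effective_font_size(9, 120) shows what happens to 9-point text when a hypothetical 120 mm wide page is squeezed onto a 90 mm wide screen, while a page that is already 90 mm wide keeps its type at the size it was set.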

Jadon
12-11-2007, 07:00 PM
Have you gotten everything automated?
Not remotely. I convert whatever format I start with to HTML, then run that through BookFixer, then edit that in The Semware Editor. There I add various things, like a title block and changing headers for an anthology, say. And I make lots of substitutions and checks, like searching for and fixing &ldquo;</p> from where Tidy gets confused. A 300-page novel might take a half-hour of editing, and that just gets it to my good-enough standards. I won't find OCR errors, or plenty of other things, but I'll get the formatting up to a level where I might be able to read ten pages between annoying errors.
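
One of those checks is easy to script. The Python sketch below assumes that an opening-quote entity sitting directly before a closing paragraph tag was meant to be a closing quote, which is my reading of the Tidy confusion described above; the function name is made up.

```python
import re

def fix_stray_open_quotes(html):
    """Swap &ldquo; for &rdquo; when it sits right before </p>."""
    # Assumption: a paragraph never legitimately ends on an opening quote.
    return re.sub(r"&ldquo;(?=\s*</p>)", "&rdquo;", html)
```

Opening quotes elsewhere in the paragraph are left alone, so this can run over a whole file before the manual editing pass.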

DaleDe
12-11-2007, 09:54 PM
Not remotely. I convert whatever format I start with to HTML, then run that through BookFixer, then edit that in The Semware Editor. There I add various things, like a title block and changing headers for an anthology, say. And I make lots of substitutions and checks, like searching for and fixing &ldquo;</p> from where Tidy gets confused. A 300-page novel might take a half-hour of editing, and that just gets it to my good-enough standards. I won't find OCR errors, or plenty of other things, but I'll get the formatting up to a level where I might be able to read ten pages between annoying errors.

Thanks, that is about where I am also. Still a little too much manual fixup but doable.

RWood
12-11-2007, 10:23 PM
I have found New Century Schoolbook to be the easiest font to read. The large x-height as well as the open glyphs allow quick identification of the word shapes. My reading speed increases when NCS is used. By contrast, Baskerville or even New Baskerville will decrease my reading speed and produce a headache after about 15 minutes of reading. TNR is a condensed font by all standards except its own. It was designed to fit as many characters as possible on a printed page at a given size.

The current Sony Reader practice of making oblique fonts on the fly (along with bold and bold-oblique) is a trade-off between processor power, storage requirements (for the extra fonts), and the IO requirements of the hardware system design. I have made several books in BookDesigner (for my use alone) implementing full font families and the results (while a bit slower on page turns and a bit larger in size) are very rewarding.

JSWolf
12-11-2007, 10:26 PM
Rwood, how did you manage to implement full font families using Book Designer? I'd love to know how. Also do you know if it is possible to just implement a bold font and an italic font? I don't mind them being times new roman and the standard font being the dutch one.

RWood
12-11-2007, 11:38 PM
Rwood, how did you manage to implement full font families using Book Designer? I'd love to know how. Also do you know if it is possible to just implement a bold font and an italic font? I don't mind them being times new roman and the standard font being the dutch one.
On the Make eBook|Sony Reader|Styles Tab I added the external font family and then assigned them to specific paragraph types. I did set the style for each to normal to keep the Reader from obliqueing (is that the way to spell it?) the glyph. It works great except for intra paragraph italic where it still obliques the base font.

DaleDe
12-12-2007, 01:02 AM
I have found New Century Schoolbook to be the easiest font to read. The large x-height as well as the open glyphs allow quick identification of the word shapes. My reading speed increases when NCS is used. Likewise, Baskerville or even New Baskerville will decrease my reading speed and produce a headache after about 15 minutes of reading. TNR is a condensed font by all standards except its own. It was designed to fit the most characters on a printed page at a given size as possible.

The current Sony Reader practice of making oblique fonts on the fly (along with bold and bold-oblique) is a trade-off between processor power, storage requirements (for the extra fonts), and the IO requirements of the hardware system design. I have made several books in BookDesigner (for my use alone) implementing full font families and the results (while a bit slower on page turns and a bit larger in size) are very rewarding.

New Century Schoolbook has always been my favorite font. I always try and use it when I work in Framemaker.

Dale

GregS
12-12-2007, 04:28 AM
Nice thread about an important problem.

There are a number of problems associated with ebooks that need attention. First, for serious rather than recreational reading page numbers are now meaningless, at best as internal references to p-book editions. Standard formats need to include rational numbering schemes as mandatory requirements.

If we end up numbering chapters and paragraphs, the option of showing the numbers has to be dealt with elegantly - at the moment this is a black-hole. Page numbers are effectively gone, and nothing is there to replace them.

Second, and I am strongly biased in this towards non-fiction works, reflowing formats are fine for most novels, however, things get a lot more complex when serious non-fiction texts become involved and frankly we are nowhere close to solving this one.

I believe PDF is the way to go, but not in the way it is implemented. Large print books are needed; sometimes more words on the page are critical (small type); reading devices will always vary in size; and some readers will want references in the margins (Shakespeare's plays, for instance, but many other works being used for study would benefit, precisely because page numbers are a dead issue). The idea of catering for all this with different PDF versions of the same book becomes mind-bogglingly complex.

The contradiction is making a fixed typographic system (PDF) into a semi-dynamic one. I use the term semi-dynamic because there is little need to do it on the fly; after all, the devices exist before the ebook, so it should be easy enough to generate material for the size of different devices.

That is the problem as I see it. Does TeX fill the bill? Or is it necessary to approach it from a macro style sheet point of view? I tend to favour the latter solution.

kkingdon
12-12-2007, 07:08 AM
There are a number of problems associated with ebooks that need attention. First, for serious rather than recreational reading page numbers are now meaningless, at best as internal references to p-book editions. Standard formats need to include rational numbering schemes as mandatory requirements.

If we end up numbering chapters and paragraphs, the option of showing the numbers has to be dealt with elegantly - at the moment this is a black-hole. Page numbers are effectively gone, and nothing is there to replace them.

I've been thinking that reflowable ebook editions of serious works should include an extra index of "page numbers" taken from the page numbers of some reference fixed-layout edition. The reference page numbers need not always be displayed, but should easily be called up on demand for momentary display or for inclusion in an annotation. Chapter/paragraph references should also be available in a similar way as both a navigation index and on-demand displayable property of the current reading location, though such references need not be linked to a particular fixed-layout edition of the work.
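
An index like this could be quite small. Here is a minimal sketch in Python, purely as an illustration: the offsets and page numbers in PAGE_STARTS are made up, and the name reference_page is hypothetical, not part of any ebook format or reader. It assumes the reading location is available as a character offset into the text.

```python
from bisect import bisect_right

# Hypothetical index: (character offset where a page starts,
# page number in the reference fixed-layout edition).
# The values are invented for illustration only.
PAGE_STARTS = [(0, 1), (1800, 2), (3650, 3)]


def reference_page(offset, page_starts=PAGE_STARTS):
    """Return the reference-edition page containing a character offset."""
    offsets = [start for start, _page in page_starts]
    # Find the last page whose start is at or before the offset.
    index = bisect_right(offsets, offset) - 1
    return page_starts[index][1]


# A reader could call this on demand for the current reading location.
current_page = reference_page(2000)
```

The reflowed text itself stays untouched; the page number is looked up only when the reader asks for it or attaches it to an annotation.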

Lexicon
12-12-2007, 11:18 AM
I'll admit to being a little surprised to see so much support here for hard coding the layout of eBooks. Given the myriad eReading devices that currently exist, a number which is only going to increase, surely it is pretty much impossible to code the layout of a single file such that it appears equally good on all eReaders? Are people really prepared to spend time reformatting and hand-tuning their eBooks again and again as they upgrade their devices?

I thought HTML had taught us all about the disadvantages of integrating presentation instructions with content. If people are going to devote time to marking up books then I think they'd be better served marking up meaning rather than presentation. Defining your eBook in terms of logical book elements like title, author, chapters and footnotes makes more sense than adding font tags and layout tables.

I believe that presentation should be handled by software on each eReader - the reflowing of text, widow and orphan detection, hyphenation, etc. could therefore be tailored to the strengths and weaknesses of each device. CSS should be supported so the designer can exert some control over the display and provide an attractive default presentation but the reader should be able to override this with their own CSS files if they feel it necessary.

I've been looking at the epub spec recently and I found it supports the DAISY standard - which was originally designed as a way of marking up a book (in terms of t-o-c, chapters, appendices and so on) so that it can be rendered into braille or interpreted by text-to-speech engines. It's XML so it can be formatted into readable text (for display on an eReader) using CSS files included in the epub container.

I'd urge anybody interested in converting pBooks to eBooks to take a look at the DAISY (http://www.daisy.org/z3986/structure/structguide.htm#contents) standard. Consider the advantages of only having to mark up a book once such that it is readable on all epub devices and accessible to the visually impaired.

JSWolf
12-12-2007, 01:10 PM
Nice thread about an important problem.

There are a number of problems associated with ebooks that need attention. First, for serious rather than recreational reading page numbers are now meaningless, at best as internal references to p-book editions. Standard formats need to include rational numbering schemes as mandatory requirements.

If we end up numbering chapters and paragraphs, the option of showing the numbers has to be dealt with elegantly - at the moment this is a black-hole. Page numbers are effectively gone, and nothing is there to replace them.

BBeB has page numbers. And when you look at different versions of pbooks and editions, you can end up with different page numbering. If you are So basically, when you refer to page numbers on the ebook, you refer to the page number and text size and then you have it.

GregS
12-12-2007, 05:34 PM
BBeB has page numbers. And when you look at different versions of pbooks and editions, you can end up with different page numbering. If you are So basically, when you refer to page numbers on the ebook, you refer to the page number and text size and then you have it.

The id attribute is more than sufficient when linked to XMLnamespace for unambiguous references. But it takes two factors to work.

First, the XMLnamespace must be truly unique for all time. Unfortunately the common method of URL + location is not that, so a simple system of generating truly unique identities is needed (it is not hard, and a perfectly good system already exists).

The second factor is to apply id attributes logically to every publication (which can be achieved automatically with a simple script).

The problem with the text size is that devices will change size over time. My opinion is that the page number is basically dead as a reference point for electronic literature; milestone page numbers could be used, but XML already supplies the means to deal with the problem elegantly.

DaleDe
12-12-2007, 05:39 PM
BBeB has page numbers. And when you look at different versions of pbooks and editions, you can end up with different page numbering. If you are So basically, when you refer to page numbers on the ebook, you refer to the page number and text size and then you have it.

Can you actually tell the text size? Other than s,m,l which varies from book to book?

Dale

DaleDe
12-12-2007, 05:43 PM
The id attribute is more than sufficient when linked to XMLnamespace for unambiguous references. But it takes two factors to work.

First, the XMLnamespace must be truly unique for all time. Unfortunately the common method of URL + location is not that, so a simple system of generating truly unique identities is needed (it is not hard, and a perfectly good system already exists).

The second factor is to apply id attributes logically to every publication (which can be achieved automatically with a simple script).

The problem with the text size is that devices will change size over time. My opinion is that the page number is basically dead as a reference point for electronic literature; milestone page numbers could be used, but XML already supplies the means to deal with the problem elegantly.

I do not see that you have suggested a solution, only a method. So what do you count? I vote for paragraphs in standard books and stanzas in poems or lines in poems if appropriate.

GregS
12-12-2007, 06:15 PM
I'll admit to being a little surprised to see so much support here for hard coding the layout of eBooks. Given the myriad eReading devices that currently exist, a number which is only going to increase, surely it is pretty much impossible to code the layout of a single file such that it appears equally good on all eReaders? Are people really prepared to spend time reformatting and hand-tuning their eBooks again and again as they upgrade their devices?

Currently yes. It is just impractical. But a PDF (fixed page) solution does ensure it will look good on every device if a means is made to generate it for each device as needed.

The problem simply does not really exist as a major concern for popular novels. But other works can and often do have special typographical elements that need to be treated with fidelity regardless of the device being used.

The problem can be solved without hands-on fine tuning for each device. In fact the problem can be reduced to just several broad contexts all based on relative size (mini, small and normal). 99% of the time nothing special needs to be done, but exceptions have to be catered for, and how each is treated in each context has to be anticipated.

I thought HTML had taught us all about the disadvantages of integrating presentation instructions with content. If people are going to devote time to marking up books then I think they'd be better served marking up meaning rather than presentation. Defining your eBook in terms of logical book elements like title, author, chapters and footnotes makes more sense than adding font tags and layout tables.

This is assuming that we are talking about presentation being the same as file format. I agree content and presentation have to be separate; PDF files as PDF and nothing more are a disaster for flexible use of content. XML is ideal. XML within PDF is in my opinion half-baked, disguised and problematic (relying on too many hidden factors - and you still get stuck with a presentation that is basically unchangeable).

I side with you very strongly in this, but for me the problem remains.


I believe that presentation should be handled by software on each eReader - the reflowing of text, widow and orphan detection, hyphenation, etc. could therefore be tailored to the strengths and weaknesses of each device. CSS should be supported so the designer can exert some control over the display and provide an attractive default presentation but the reader should be able to override this with their own CSS files if they feel it necessary.

I have been waiting and waiting for CSS3, which is not a problem per se. As a stylesheet language it ideally solves all the presentation problems which presently haunt me.

The problem, as has already been seen in HTML and CSS1 & 2, is that different implementations interpret the same code differently, so unless there is a fundamental agreement on using the same code base everywhere, it is condemned to being an unreliable ideal. PDF, for all that is wrong with it, presents exactly the same wherever it is displayed - that is its strength.

I am unfamiliar with TeX, but that may be another route already established.

The other factor is XML integrity for scholarly works, something which goes well beyond epub (a standard I strongly support). TEI is developed, does work (though it is a cow to employ) but it works well for this highly demanding area. Creating CSS in any form to do justice to the many features possible in this kind of markup is its own nightmare. CSS is great for relatively simple markup, in my opinion it crumples before the possibilities of something like TEI.

I've been looking at the epub spec recently and I found it supports the DAISY standard - which was originally designed as a way of marking up a book (in terms of t-o-c, chapters, appendices and so on) so that it can be rendered into braille or interpreted by text-to-speech engines. It's XML so it can be formatted into readable text (for display on an eReader) using CSS files included in the epub container.

What you suggest as a solution, let the devices sort it out, is for me the problem - though for novels and light reading in general this is fine. This specific combination that you recommend is an excellent one and I would be likewise encouraging publishers to follow it based on what you have said.

DRM has so narrowed the vision of some publishers they forget the inherent versatility of ebooks, braille and text-to-speech, "big print" compatibility (and printing and referencing) should be part of every publication.

I'd urge anybody interested in converting pBooks to eBooks to take a look at the DAISY (http://www.daisy.org/z3986/structure/structguide.htm#contents) standard. Consider the advantages of only having to mark up a book once such that it is readable on all epub devices and accessible to the visually impaired.

Thanks for the reference, I will be looking at this carefully, compatibility with epub and useful cross use markup is definitely the way to go for most texts.

HarryT
12-13-2007, 03:24 AM
BBeB has page numbers. And when you look at different versions of pbooks and editions, you can end up with different page numbering. If you are So basically, when you refer to page numbers on the ebook, you refer to the page number and text size and then you have it.

But it's NOT so easy on devices such as the CyBook where, not only do you have 12 available text sizes, but the user can install any font (family) they wish on the machine. Different fonts will produce different pagination.

GregS
12-13-2007, 05:16 AM
I do not see that you have suggested a solution, only a method. So what do you count? I vote for paragraphs in standard books and stanzas in poems or lines in poems if appropriate.

I missed your post earlier.

XMLnamespace.I.29

Chapter I paragraph 29 the <p id = "I.29">

For reference it is easy to go XMLnamespace.I.29/2-/3

Sentence 2 to 3 (inclusive) in para 29 Chapter I.

If the XMLnamespace is truly unique then that reference can be located unambiguously.

The best thing is that IDs can be mixed.
XMLnamespace.I.29 [next element is an illustration]
XMLnamespace.illus.10
or as another form
XMLnamespace.1.illus1
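
As a rough illustration of the "simple script" idea, here is a minimal Python sketch that stamps chapter.paragraph ids of the form I.29 onto a document. The element names book, chapter and p, and the function names to_roman and assign_ids, are assumptions made up for this example; a real publication would use whatever its own schema defines, and the unique XMLnamespace prefix is left out.

```python
import xml.etree.ElementTree as ET


def to_roman(n):
    """Convert a positive integer to a Roman numeral (1 -> I, 4 -> IV)."""
    pairs = [(1000, "M"), (900, "CM"), (500, "D"), (400, "CD"),
             (100, "C"), (90, "XC"), (50, "L"), (40, "XL"),
             (10, "X"), (9, "IX"), (5, "V"), (4, "IV"), (1, "I")]
    out = []
    for value, numeral in pairs:
        while n >= value:
            out.append(numeral)
            n -= value
    return "".join(out)


def assign_ids(root):
    """Give every <p> inside every <chapter> an id like 'I.29'."""
    for chap_no, chapter in enumerate(root.iter("chapter"), start=1):
        for para_no, para in enumerate(chapter.iter("p"), start=1):
            para.set("id", f"{to_roman(chap_no)}.{para_no}")
    return root


# Tiny made-up document, only to show the numbering.
sample = ("<book><chapter><p>One.</p><p>Two.</p></chapter>"
          "<chapter><p>Three.</p></chapter></book>")
root = assign_ids(ET.fromstring(sample))
ids = [p.get("id") for p in root.iter("p")]
```

Paragraph numbering restarts in each chapter, so an id stays stable as long as the text of that chapter does not gain or lose paragraphs.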

Lexicon
12-13-2007, 03:16 PM
The problem, as has already been seen in HTML and CSS1 & 2, is that different implementations interpret the same code differently, so unless there is a fundamental agreement on using the same code base everywhere, it is condemned to being an unreliable ideal.

You seem to be saying that the problem here is that different devices display the same CSS differently. I don't believe that is the real problem here, the real problem is that designers expect the same kind of control of layout and presentation that they have traditionally wielded over printed materials.

I think designers need to loosen the reins a bit. When it comes to reflowable text displayed on a vast range of devices (each with displays of differing sizes and aspect ratios) it's simply not possible to have fine-grained control of appearance. Nor is it necessarily right that they should expect it - as the reader, should I not have a say in how the information is presented to me? That wasn't even an option with print books so it was never an issue for the consumer; times change though.

Most of the decisions on how to render content should be made by the device instead of the book creator, and devices should succeed or fail according to how well they make those decisions. The presentation abilities of an eReading device should be as much a feature as how many books it can hold or the physical size of the screen. As consumers we should be looking at which device renders our content best - a weak hyphenation algorithm should draw as much criticism as poor reliability or bad ergonomics.

RWood
12-13-2007, 04:21 PM
Project Gutenberg (for some books) puts in the page numbers of the original books that they used as reference. An example would be "then Tom{p 23} said, "

While this removes the variation between devices, it requires the reference to a specific edition of the original book. It also makes reading fiction a lot less interesting.

GregS
12-13-2007, 05:37 PM
You seem to be saying that the problem here is that different devices display the same CSS differently. I don't believe that is the real problem here, the real problem is that designers expect the same kind of control of layout and presentation that they have traditionally wielded over printed materials...

Lexicon, I am not disputing that the vast majority of what is read should behave exactly as you propose. Indeed I am thrilled with epub as a much needed standard that acts exactly in this way. A few tools would deal with problems such as en-dashes, curly quotes and the rest. Widow and orphan control is no biggy and can safely be left in the hands of display device software as it develops.

The real problem arises with non-fiction works where typographical elements are sometimes critical. It is a matter of horses for courses. Some works are simply better designed on a fixed page basis. PDF being based on a fixed media positioning language is proven in application.

It is early days, but I believe there is a way, at the user's end, to combine XML literature with mix-and-match stylesheets that produce PDF that fits the device being used, or indeed epub-type reflowed books.

This approach would be total overkill for most ebooks, epub and similar approaches work well, do their job, and are more than adequate. Plus they are relatively simple to create, which is important.

Think of the problem of producing a text, that requires a special type face (Ancient Sumerian), and has translated lines under each glyph that have to be kept in strict order regardless of the device being used. Add to this marginal notes, footnotes (which may be important to display on the same page at the time of reading and might occupy most of the page) and the simple CSS layout crumples. Even if CSS could do it, it would be a nightmare to create, and perhaps very slow to flow.

Reflowing text is a good thing, in general, but not always. I am proposing non-dynamic reflow that produces a fixed page ebook for different device display sizes. I am suggesting that it has a place, not as a single universal standard but as a potential special-purpose standard.

Of course creating the tools and the stylesheet language capable of doing this is a long way off. As far as I can work out it is doable and the tools for constructing it are already developed. Of course this is merely technical; design is critical if it is to be useful to book designers and users. This is for literature best marked up in something as complex as TEI; I emphasize "best marked up" - not everything.

HarryT
12-14-2007, 04:13 AM
Project Gutenberg (for some books) puts in the page numbers of the original books that they used as reference. An example would be "then Tom{p 23} said, "

While this removes the variation between devices, it requires the reference to a specific edition of the original book. It also makes reading fiction a lot less interesting.

I always strip out those page numbers when I create books, because I find that they interfere with the reading experience. Very easy to get rid of them in BD with RegEx search and replace.

Patricia
12-14-2007, 05:44 AM
I always strip out those page numbers when I create books, because I find that they interfere with the reading experience. Very easy to get rid of them in BD with RegEx search and replace.

Harry, is there any chance of your adding instructions on how to do this to your excellent BD tutorials?

HarryT
12-14-2007, 07:24 AM
For a page number of the form:

[Pg xxx]

set the search string to "\[Pg [0-9]*\]" - that is, the character "[", then "Pg ", then a sequence of characters in the range 0-9, then another character "]".

Set the replace string as empty, and check the "RegEx" box on the search dialog.
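
For anyone doing the same clean-up outside BD, here is a minimal Python sketch of the same search and replace. It uses the pattern above unchanged, plus a second pattern for the {p 23} style of marker mentioned earlier in the thread; the function name strip_page_markers is made up for this example, and other texts may use marker styles that neither pattern covers.

```python
import re

# The pattern from the post: "[Pg " + digits + "]".
PG_BRACKET = re.compile(r"\[Pg [0-9]*\]")
# Assumed second style, based on the "{p 23}" example in this thread.
PG_BRACE = re.compile(r"\{p [0-9]+\}")


def strip_page_markers(text):
    """Remove embedded page-number markers, leaving the prose untouched."""
    text = PG_BRACKET.sub("", text)
    return PG_BRACE.sub("", text)


cleaned = strip_page_markers("then Tom{p 23} said, [Pg 24]hello")
```

Note that [0-9]* also matches a marker with no digits at all, such as "[Pg ]"; use [0-9]+ if that is not wanted.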

GregS
12-14-2007, 12:47 PM
For a page number of the form:

[Pg xxx]

set the search string to "\[Pg [0-9]*\]" - that is, the character "[", then "Pg ", then a sequence of characters in the range 0-9, then another character "]".

Set the replace string as empty, and check the "RegEx" box on the search dialog.

Harry T, forgive my suggestion if it does not suit the format being discussed, but would it not be better to mark up page numbers either as attributes (<pg number = "xxx"/>) or elements (<pg>xxx</pg>) that could be hidden by a style sheet?
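
A minimal Python sketch of that alternative, assuming the source text uses the [Pg xxx] form and the target is XML: instead of deleting each marker, it is rewritten as an empty pg element carrying the number. The pg element is the hypothetical one from this post, not part of any existing standard, and the function name is made up.

```python
import re

# Capture the digits so they can be reused in the replacement.
PG = re.compile(r"\[Pg ([0-9]+)\]")


def mark_up_page_numbers(text):
    """Turn '[Pg 23]' into '<pg number="23"/>' instead of deleting it."""
    # A style sheet rule such as  pg { display: none }  could then hide it.
    return PG.sub(r'<pg number="\1"/>', text)


marked = mark_up_page_numbers("then Tom [Pg 23]said")
```

The page references survive in the file for anyone who needs to cite them, while ordinary readers never see them.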

JSWolf
12-14-2007, 01:34 PM
Can you actually tell the text size? Other than s,m,l which varies from book to book?

Dale
I'm not sure really.. But I'm guessing it goes up by 2 point sizes. But the problem is I don't have a reference size most of the time. So if small was 10 medium would be 12 and large would be 14.

JSWolf
12-14-2007, 01:35 PM
But it's NOT so easy on devices such as the CyBook where, not only do you have 12 available text sizes, but the user can install any font (family) they wish on the machine. Different fonts will produce different pagination.
Very true. Different fonts will have different results.

jeffw
12-15-2007, 05:57 PM
Hi gang,

I'm helping a friend with his eBook. Silly me, I had no idea what I had agreed to do.

I really thought the eBook production process would be more standardized these days vs. a few years ago.

Silly me, I was hoping for a one-size-fits-all solution. But I have discovered (by reading this thread) that there are many paths possible and not one of them is even close to being easy, clean or simple.

“I use this program to do this and then I use this program to do that and then I go and manually clean it up.” ... Ouch --- “This font face and font size for this device and this font face and font size for that ...” double ouch.

Though I'm a newbie to eBook publishing, I'm an old fart in PC cyberland. I go back to before VisiCalc. The problems or better said 'challenges' I see here …are all too common in our cyberland world.

I've seen it in Operating Systems, Apple DOS, IBM DOS, MS DOS, DR DOS, -- I've seen it in Word Processors, Spreadsheets, and Image Graphics. Sadly, it's part of the product development process. Each company believes that its approach is the best, and they are all correct -- but only to a certain extent. Sure they might do task A really great, but they fail at Task B.

But there is one main difference I see with eBooks vs. all the other devices and interpreters.

The eBook, as a medium, is still very much a child. People need the flexibility to bring their eBooks with them into the next generation of eBook reader devices. The eBook you own, formatted for a Palm, should be cleanly readable on a Kindle when you move over to the newest toy, or even on XYZ electronic device that comes out 3 years from now. I don't see this as a VHS to DVD issue. This is data, not storage media. Data is data.

For this reason I see the problem as lying with the reader devices, not the formatting software.

I would love to see a product ad like this.

The new STARR eBook reader from Borne comes with the following drivers/engines and can read the following eBook formats:

1) ePub
2) Mobipocket
3) Kindle
4) PDF A4
5) LIT
6) Sony Reader
7) iLiad
8) MS Word
9) HTML ver 1 to 6
10) Custom PDF
11) MS PP
12) Text to Speech (EN, FR, SP, IT....)

This would open the door for new writer formats and third-party software (drivers). Drivers/engines could be tweaked by manufacturers to work on their devices. The eBook device would come with a choice of drivers/engines, sort of like printer drivers. But instead of changing the driver to fit your equipment --- you would change the driver to fit the eBook format you own. Or maybe the device could even automatically discover the format and switch to the proper driver (requiring very little knowledge or understanding).
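
To give an idea of what the "discover the format" step might look like, here is a rough Python sketch that guesses a format from the first bytes of a file. The signatures are the commonly documented ones for PDF, LIT, Mobipocket and ePub as far as I know them, and should be checked against each format's own documentation; the function name sniff_format and the returned labels are invented for this example.

```python
def sniff_format(header: bytes) -> str:
    """Guess an ebook format from the first ~70 bytes of a file."""
    if header.startswith(b"%PDF-"):
        return "pdf"
    if header.startswith(b"ITOLITLS"):
        return "lit"
    # Palm database files keep their type/creator at offset 60.
    if header[60:68] == b"BOOKMOBI":
        return "mobipocket"
    if header.startswith(b"PK\x03\x04"):
        # An ePub is a zip whose first entry is an uncompressed file
        # named "mimetype" containing "application/epub+zip".
        if header[30:58] == b"mimetypeapplication/epub+zip":
            return "epub"
        return "zip"
    return "unknown"


# Hand-built headers, only to exercise the checks above.
fake_epub = b"PK\x03\x04" + b"\x00" * 26 + b"mimetypeapplication/epub+zip"
fake_mobi = b"\x00" * 60 + b"BOOKMOBI"
```

A device would then hand the file to whichever driver/engine matches the label, and fall back to asking the user when the answer is "unknown".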

Whew, well there is my wish list and just in time for Santa to see it.

I’m just not sure if my behavior was good enough this year for Santa to stop in here with gifts.

Thank you everyone for these posts. I learned more here in an hour than the six hours I spent reading the web last night.

kovidgoyal
12-15-2007, 06:02 PM
This thread is just an exercise in making mountains out of molehills. Just tell your friend to write directly in HTML or use editing software that outputs HTML and for all practical purposes he'll be just fine.

jeffw
12-15-2007, 06:11 PM
ROFL - WOW

Thank you, kovidgoyal. I like (love) simple and flexible approaches.

Hadrien
12-15-2007, 06:13 PM
Yes, you are right, it does. In fact, that's the exact pattern for GREP or basic TEXT replacements. The draft document we posted has already had the single word "I[space]" replaced with "I[nonbreaking space]" to illustrate the point. We simply haven't run the rest of the bad-boy offenders through their paces yet. ;)

As to the serif versus non-serif issue, again you are right. The common wisdom is to use non-serif fonts for screen displays and serif fonts for the text of printed displays. So, which category does the eBook fall into? Printed or screen? Or somewhere in between -- in which case, it is debatable which camp this technology falls into. We do, indeed, live in interesting times.

BTW, I see that Hadrien has responded with a post in this thread. I am most impressed with Feedbooks approach to this problem and their results are very good for the current state of the art. I have no doubt that as time goes on their algorithms will get better and better. At least I hope so! I love their approach to customizing the PDF creation with personalized fonts, margins, and headers/footers. Very user friendly. I can see a time when they will have customizable leading, hyphenation, paragraph styles, etc that will easily approach the kind of professional layout possible with InDesign -- but customizable on the fly. They are actually not that far from it now and clearly demonstrate a keen knowledge of the PDF architecture. So, if you are not familiar with their site, may I give them a plug?

Customizable leading, hyphenation and paragraph style could be possible too.

I'm wondering about something: currently, you're working on custom reg-exp in order to remove common typesetting mistakes. Are you planning on distributing these patterns ? It would be very interesting for everyone to create a database with all the useful patterns to fix common problems.
These could be used in any new software, and improve the overall look of every book available.

Maybe we should add a special wiki page for this ?
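
To make the idea concrete, here is a minimal Python sketch of what such a shared pattern list could look like. The three fixes shown (non-breaking space after single-letter words, curly apostrophes, curly double quotes) are deliberately naive illustrations and not the patterns anyone in this thread actually uses; the names PATTERNS and apply_patterns are made up.

```python
import re

NBSP = "\u00a0"  # non-breaking space

# Each entry: (description, compiled pattern, replacement).
# Order matters: opening quotes are fixed before the catch-all closing rule.
PATTERNS = [
    ("keep single-letter words with the next word",
     re.compile(r"\b([AaI]) (?=\w)"), r"\1" + NBSP),
    ("apostrophe inside a word",
     re.compile(r"(?<=\w)'(?=\w)"), "\u2019"),
    ("opening double quote (start of text or after whitespace)",
     re.compile(r'(?<!\S)"'), "\u201c"),
    ("closing double quote (whatever is left)",
     re.compile(r'"'), "\u201d"),
]


def apply_patterns(text):
    """Run every pattern over the text, in order."""
    for _description, pattern, replacement in PATTERNS:
        text = pattern.sub(replacement, text)
    return text


fixed = apply_patterns('"I am a fan," he said. It\'s fine.')
```

Keeping the description next to each pattern would let a wiki page and a program share the same list.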

As for the Feedbooks approach: I believe that it's important to have choice and flexibility. For both the Cybook and iLiad, Feedbooks already offers 2 choices: Mobipocket for something flexible, yet not as nice from a typesetting point of view, or PDF files for better typesetting/presentation but without the ability to change the font on the device. As soon as Sony adds Digital Editions support, it'll be the case too: epub for flexible formatting, PDF for fixed layout.

ath
12-17-2007, 11:28 AM
Yes, you are right, it does. In fact, that's the exact pattern for GREP or basic TEXT replacements. The draft document we posted has already had the single word "I[space]" replaced with "I[nonbreaking space]" to illustrate the point. We simply haven't run the rest of the bad-boy offenders through their paces yet. ;)

As long as the non-breaking space stretches and shrinks in the same way as an ordinary space does, fine. If it doesn't, the text will look weird.

In English typography, I can't ever remember seeing a rule against leaving a single-character word (I or a) at the end of the line. Where does that come from?

pruss
12-17-2007, 09:16 PM
Just add in the other fonts needed to make a true font family. That problem will then be solved. Full justified would not be much of an issue if we have hyphenation support. And it would also fix the spacing with larger text sizes as well.

Hyphenation looks better, I guess. But a bit of searching with Google and Google Scholar suggests that there is evidence that it somewhat retards reading speed. I am guessing that this is going to be particularly an issue on small screens.

(One might think that hyphenation will reduce the number of page turns, and thus improve reading speed. But I did a little experiment in fbreader, and there doesn't seem to be a significant difference in the number of page turns.)

Nicer italics is nice. I also suspect that slanted text, while uglier, is read faster, but I don't have data.

I suspect curly quotes do improve readability and hence are a plus.

llasram
12-18-2007, 12:40 PM
Admittedly, I am new to the entire eBook thing, but I love the concept! I do not, however, love the aesthetics. But, I am a publisher, and perhaps my eye is just trained too well for the current state of the art.

Although I haven't had formal training/experience with book production, I have had many of the same problems you have with ebooks automatically reformatted from reflowable formats.

There are certain typesetting and layout aesthetic conventions in publishing that are routinely missing from the eBooks I have seen. For example . . .

Widows (a single line or word at the top of a page)
Straight quotes (rather than curled quotes)
Straight apostrophes
Simple words at the end of line (e.g., "I", "a", etc.)
Words too widely spaced to fit justification
Slanted fonts rather than true italics


. . . just to name a few of the most obvious.

I'd like to add a minor point about widows and orphans. I've found that PDF rendering on the Reader unfortunately is less legible than BBeB rendering at the same font size -- text in PDF files appears significantly lighter than text in BBeB files. To overcome this, in the PDF files I've generated for the Reader I've used primarily 11pt fonts. I use very tight margins and a short header, but that still only leaves room for a 22-line text block. With such a short text block, widows and orphans are a frequent occurrence, and eliminating them by creating artificially shorter pages results in so many shorter pages as to create a visually "ragged" look. So I've found just leaving the widows and orphans as-is to be the lesser of two evils.
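
The trade-off described above can be seen in a toy paginator. The sketch below is not how the Reader, LaTeX, or InDesign actually breaks pages; it is a simplified greedy model with hypothetical names, where each paragraph is given only as a line count and a page is shortened whenever a break would strand a single line of a paragraph.

```python
def paginate(paragraph_lengths, page_lines=22, avoid_widows_orphans=True):
    """Return the number of lines placed on each page (toy greedy model)."""
    # One entry per line: (position within its paragraph, paragraph length).
    lines = [(pos, n) for n in paragraph_lengths for pos in range(n)]

    def bad_break(k):
        """True if starting a new page at line k strands a single line."""
        prev_pos, prev_len = lines[k - 1]
        next_pos, next_len = lines[k]
        # First line of a paragraph left alone at the bottom of the page.
        stranded_at_bottom = prev_len > 1 and prev_pos == 0
        # Last line of a paragraph left alone at the top of the next page.
        stranded_at_top = next_len > 1 and next_pos == next_len - 1
        return stranded_at_bottom or stranded_at_top

    pages = []
    start = 0
    while start < len(lines):
        end = min(start + page_lines, len(lines))
        if avoid_widows_orphans and end < len(lines):
            # Pull the break earlier until it no longer strands a line.
            candidate = end
            while candidate > start + 1 and bad_break(candidate):
                candidate -= 1
            if not bad_break(candidate):
                end = candidate
        pages.append(end - start)
        start = end
    return pages


if __name__ == "__main__":
    book = [5] * 20  # hypothetical book: twenty paragraphs of five lines each
    print(paginate(book, avoid_widows_orphans=False))
    print(paginate(book, avoid_widows_orphans=True))
```

With real text, where paragraph lengths vary widely, a 22-line block gives many more bad breaks than a taller page would, and so many more shortened pages.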

Here is a draft I'm working on that addresses these issues. I'm using InDesign CS3 to create the layout design using Gutenberg text and then outputting as a PDF.

I've been doing something similar with LaTeX. I've been lazy about cleaning up my class file and auto-formatting scripts, so they aren't ready for prime time, but here's an example of the kind of output I'm getting: Eastern Standard Tribe by Cory Doctorow (http://platypope.org/files/est.pdf).

Benefit here is all open source tools :-). (Well, except for the font... I've been using Adobe Caslon for that, but that's easily enough changed.)

JSWolf
12-24-2007, 05:51 AM
I've been doing something similar with LaTeX. I've been lazy about cleaning up my class file and auto-formatting scripts, so they aren't ready for prime time, but here's an example of the kind of output I'm getting: Eastern Standard Tribe by Cory Doctorow (http://platypope.org/files/est.pdf).

Benefit here is all open source tools :-). (Well, except for the font... I've been using Adobe Caslon for that, but that's easily enough changed.)

Your PDF does look good. But there is one slight gotcha with it. Some of the chapters start in the middle of the page. It would (to me) be preferable if the Chapters all started on a new page.

ath
12-25-2007, 02:35 AM
Some of the chapters start in the middle of the page. It would (to me) be preferable if the Chapters all started on a new page.

Chapters start in many ways: from the very open chapter that starts on odd-numbered pages only, to the very close chapter that carries a minimal indication of new chapter (some of Folio Society's books do chapters by four or five lines of space, followed by the chapter number as 2- or 3-line drop caps.) The latter was usually used when there were very many and short chapters, with only minimal chapter indicators, and the 'chapters on new page only' treatment broke up the text more than was desirable ... or when its avoidance helped keep the page count, hence the printing costs, down.

In this particular case, the choice of chapter treatment does not appear to be out of line with respect to the book: chapters seem pretty short, so keeping 'formal chapter indicators' (such as new pages, text 'chapter', and this and that much white space) fairly low seems motivated.

Far too many widow lines, the apparently gratuitous mixing of very disparate typefaces in the first line of every chapter, as well as several very bad hyphenations, grate more on the eye, I think.

As this is an automatically produced book, only the typeface issue can be fixed. Bad hyphenations can be avoided, but it usually takes a lot of manual work to do that, and widows almost always need manual tweaking of pages.