HTML/CSS for (German) s p a c e d o u t emphasis


frabjous
04-04-2010, 11:55 PM
I'm working on digitizing an old public domain text and converting it into a variety of formats. It contains both the German text and its English translation. The German side uses wide s p a c i n g for emphasis rather than italics.

I have a version in LaTeX format that uses the soul (http://ctan.org/pkg/soul) package for this purpose, and it works fine for creating PDF versions. But of course I want other formats too.

I've done most of the work converting this to XHTML, and the version I currently have uses the CSS letter-spacing property as follows:


.spacedout {
    letter-spacing: 0.3em;
}


and then in the text

Wird ein Zeichen nicht <span class="spacedout">gebraucht</span>, so ist es bedeutungslos.


But there are two problems with this.

1. It seems to be handled differently, and non-optimally, by most browsers and by calibre's viewer. Most, for example, seem to generate:


Wird ein Zeichen nicht g e b r a u c h t , so ist es bedeutungslos.


thus adding space after every character in the span, including the last one, in this case, adding an unnecessary and unsightly space before the comma.

If this were uniform for all browsers, I'd consider using a negative margin or padding amount at the end of the span to close that gap, but some browsers seem to generate:


Wird ein Zeichen nicht g e b r a u c h t, so ist es bedeutungslos.


instead already. (Which also gives me reason not to end the span one character early, which would be sloppy coding anyway.)

2. More importantly, this property is not supported in the ePub spec, and while some ePub viewers (such as calibre's) support it anyway, ADE doesn't, which makes it pretty useless inside an ePub.

So what are my options?
I could insert non-breaking spaces between the letters, but this seems messy, and would throw off dictionaries, searches, word counts, etc.

I could put tags around each letter and use margins or padding to push them apart. But this would be a pain, and may have similar drawbacks to the above.
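To make the search and word-count drawback of the non-breaking-space option concrete, here is a minimal Python sketch. The helper name space_out is made up for illustration; the sentence is the one from the example above.

```python
NBSP = "\u00a0"  # non-breaking space


def space_out(word):
    """Insert a non-breaking space between every pair of letters (hypothetical helper)."""
    return NBSP.join(word)


plain = "Wird ein Zeichen nicht gebraucht, so ist es bedeutungslos."
spaced = plain.replace("gebraucht", space_out("gebraucht"))

# A plain substring search no longer finds the word...
print("gebraucht" in spaced)
# ...and a whitespace-based word count changes, because Python's
# str.split() treats U+00A0 as whitespace.
print(len(plain.split()), len(spaced.split()))
# The original text can only be recovered by stripping the inserted characters.
print(spaced.replace(NBSP, "") == plain)
```

Real search tools and dictionaries may treat U+00A0 differently from Python, so this only illustrates the kind of breakage to expect.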

Can anyone think of anything else?

Jellby
04-05-2010, 08:24 AM
For ePUB, I would forget about it, at least until it is supported in the spec. Meanwhile, use other methods for emphasis, like italics, bold, underline... Of course, once it is supported, you can easily change the CSS.

As for "workarounds", I'd avoid them and code the style for browsers/readers that work properly (those that don't add a space after the last letter). At most, you could use ADE's conditional styles for it.

Steven Lyle Jordan
04-05-2010, 09:26 AM
I would simply convert the text to ePub-compliant italics, unless there is some particular reason you can't simply use italics (or bolding) for emphasis. Is it exclusively a cultural thing, or does it somehow impact the output or ability to read the document? Does it distort the meaning? If not: When in doubt, go for compliance, I always say. (And contact the IDPF, if you think this is something they should address.)

Alternatively, you could try inserting a d-a-s-h after each character (except the last, which would solve your comma issue). I don't know how screen readers, dictionaries, etc, would be able to handle that, though.

frabjous
04-05-2010, 10:47 AM
Thanks for the suggestions. This isn't my work; it's a philosophical "classic", so to speak (Wittgenstein's Tractatus). My target audience is academics who are very fussy about maintaining the original as best as possible, so the choice of style isn't really mine to make. Indeed, the reason for the mixed German/English is to allow for easy hyperlinking between the original German and the English so that the translation can be scrutinized and checked. No doubt someone has already written, or will soon write, a paper about the exact pattern of emphasis in the book.

But if ePub won't do it properly, my hands are tied. I guess I'd have to switch it to italics, and then just add a big bold note at the beginning of the ePub making note of the change so no one complains.

Steven Lyle Jordan
04-05-2010, 11:20 AM
> This isn't my work; it's a philosophical "classic", so to speak (Wittgenstein's Tractatus)...
>
> But if ePub won't do it properly, my hands are tied. I guess I'd have to switch it to italics, and then just add a big bold note at the beginning of the ePub making note of the change so no one complains.

Yeah, I had a feeling it might be like that. I agree with switching to italics and placing a note before the text in explanation ("Hey, it's ePub's fault, not mine!"), so they'll understand why you did it--even if they don't fully appreciate it.

Jellby
04-05-2010, 11:20 AM
I believe German uses spaced-out letters because traditionally it was written with blackletter type, and it's tricky to use italic or bold face with that. The Wikipedia article for "emphasis" (http://en.wikipedia.org/wiki/Emphasis_(typography)#Letterspacing) confirms my belief. I have, however, seen spaced-out text with roman type too, in a Swedish book, for instance.

Now that I think of it, it's probably a good idea to add spaces before and after the spaced-out text, at least if there are no punctuation signs there, as in this German example (http://de.wikipedia.org/wiki/Sperrsatz).

The Distributed Proofreaders wiki gives this advice:

em.gesperrt {
    font-style: normal;
    font-weight: normal;
    letter-spacing: .2em;
    padding-left: .2em;
}

(exact letter spacing to taste). The padding is so gesperrt words won't be off-center. Further complications arise when a paragraph starts with a gesperrt word, or in the rare browsers that handle letter-spacing differently, but this will work in most situations.

Still, your problem is the ePUB spec not supporting letter-spacing.

frabjous
04-05-2010, 03:37 PM
Well, even if it won't work in the ePub, the padding-left suggestion is a good one for the pure (x)html version I'm planning on distributing as well. Balanced is best. I can probably check easily enough if it begins a sentence. As for browser differences, I think just IE is different (as is so often the case).

charleski
04-05-2010, 05:01 PM
If you have access to a font-manipulation program, create a new font with extra spacing around the characters and then embed that. You should be able to solve the punctuation problem by shifting the offset of commas and full-stops etc so that they will align correctly with the preceding letter.

Letter-spacing is rare enough in English that I just wrap characters in their own spans for this sort of stuff.

frabjous
04-05-2010, 08:16 PM
Hmm. That extra embedded font thing is an interesting idea. I have fontforge installed, but the only things that I've ever used it for are converting font formats, and extracting a certain glyph to SVG. I have no idea how difficult it would be to change the spacing in a font, but it might be worth a try. Thanks for the suggestion!

charleski
04-06-2010, 07:50 AM
A quick look at the fontforge docs shows it has commands to set the left and right sidebearings. You'd just need to go through the characters and select the 'Set LBearing' and 'Set RBearing' commands from the Metrics menu. For punctuation coming after a character (full-stop, comma, etc) you'd then decrease the LBearing value by a similar amount.
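The same sidebearing edit could also be scripted rather than done through the Metrics menu. What follows is only a rough sketch using FontForge's Python module, not something tested against a real font: the padding of 60 font units, the file paths, the "Spaced" name suffix and the punctuation list are all assumptions, and respace_glyph is a made-up helper that works on any object with unicode, left_side_bearing and right_side_bearing attributes (FontForge glyph objects have these).

```python
from types import SimpleNamespace

# Assumed list of punctuation that should sit close to the preceding letter.
TRAILING_PUNCTUATION = ",.;:!?"


def respace_glyph(glyph, pad):
    """Adjust one glyph's sidebearings in place."""
    if glyph.unicode < 0:
        return  # unencoded glyph: leave it alone
    char = chr(glyph.unicode)
    if char in TRAILING_PUNCTUATION:
        # Pull punctuation back by the amount added to the letter before it.
        glyph.left_side_bearing -= pad
    elif char.isalpha():
        # Push letters apart on both sides.
        glyph.left_side_bearing += pad
        glyph.right_side_bearing += pad


def build_spaced_font(src_path, dst_path, pad=60):
    """Write a spaced-out copy of a font (requires FontForge's Python module)."""
    import fontforge  # imported here so respace_glyph can be tried without it

    font = fontforge.open(src_path)
    for glyph in font.glyphs():
        respace_glyph(glyph, pad)
    # Rename so the copy does not clash with the unmodified font.
    font.familyname += " Spaced"
    font.fontname += "-Spaced"
    font.generate(dst_path)


# Dry run on a stand-in object instead of a real glyph.
demo = SimpleNamespace(unicode=ord("g"), left_side_bearing=40, right_side_bearing=50)
respace_glyph(demo, 60)
print(demo.left_side_bearing, demo.right_side_bearing)
```

Glyphs built from references (accented letters, for instance) may need separate handling, so the result should be checked in FontForge before embedding.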

Jellby
04-06-2010, 08:36 AM
The CSS2 specification (http://www.w3.org/TR/CSS2/text.html#spacing-props) does not say what should happen with the space before/after a block marked with letter-spacing. It even says "character spacing algorithms are user agent-dependent", so it's a sort of gray area. What it does say, however, is "when the resultant space between two characters is not the same as the default space, user agents should not use ligatures". This goes against the usual practice in German, as in the example I linked to in post #6. Of course, ligatures are rarely used by XHTML renderers anyway :rolleyes:

Did you try with Prince, frabjous?

frabjous
04-06-2010, 12:37 PM
> A quick look at the fontforge docs shows it has commands to set the left and right sidebearings. You'd just need to go through the characters and select the 'Set LBearing' and 'Set RBearing' commands from the Metrics menu. For punctuation coming after a character (full-stop, comma, etc) you'd then decrease the LBearing value by a similar amount.

Thanks! I'll try it out when I get a chance.


> Did you try with Prince, frabjous?

No. Do you mean just to see how it handles the letter-spacing attribute? I mentioned in the opening post that I began with a LaTeX version of the file, and so I already have a PDF version I'm happy with.

Newer versions of Firefox do automatic ligature substitution with OpenType fonts. I'll check how it handles letter-spacing, but I'm fairly sure it does it properly, or I'd have noticed by now.

Jellby
04-06-2010, 03:07 PM
> Newer versions of Firefox do automatic ligature substitution with opentype fonts. I'll check how it handles it, but I'm fairly sure it does it properly, or I'd have noticed by now.

Do you mean "properly" as the CSS spec says (no ligatures in spaced-out text), or as Germans like their texts (keep ligatures anyway)? My point is that letter-spacing may not be the typographically optimal option; maybe it is better to replace it with italics or similar...

It just occurred to me to try a <span> around every letter (or ligature), styling this span with left and right margins. That should work pretty much everywhere; I don't think the <span>s add breakpoints or disturb text searching. This, of course, is only worth doing if there are just a few instances of spaced-out text.

frabjous
04-06-2010, 06:52 PM
> Do you mean "properly", as the CSS spec says (no ligatures in spaced-out text), or as Germans like their texts (keep ligatures anyway)? My point is that maybe the letter-spacing is not the typographically optimal option, maybe it is better to substitute it by italics or similar...

I meant per the CSS spec. I'd only want them kept together if I were using blackletter fonts, but I'm not (nor did my original), and if I were I doubt I'd trust the renderer with the ligatures. I'd probably swap in the glyphs in the actual document, searchability be damned. Luckily that's not an issue.

The original publisher, even of the German text, was in England, so maybe that's why. Or maybe it was just past that period.

But apparently I was wrong about how smart Firefox is about this, although the results were strange. Testing this document:


<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
<style type="text/css">

body
{
font-family: 'Sorts Mill Goudy';
font-size: 200%;
letter-spacing: 0.2em;
}

</style>
<title>LigTest</title>
</head>
<body>
My affinity for affluent fish living in fjords offended the flora.
</body>
</html>


I got different results depending on the font-size. (And not just the font-size selected by the CSS, but it even changed when I used Ctrl-minus and Ctrl-= to zoom in and out.)

Have a look see.

http://www.mobileread.com/forums/attachment.php?attachmentid=49358&stc=1&d=1270594304

versus

http://www.mobileread.com/forums/attachment.php?attachmentid=49359&stc=1&d=1270594304

(This is from the same file, just zooming in and out.)

> It just occurred to me trying with a <span> around every letter (or ligature), and styling this span with left and right margins... that should work pretty much everywhere, I don't think the <span>s add breakpoints or disturb text searching. This, of course, is only worth doing if there are just a few instances of spaced-out text.

I tested the suggestion of using separate span tags with margin-left and margin-right, and the results were better than I expected. Every viewer I tried was smart enough not to break at the wrong point, and the text was still searchable. Unfortunately, I have hundreds of emphasized passages in the document. It shouldn't be too hard to come up with a RegEx to insert them so I don't have to do them by hand, but it's still going to end up with messy code. I'll have to compare what adds more to my filesize too: inserting all these, or embedding another font per charleski's suggestion.
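For the record, the kind of RegEx pass I have in mind would look roughly like this in Python. It is a minimal sketch, not a finished tool: it assumes the emphasized passages are marked exactly as <span class="spacedout">...</span> with only plain text inside (no nested tags, no entities like &amp;), and the per-letter class name "sp" is just a placeholder.

```python
import re

# Matches the existing emphasis markup: <span class="spacedout">...</span>
SPACED = re.compile(r'<span class="spacedout">(.*?)</span>', re.DOTALL)


def wrap_letters(html, letter_class="sp"):
    """Replace each spacedout span with one small span per character.

    Whitespace inside the passage is left unwrapped so that line
    breaking between words still works as usual.
    """
    def expand(match):
        return "".join(
            ch if ch.isspace() else f'<span class="{letter_class}">{ch}</span>'
            for ch in match.group(1)
        )

    return SPACED.sub(expand, html)


sample = 'nicht <span class="spacedout">gebraucht</span>, so ist es bedeutungslos.'
print(wrap_letters(sample))
```

The matching CSS would then put the margin-left and margin-right on the per-letter class instead of using letter-spacing.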

Jellby
04-07-2010, 04:35 AM
> I'll have to compare what adds more to my filesize too: inserting all these, or embedding another font per charleski's suggestion.

Embedding another font is probably more robust and "cleaner": it's like adding another font family/variant next to italic, bold, etc. The downside is that it works only if a device supports embedded fonts, and unless you also define (and embed) a font for the main text, you risk the spaced-out font being wildly different from the main one (and the user can't change it).

However, in case you need it for blackletter fragments, which themselves include normal and emphasized text, I think the embedded font would be the way to go, and I'd define a spaced-out variant as the "italic" of the blackletter. If the German text uses a different font (even if it's not blackletter), that would be a way too.

frabjous
04-07-2010, 11:20 PM
I made a modified version of TeX Gyre Schola (http://www.gust.org.pl/projects/e-foundry/tex-gyre/schola/index_html) (LPPL) in fontforge, and embedded both the regular version and the modified version in an ePub. Editing the font was easy, since you can set the LBearing/RBearing of an entire span of glyphs in the font with a single command in fontforge. So it only took a couple minutes.

Here's the result. Looks pretty good. There's a minor problem with words at the end or the beginning of a line: see the k e i n e at the end of line 2 and the U r s a c h e at the beginning of line 3, and how they're not flush with the edges. But I'm fairly sure I'd have the same problem with spans around each character and added margins. I think I can live with that.

http://www.mobileread.com/forums/attachment.php?attachmentid=49442&stc=1&d=1270696659

The real problem is with ePub renderers that don't support embedded fonts. I gather this includes the damned iPad! I suppose I could define this font as the italic version of a new font family, so if it has to fall back, it'll fall back on italics, but I seem to remember ADE having a problem whereby it would oblique-ify already italic fonts if they're set as font-style: italic;, resulting in an ugly mess--is my memory off on that?

Thanks again to Charleski for the idea...

Jellby
04-08-2010, 05:59 AM
> but I seem to remember ADE having a problem whereby it would oblique-ify already italic fonts if they're set as font-style: italic;, resulting in an ugly mess--is my memory off on that?

It did the equivalent thing with bold fonts last time I tried, so I bet you are right.

charleski
04-08-2010, 08:35 AM
> I seem to remember ADE having a problem whereby it would oblique-ify already italic fonts if they're set as font-style: italic;, resulting in an ugly mess--is my memory off on that?

You certainly frightened me there, but it seems this is not the case, thankfully. Here's a line from a book which has Bodoni Twelve OS Book Italic embedded and defined as font-style: italic.

http://www.mobileread.com/forums/attachment.php?attachmentid=49456

It doesn't look as if ADE is adding any extra angle to the axis, at least in the desktop version. I think you should be safe using standard italic as a fall-back for readers that take the short bus to school, like iBooks (hopefully this was a result of rushing the hardware out before having the OS properly in place and version 4 will fix it).

I hadn't anticipated the problem with line endings. I don't think it looks too bad in the context. You could probably fix it by manipulating the font using the kerning table instead of the sidebearings, so that extra space would be inserted between letters but not between a letter and a space and the size of the letter would remain unchanged. This would be a lot of work though, and you'd have to trust that the reader software will use the kerning table properly.

Valloric
04-14-2010, 07:49 AM
> but I seem to remember ADE having a problem whereby it would oblique-ify already italic fonts if they're set as font-style: italic;, resulting in an ugly mess--is my memory off on that?

I think you may be confusing it with PrinceXML, which does have this problem (or something similar). Jellby and I talked about it in his epub2pdf thread (http://www.mobileread.com/forums/showthread.php?t=62939).

frabjous
04-14-2010, 09:18 AM
Yeah, that could be it. Anyway, everything seems to be working just fine for me with the spaced-out font (apart from the slight edges issue mentioned earlier, which I can live with).

The only problem I'm having now is that embedding this font forces me to embed the corresponding font for the main text, along with bold/italic/bold-italic versions of it. That is generally fine, except that calibre's viewer won't support multiple font variants from the same family (a bug in Qt WebKit, apparently). But I'm not sure too many people use calibre's viewer.

GMcG
05-15-2011, 04:08 AM
My understanding is that letter-spacing means a space following each letter.
So the last letter of the g e s p e r r t word gets a space too, which creates the gap before the following comma.
You should exclude the last letter of the word from the span. It is still spaced from the preceding letter, because that letter has spacing after it, but there is no spacing between the last letter and the comma.

Instead of
<span style="letter-spacing:3pt">example</span>, xxxxxx
use
<span style="letter-spacing:3pt">exampl</span>e, xxxxxx

George

frabjous
05-16-2011, 04:52 PM
Read the first post: ADE doesn't even support the letter-spacing CSS property.