Old 10-09-2017, 05:49 AM   #1
AlanHK
Addict
Posts: 344
Karma: 1002
Join Date: Apr 2014
Device: PW-3, Android phone
break/no-break and other spaces

I like to replace ellipses with spaced periods, since I don't like the usual ellipsis glyph, and it doesn't allow variations like . . . . or . . . ? or . . . !
I also put a space between nested quote marks, which otherwise look like a triple mark: ’” becomes ’ ” with the space.


I see some books just use a normal space, but that allows a line wrap to occur there, which should never happen.
So I have been using &nbsp;
Apart from being no-break, it acts the same as a normal space, so it stretches or compresses when the text is justified, and that sometimes looks odd.
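
For anyone who wants to script that substitution, here is a minimal sketch in Python. The function name `space_ellipses` is hypothetical, and the sketch assumes that no space is wanted before the first period and that a trailing ? or ! should also be joined with a no-break space. Adjust to taste.

```python
import re

NBSP = "\u00a0"  # U+00A0 NO-BREAK SPACE

# Either an ellipsis glyph (optionally next to one period) or a run of 3+ periods,
# optionally followed by a question mark or exclamation mark.
ELLIPSIS = re.compile(r"(\.?\u2026\.?|\.{3,})([?!])?")

def space_ellipses(text):
    """Replace ellipses with periods joined by no-break spaces."""
    def repl(match):
        # An ellipsis glyph counts as three periods, a plain period as one.
        count = sum(3 if c == "\u2026" else 1 for c in match.group(1))
        tail = match.group(2) or ""
        # str.join puts one NBSP between every pair of characters.
        return NBSP.join("." * count + tail)
    return ELLIPSIS.sub(repl, text)

print(space_ellipses("Wait\u2026 what...?"))
```

Because every gap is a no-break space, the whole run stays on one line, but the gaps still stretch with justification as described above.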

I just looked at a Random House epub that used thin spaces: &thinsp;
Which looks better I think. However, is it treated as a no-break space, in all formats -- epub and Kindle?


While looking into this, I found this list of 17 Unicode space characters:
http://www.fileformat.info/info/unic...ry/Zs/list.htm

U+0020 SPACE
U+00A0 NO-BREAK SPACE
U+1680 OGHAM SPACE MARK
U+2000 EN QUAD
U+2001 EM QUAD
U+2002 EN SPACE
U+2003 EM SPACE
U+2004 THREE-PER-EM SPACE
U+2005 FOUR-PER-EM SPACE
U+2006 SIX-PER-EM SPACE
U+2007 FIGURE SPACE
U+2008 PUNCTUATION SPACE
U+2009 THIN SPACE
U+200A HAIR SPACE
U+202F NARROW NO-BREAK SPACE
U+205F MEDIUM MATHEMATICAL SPACE
U+3000 IDEOGRAPHIC SPACE

Are all these valid in ebooks?
I assume that only the first two are elastic in size, is that correct?
And aside from nbsp, which are no-break?

Apologies if this is a FAQ, please link if there is such.

Last edited by AlanHK; 10-09-2017 at 05:51 AM.
Old 10-09-2017, 08:27 AM   #2
JSWolf
Resident Curmudgeon
Posts: 50,098
Karma: 43266947
Join Date: Nov 2006
Location: Roslindale, Massachusetts
Device: Kobo Aura H2O, Sony PRS-650, Sony PRS-T1, nook STR, iPad 4, iPhone 5
If you make a sample ePub, I'll test it with ADE 2.0.1.
Old 10-09-2017, 12:16 PM   #3
Notjohn
mostly an observer
Posts: 1,102
Karma: 480516
Join Date: Dec 2012
Device: Kindle
I don't like the glyph, either, but to my eye an ebook doesn't look right with spaced periods. I use space / three dots / space for an interrupted sentence, and end sentences with four dots (i.e., one full stop followed by the three dots of the ellipsis).

For the print edition, I revert to the traditional spacing, while ensuring an ellipsis is never broken at the end of a line.

Be aware that under your plan you would have to use a normal space following a three-dot ellipsis, for fear of forcing hyphenation where you don't want it.
Old 10-09-2017, 12:44 PM   #4
JSWolf
I prefer the ellipsis character with no space. That solves the problem.
Old 10-09-2017, 02:26 PM   #5
RbnJrg
Guru
Posts: 786
Karma: 2739425
Join Date: Mar 2013
Location: Rosario - Santa Fe - Argentina
Device: Kindle 4 NT
Try this:

1. In your .css stylesheet:

Code:
.nowrap {
    text-indent: 0;
    display: inline-block;
}
2. In your .xhtml file:

Code:
<p>Nullam ut massa rutrum dolor placerat tempor accumsan eget <span class="nowrap">purus.&thinsp;.&thinsp;.</span></p>
As you can see, you must wrap the word together with its ellipsis in a span with the class "nowrap". It works fine with ADE 2.x, 3.x and 4.x.

Regards
Rubén
Old 10-09-2017, 03:41 PM   #6
AlanHK
Quote:
Originally Posted by Notjohn View Post
Be aware that under your plan you would have to use a normal space following a three-dot ellipsis, for fear of forcing hyphenation where you don't want it.
Yes, that's what I do.

Quote:
Originally Posted by RbnJrg View Post
Try this:
That would work, but at the cost of making the code more complex.

I guess if you think that's necessary then thinsp is otherwise a breaking space? I'll stay with nbsp if so.

Last edited by AlanHK; 10-09-2017 at 03:44 PM.
Old 10-09-2017, 05:42 PM   #7
BetterRed
null operator
Posts: 9,196
Karma: 7797387
Join Date: Mar 2012
Location: Sydney Australia
Device: none
Whitespace character - Wikipedia

BR
Old 10-09-2017, 07:04 PM   #8
RbnJrg
Quote:
Originally Posted by AlanHK View Post
Yes, that's what I do.


That would work, but at the cost of making the code more complex.

I guess if you think that's necessary then thinsp is otherwise a breaking space?
Yes, thinsp is a breaking space. But if you work in Sigil, you can create a clip that encloses the word and its ellipsis (periods separated by thinsp) together. Then you can apply the code in a blink (just a click of the mouse).
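
Outside Sigil, the same wrapping could be scripted. The following is only a sketch, assuming Python, the exact `&thinsp;` entity markup and `nowrap` class from post #5, and a hypothetical helper name `wrap_ellipses`. It uses a regular expression, not a real XHTML parser, so test it on a copy of the book first.

```python
import re

NOWRAP_OPEN = '<span class="nowrap">'

# A word followed by three or more periods separated by &thinsp; entities.
PATTERN = re.compile(
    r"(?<![^\s<>])"                          # only start at the beginning of a word
    r"(?<!" + re.escape(NOWRAP_OPEN) + r")"  # skip words that are already wrapped
    r"[^\s<>]+?\.(?:&thinsp;\.){2,}"         # word, period, then &thinsp;. repeated
)

def wrap_ellipses(xhtml):
    """Wrap each word-plus-ellipsis in a span so a renderer keeps them together."""
    return PATTERN.sub(lambda m: NOWRAP_OPEN + m.group(0) + "</span>", xhtml)

print(wrap_ellipses("<p>accumsan eget purus.&thinsp;.&thinsp;.</p>"))
```

Running it twice on the same text leaves already-wrapped words alone, so it should be safe to re-run after further edits.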
Old 10-09-2017, 07:26 PM   #9
BetterRed
Quote:
Originally Posted by RbnJrg View Post
Yes, thinsp is a breaking space. But if you work in Sigil, you can create a clip that encloses the word and its ellipsis (periods separated by thinsp) together. Then you can apply the code in a blink (just a click of the mouse).
Curious - could you use a figure space or a non-breaking thin space between the dots? I've used the former for things like telephone numbers or part numbers in blog posts.

BR

Last edited by BetterRed; 10-09-2017 at 07:34 PM. Reason: added last phrase
Old 10-09-2017, 08:54 PM   #10
RbnJrg
Quote:
Originally Posted by BetterRed View Post
Curious - could you use a figure space or a non-breaking thin space between the dots? I've used the former for things like telephone numbers or part numbers in blog posts.

BR
Even with non breaking thin space between the dots, you need to enclose the dots with the preceding word (by applying the respective style) to avoid things like

Code:
some words here
...
With the style "display: inline-block;" you would get

Code:
some words
here...
Regards
Rubén
Old 10-10-2017, 01:35 AM   #11
BetterRed
Quote:
Originally Posted by RbnJrg View Post
Even with non breaking thin space between the dots, you need to enclose the dots with the preceding word (by applying the respective style) to avoid things like

Code:
some words here
...
With the style "display: inline-block;" you would get

Code:
some words
here...
Regards
Rubén
gotcha - ta

BR
Old 10-10-2017, 06:35 AM   #12
Tex2002ans
Guru
Posts: 889
Karma: 5454725
Join Date: Jul 2012
Device: Nook
Quote:
Originally Posted by AlanHK View Post
I like to replace ellipses with spaced periods, since I don't like the usual ellipsis glyph, and it doesn't allow variations like . . . . or . . . ? or . . . !
I agree. These edge cases are why I also avoid using the ellipsis character.

It is also too common to run across fonts where the three-periods+spacing in the ellipsis looks vastly different than the single period:

.… (PERIOD + ELLIPSIS)
.... (FOUR PERIODS)

(The original post showed this pair set in Arial Narrow, Courier New, Garamond, Verdana, and Georgia.)


Quote:
Originally Posted by AlanHK View Post
I also put a space between nested quote marks, which otherwise look like a triple mark: ’” becomes ’ ” with the space.
I see some books just use a normal space, but that allows a line wrap to occur there, which should never happen.
So I have been using &nbsp;
Apart from being no-break, it acts the same as a normal space, so it stretches or compresses when the text is justified, and that sometimes looks odd.

I just looked at a Random House epub that used thin spaces: &thinsp;
Which looks better I think. However, is it treated as a no-break space, in all formats -- epub and Kindle?
  • Typographically, the correct space between inner/outer quotes would be a THIN SPACE (or more rarely, a HAIR SPACE).
  • Depending on the tools at hand, it might be better/easier to use a NO-BREAK SPACE. For maximum compatibility, this is the choice to go with.
  • Ultimately, that minor spacing issue would be something handled by kerning tables in the fonts themselves OR handled by the rendering software. So your source would say ’” and the renderer would pop out ’ ”.

Side Note: Things also get more complicated with language-/country-specific rules. For example, in French, they may use a NARROW NO-BREAK SPACE between opening/closing guillemets... but in Canadian French, a THIN SPACE. (See for example, LibreOffice's article explaining substituting in more compatible spaces, "Non Breaking Spaces Before Punctuation In French")

Quote:
Originally Posted by AlanHK View Post
While looking into this, I found this list of 17 Unicode space characters:
http://www.fileformat.info/info/unic...ry/Zs/list.htm

[...]

Are all these valid in ebooks?
Not really. The best-supported whitespace characters are SPACE and NO-BREAK SPACE. Anything outside of that will be in fewer fonts, and may be more prone to trouble (either showing a "missing glyph" box or not rendering properly).

The next most common character would probably be the THIN SPACE, because that is officially used in a heck of a lot of languages (French). But again, may not render/display properly, so a NO-BREAK SPACE is a valid substitute.
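
If you want that fallback applied mechanically, here is a minimal sketch in Python. Which characters to map is my assumption (thin, hair, and narrow no-break spaces), and `to_safe_spaces` is a hypothetical name. Note that mapping THIN SPACE to NO-BREAK SPACE also turns a breaking space into a non-breaking, stretchable one.

```python
NBSP = "\u00a0"  # U+00A0 NO-BREAK SPACE

# Rarer spaces that some fonts or renderers may lack, mapped to the safer fallback.
FALLBACK = str.maketrans({
    "\u2009": NBSP,  # THIN SPACE
    "\u200a": NBSP,  # HAIR SPACE
    "\u202f": NBSP,  # NARROW NO-BREAK SPACE
})

def to_safe_spaces(text):
    """Replace the rarer space characters with NO-BREAK SPACE."""
    return text.translate(FALLBACK)

# Nested closing quotes separated by a thin space come out separated by a no-break space.
print(repr(to_safe_spaces("\u2019\u2009\u201d")))
```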

Many of the other "fixed-width spaces" like the EN QUAD, EM QUAD, EN SPACE, EM SPACE, [...] were mostly used for backwards compatibility with Xerox's standard character encoding... these SHOULD NOT be used for manual spacing in modern documents.

Side Note: The only time these would be used in modern documents is in the VERY RARE case of Mathematics. See this fantastic post on the LaTeX Stack Exchange about using the proper spacing in Mathematics (also references the fantastic book, "Mathematics into Type").

Side Note #2: The fixed-width spaces were also measurements way back when things were manually typeset (think shoving metal boxes onto a rod). Putting them into documents now would be like manually pressing Enter at the end of each line. It is POSSIBLE, but strongly discouraged. :P It would probably cause a lot more harm than good.

Side Note #2.5: Hmmm... I would also be interested to test Text-to-Speech and see if these weird spaces might confuse it.

Quote:
Originally Posted by AlanHK View Post
And aside from nbsp, which are no-break?
These are considered No-Break:

NO-BREAK SPACE
NARROW NO-BREAK SPACE

See "Unicode Line Breaking Algorithm" (Unicode Standard Annex #14):

https://www.unicode.org/reports/tr14/

(UAX #14 puts FIGURE SPACE in the same non-breaking class as those two.)

If you take a look at Table 1, they give all the line-breaking categories + recommended rules. And breakdowns of each category.
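
As a quick cross-check, here is a minimal Python sketch. Python's standard `unicodedata` module does not expose the UAX #14 line-break classes, so this looks at a different property instead: the `<noBreak>` compatibility tag in the character database. Treat it as a rough proxy, not as the line-breaking algorithm itself. The helper names are hypothetical.

```python
import unicodedata

def space_separators():
    """Return every character in general category Zs (Space_Separator)."""
    return [chr(cp) for cp in range(0x110000)
            if unicodedata.category(chr(cp)) == "Zs"]

def is_no_break(ch):
    """True if the character database tags ch with a <noBreak> decomposition."""
    return unicodedata.decomposition(ch).startswith("<noBreak>")

# List each space separator and mark the ones tagged as no-break.
for ch in space_separators():
    flag = "no-break" if is_no_break(ch) else ""
    print(f"U+{ord(ch):04X} {unicodedata.name(ch):28} {flag}")
```

The characters it flags are NO-BREAK SPACE, FIGURE SPACE, and NARROW NO-BREAK SPACE.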

But these are RECOMMENDATIONS, not what the renderers WILL do. For example, if you take a look at my Post #48, I came up with 3 test cases that broke a THIN SPACE differently. I didn't test on ereaders specifically, but I did test on Word/LibreOffice/Notepad++, InDesign, Firefox/Chrome/IE. Some rendered it as non-breaking, others rendered it as breaking, some added a break between punctuation, and others did not. I bet ereaders are an even more giant mess when dealing with these rarer spaces.

Quote:
Originally Posted by AlanHK View Post
I assume that only the first two are elastic in size, is that correct?
Generally correct. To quote the "Unicode Line Breaking Algorithm" above:

Quote:
When expanding or compressing interword space according to common typographical practice, only the spaces marked by U+0020 SPACE and U+00A0 NO-BREAK SPACE are subject to compression, and only spaces marked by U+0020 SPACE, U+00A0 NO-BREAK SPACE, and occasionally spaces marked by U+2009 THIN SPACE are subject to expansion. All other space characters normally have fixed width. When expanding or compressing intercharacter space, the presence of U+200B ZERO WIDTH SPACE or U+2060 WORD JOINER is always ignored.
or a different part of the Unicode standard:

Quote:
The fixed-width space characters (U+2000..U+200A) are derived from conventional (hot lead) typography. Algorithmic kerning and justification in computerized typography do not use these characters. However, where they are used, as, for example, in typesetting mathematical formulae, their width is generally font-specified, and they typically do not expand during justification. The exception is U+2009 THIN SPACE, which sometimes gets adjusted.
... but you always have odd cases (like Monospaced fonts)... or fonts that don't have correct spaces... or cases where other layers above which may take priority over Unicode itself (like CSS or font kerning).

Last edited by Tex2002ans; 10-10-2017 at 07:20 AM.
MobileRead.com is a privately owned, operated and funded community.