#1 |
Resident Curmudgeon
Posts: 78,986
Karma: 144284074
Join Date: Nov 2006
Location: Roslindale, Massachusetts
Device: Kobo Libra 2, Kobo Aura H2O, PRS-650, PRS-T1, nook STR, PW3
|
Non-breaking space
I thought I had seen it all when it comes to bad practices in making an eBook. Well, today I found something I had not (knowingly) come across before: an eBook with non-breaking spaces in place of regular spaces, which throws off the formatting and makes nice big gaps between words.
I kind of knew it had to be non-breaking spaces when the gaps were big enough to fit the word that started the next line without a problem. I then had a look at the XML file and yes, there are non-breaking spaces there. So why would non-breaking spaces be put in an eBook where they have no need to be there? This is a publisher original and not some download from some unknown source.
#2 |
frumious Bandersnatch
Posts: 7,543
Karma: 19001583
Join Date: Jan 2008
Location: Spaniard in Sweden
Device: Cybook Orizon, Kobo Aura
|
Some crappy automatic conversion, maybe a layout-preserving OCR?
|
#3 |
Resident Curmudgeon
|
No idea. But when I was reading, the gaps did show up very easily. More people making eBooks who have no clue how to do it. They also include covers that seem to have been scaled up in size, or are at least very pixelated looking, for the first two books in the series. This is the George Gently series by Alan Hunter. I've been watching the BBC TV series on PBS and finally decided to give the books a go.
|
#4 | |
Wizard
Posts: 2,306
Karma: 13057279
Join Date: Jul 2012
Device: Kobo Forma, Nook
|
Were these occurring towards the end of paragraphs only? Or were they just spread out willy-nilly with no rhyme or reason?
A few months back in my LaTeX research, I stumbled upon this odd question about the "minimum length of the last line of a paragraph": https://tex.stackexchange.com/questi...of-a-paragraph
There was also a similar idea for poetry for not wanting a single word on a line by itself, and trying to keep two or three words together if the line split:
Code:
<p>This is just a sample sentence that ends too soon.</p>
Certain languages/countries/style guides might have certain odd typographic rules as well. I stumbled upon this too:
Quote:
Could also just be some leftover typographic crud from InDesign/Quark output as well. The good ol' "we can just export this as EPUB because it has a 'Save As EPUB'!"
Does any metadata give any hints about the creation program? Anything you can deduce from the CSS/class names? We need some examples and DETAILS!
Last edited by Tex2002ans; 07-23-2014 at 07:37 PM.
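For anyone who wants to try the "keep the last few words together" idea from the post above, here is a minimal Python sketch. The function name `glue_last_words` and the default of two glued spaces are my own assumptions, and it only handles plain paragraph text (no inline tags inside the paragraph).

```python
NBSP = "\u00a0"  # the character that &nbsp; / &#160; stands for


def glue_last_words(text: str, n: int = 2) -> str:
    """Replace the last n regular spaces with non-breaking spaces.

    Hypothetical helper: after the call, the last n + 1 words of the
    paragraph can no longer be split across lines.
    """
    body = text.rstrip()
    tail = text[len(body):]  # preserve any trailing whitespace
    for _ in range(n):
        idx = body.rfind(" ")  # only regular spaces are candidates
        if idx == -1:
            break  # fewer than n spaces in the paragraph
        body = body[:idx] + NBSP + body[idx + 1:]
    return body + tail


if __name__ == "__main__":
    sample = "This is just a sample sentence that ends too soon."
    print(repr(glue_last_words(sample)))
```

Gluing only the tail keeps the rest of the paragraph free to wrap normally, which is the opposite of what the book in this thread does.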
|
#5 |
Resident Curmudgeon
|
The non-breaking spaces looked to be placed without any rhyme or reason.
Here are some samples. The bolding is mine.
Code:
<p id="rw-p_287847-00003">‘He went back to the fair after he’d been here,’ continued the superintendent. ‘He had tea with his wife in his caravan and did his stunt at 6.15. He was due to do it again at 6.45. I had men there at 6.35, but he’d disappeared. The last person to see him was the mechanic who looks after the machines.’</p>
<p id="rw-p_287847-00010">‘Then for heaven’s sake why didn’t you grab him?’ snapped Hansom.</p>
<p id="rw-p_287847-00012">Hansom snarled disgustedly. The superintendent brooded for a moment. ‘I don’t think there’s much doubt left that he’s our man,’ he said. ‘It looks as though we shan’t be needing you after all, Gently. I think we shall be able to pin something on young Huysmann and make it stick.’</p>
<p id="rw-p_287847-00021">‘You might print the door handle and the back of the chair that stands just inside,’ continued Gently, ‘and photograph the marks left on the carpet. Then again,’ he turned his thumb back with slow care, ‘you might wonder to yourself how the knife came to be in the chest in the hall. I can’t help you in the slightest. I’m still wondering myself …’</p>
Last edited by JSWolf; 07-23-2014 at 08:06 PM.
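A rough Python sketch of one way to clean up paragraphs like the samples above: swap every non-breaking space for a regular one. The name `normalize_nbsp` is hypothetical, and the sketch assumes the book has no non-breaking spaces worth keeping, so review the result rather than trusting it blindly.

```python
import re

# Matches the literal U+00A0 character plus its usual entity spellings:
# &nbsp;, &#160; and &#xA0; (any case, optional leading zeros).
NBSP_PATTERN = re.compile(r"\u00a0|&nbsp;|&#0*160;|&#[xX]0*[aA]0;")


def normalize_nbsp(xhtml: str) -> str:
    """Replace every non-breaking space with a regular space.

    Assumption: the book contains no intentional non-breaking spaces.
    """
    return NBSP_PATTERN.sub(" ", xhtml)


if __name__ == "__main__":
    line = "<p>He had\u00a0tea with&nbsp;his wife in&#160;his caravan.</p>"
    print(normalize_nbsp(line))
```

Run it per XHTML file inside the EPUB; the tags and `id` attributes pass through untouched because only the space characters and entities are matched.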
#6 |
frumious Bandersnatch
|
It looks like they appear at fairly constant distances. I'd say that's where line breaks occur in the printed version, and they were converted to &nbsp;. A reason to never buy something from that publisher.
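The "constant distances" guess can be checked with a few lines of Python. This is only a sketch: `nbsp_gaps` is a made-up helper, and it assumes you feed it the text of one paragraph with the tags already stripped.

```python
from html import unescape

NBSP = "\u00a0"


def nbsp_gaps(paragraph_text: str) -> list[int]:
    """Return the character distances between successive non-breaking spaces.

    Entities such as &nbsp; are decoded first, so distances are measured
    in displayed characters. Assumes markup has already been removed.
    """
    text = unescape(paragraph_text)
    positions = [i for i, ch in enumerate(text) if ch == NBSP]
    return [later - earlier for earlier, later in zip(positions, positions[1:])]


if __name__ == "__main__":
    print(nbsp_gaps("aaaa&nbsp;bbbb&nbsp;cccc&nbsp;d"))
```

If the gaps in a real paragraph cluster around one value (roughly the print line length), that would support the converted-line-break theory; widely scattered gaps would argue against it.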
|
#7 |
Wizard
Posts: 1,258
Karma: 3439432
Join Date: Feb 2008
Device: Amazon Kindle Paperwhite (300ppi), Samsung Galaxy Book 12
|
Probably the non-breaking spaces were inserted to influence how the lines break. I really wish there were a ``discretionary non-breaking space'' which would only keep the preceding and following words from hyphenating and encourage keeping them together (one can program that sort of thing in LaTeX, but it's not an option until some engineer at Adobe or Quark or Microsoft or Apple works it up elsewhere).
|
#8 |
Wizard
Posts: 3,413
Karma: 13369310
Join Date: May 2008
Location: Launceston, Tasmania
Device: Sony PRS T3, Kobo Glo, Kindle Touch, iPad, Samsung SB 2 tablet
|
This may be off topic, but for what it is worth I really don't like to see single en dashes or right quotation marks on a line by themselves; it just looks ugly to me. Most of the ebooks I do were written in the 19th century, and it was quite common to see something like:
'Lorem ipsum dolor sit - '
which might end up as
'Lorem ipsum dolor sit - '
or (even worse)
Lorem ipsum dolor sit - '
I tend to use
Lorem ipsum dolor sit –’
which at worst displays as
Lorem ipsum dolor sit –’
Any comments?
#9 |
Resident Curmudgeon
|
I would take
Code:
'Lorem ipsum dolor sit - '
Code:
'Lorem ipsum dolor sit—'
#10 | ||
Wizard
Posts: 2,306
Karma: 13057279
Join Date: Jul 2012
Device: Kobo Forma, Nook
|
Quote:
https://en.wikipedia.org/wiki/En_das...d_substitution
But if you still want to keep the Set Open style (as some languages require), the &nbsp; is probably the best way to handle it in EPUB, since the more complex spaces (hair space, thin space, etc.) are not supported very well on devices.
I don't have any Regex on hand to handle all the situations (since I Set Close all em/en dashes), but it shouldn't be too hard to come up with a bunch of Regex to substitute spacing around dashes to Set Open. I use this Regex to Set Close all Em Dashes:
Search: [ ]*—[ ]*
Replace: —
I am also not too sure if it would be valid to have &nbsp; only BEFORE the Set Open en dash, and not BEFORE+AFTER (this will make sure the device doesn't see both words combined into one very large word, and mangle the justification algorithm).
Same sort of thing happens in certain languages with spacing before/after other punctuation. I was converting a French Canadian book once, and the typographic rules made my head hurt, and created ugly code.
Here is a paragraph from the French Canadian book:
Quote:
Side Note: Alex, I sent you an email a few months back and never got a response. Did you change your email or something? Maybe it got lost in the spam folder; I did attach A TON of stuff to get your input/ideas on.
Last edited by Tex2002ans; 07-25-2014 at 01:17 AM.
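For those working outside Sigil, the Set Close search/replace above carries over to Python's `re` module more or less directly. The Set Open function is only a sketch of the "non-breaking space BEFORE, regular space AFTER" idea raised in the post; both function names are made up, and the Set Open version deliberately touches only en dashes that already have space on both sides, so number ranges like 1914–1918 are left alone.

```python
import re

NBSP = "\u00a0"


def set_close_em_dashes(text: str) -> str:
    """Same search/replace as in the post: strip spaces around em dashes."""
    return re.sub(r"[ ]*—[ ]*", "—", text)


def set_open_en_dashes(text: str) -> str:
    """Sketch: non-breaking space before a spaced en dash, regular space after.

    Only en dashes with whitespace on both sides are rewritten, so the
    dash stays attached to the preceding word while the line can still
    break after it.
    """
    return re.sub(r"(?<=\S)[ \u00a0]+–[ \u00a0]+(?=\S)", NBSP + "– ", text)


if __name__ == "__main__":
    print(set_close_em_dashes("wait — what"))
    print(repr(set_open_en_dashes("Bla – foo – bar")))
```

Keeping a regular space after the dash leaves the renderer a break opportunity, which should avoid the "one very large word" justification problem mentioned above.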
#11 | |
frumious Bandersnatch
|
In English books, I use " – " in the middle of a sentence and "—" (no spaces) at the beginning or end:
Bla – foo – bar, ‘Nevermore—’ quoth the raven.
An alternate style would be:
" – " -> "—"
"—" -> "——"
Quote:
For best effect, use &#8239; (narrow no-break space).
|
#12 |
Color me gone
Posts: 2,089
Karma: 1445295
Join Date: Apr 2008
Location: Central Oregon Coast
Device: PRS-300
|
This style is very common in English books printed about 100 years ago, but not common in American books. Makes a bit of a quandary when converting them.
|
#13 |
Resident Curmudgeon
|
I'm in the camp of no spaces around em dashes and no spaces around ellipses. It just looks off to have the spaces. Also, as for the em dash vs. the en dash, I don't care for the en dash and prefer to go with the em dash.
|
#14 | |
Bookmaker & Cat Slave
Posts: 11,503
Karma: 158448243
Join Date: Apr 2010
Location: Phoenix, AZ
Device: K2, iPad, KFire, PPW, Voyage, NookColor. 2 Droid, Oasis, Boox Note2
|
Quote:
Personally, I think the best bet is to join the punctuation (em dashes, en dashes and ellipses being the big offenders) to the prior word, and leave a space after. This gives you a "split the baby" approach. I find an em dash or an ellipsis starting a new line far odder than one ending a line.
FWIW, everyone has their own opinions on this topic.
Hitch
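The "join to the prior word, space after" approach could be sketched in Python like this. The helper name `attach_to_prior_word` is hypothetical, and the rule of only adding a space when a letter follows is my own assumption, made so closing quotes and number ranges are left as they are.

```python
import re


def attach_to_prior_word(text: str) -> str:
    """Glue dashes and ellipses to the preceding word, leave one space after.

    Step 1 removes any (regular or non-breaking) space before the mark.
    Step 2 puts exactly one regular space after it, but only when a
    letter follows, so "Nevermore—’" and "1914–1918" stay untouched.
    """
    text = re.sub(r"[ \u00a0]+([—–…])", r"\1", text)
    text = re.sub(r"([—–…])[ \u00a0]*(?=[^\W\d_])", r"\1 ", text)
    return text


if __name__ == "__main__":
    print(attach_to_prior_word("Bla – foo"))
    print(attach_to_prior_word("I’m still wondering myself …’"))
```

With the mark attached to the prior word and a regular space after it, a line can end with the dash or ellipsis but never start with one.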
|
#15 | |
Wizard
Posts: 3,413
Karma: 13369310
Join Date: May 2008
Location: Launceston, Tasmania
Device: Sony PRS T3, Kobo Glo, Kindle Touch, iPad, Samsung SB 2 tablet
|
Quote:
And no, I have no recollection of getting a long email from anyone relating to ebooks. I'm sure I would have at least acknowledged it even if I didn't deal with it in detail. Apologies anyway, and could you send it again? And let me know by private mail that it's been sent?
|