02-22-2013, 09:44 AM | #1 |
Junior Member
Posts: 6
Karma: 10
Join Date: Oct 2008
Device: none
|
[old thread] non breaking spaces (&#160; and &#xA0;) automatically removed
Hello,
I have a really big problem with non-breaking spaces that are written not as "&nbsp;" but as "&#160;" or "&#xA0;", which, as far as I understand it, is just the decimal or hexadecimal way of writing a non-breaking space. In the 0.5.x versions Sigil automatically replaced these with "&nbsp;", which was OK, since it is the same thing. But in the 0.6 and now the 0.7 versions, Sigil just replaces them with "normal" spaces. When you use non-breaking spaces for your layout, this is a big problem, since in HTML more than one space is treated as a single space.

Is there a way to disable this behaviour? I have to work with EPUBs that were not created by me and that contain these non-breaking spaces. I tried unchecking the "Automatically clean and format html source" checkboxes in the preferences, but that does not work. Sigil does the replacement when I open the EPUB, so there is no way of getting around it by replacing the spaces myself with a search/replace or something like that.

Does anyone have an idea? This bug (or feature?) makes it impossible for me to use newer Sigil versions, and I have to use the old 0.5.3 version :-(
Thanks and bye
Artoros
|
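The claim that the named, decimal, and hexadecimal forms are the same thing can be checked quickly. This is a minimal sketch using Python's standard `html` module, not anything Sigil itself does; the variable names are only for illustration.

```python
import html

# Three spellings of the same character, U+00A0 NO-BREAK SPACE.
spellings = ["&nbsp;", "&#160;", "&#xA0;"]
decoded = [html.unescape(s) for s in spellings]

for spelling, char in zip(spellings, decoded):
    print(f"{spelling!r:12} -> U+{ord(char):04X}")

# All three decode to the identical single character.
assert all(char == "\u00a0" for char in decoded)
```

So replacing one spelling with another is harmless, while replacing any of them with a regular space (U+0020) changes the text.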
02-22-2013, 11:44 AM | #2 |
Sigil developer
Posts: 1,274
Karma: 1101600
Join Date: Jan 2011
Location: UK
Device: Kindle PW, K4 NT, K3, Kobo Touch
|
I can see the issue. &#160; is converted to a space instead of being left as &#160; or converted to &nbsp;. This is primarily an issue with nbsp since it's special in that Book View always changes the actual nbsp character to a space - so we have code that converts the nbsp character to the entity. Need to check if we need to do the same for &#160; or handle in a different way.
|
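One way to picture the "do the same for &#160;" option is a pre-pass that rewrites the numeric references to the named entity before the text reaches the editor. The sketch below is only an illustration of that idea, assuming a regex over the raw source is acceptable; `normalize_nbsp_refs` is a hypothetical helper, not Sigil code.

```python
import re

# Matches the decimal (&#160;) and hexadecimal (&#xA0;) references for U+00A0,
# allowing leading zeros and either letter case in the hex form.
NBSP_NUMERIC_REF = re.compile(r"&#(?:0*160|[xX]0*[aA]0);")


def normalize_nbsp_refs(source: str) -> str:
    """Rewrite numeric no-break-space references as the named entity."""
    return NBSP_NUMERIC_REF.sub("&nbsp;", source)


sample = "a&#160;b&#xA0;c&#xa0;d &#1600;"
print(normalize_nbsp_refs(sample))
```

The last reference in the sample, `&#1600;`, is a different character and is left untouched because the pattern requires the semicolon right after `160`.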
02-27-2013, 04:46 AM | #3 |
Guru
Posts: 691
Karma: 3026110
Join Date: Dec 2008
Location: Lancashire, U.K.
Device: BeBook 1, BeBook Pure, Kobo Glo, (and HD),Energy Sistem EReader Pro +
|
On a related note, I am getting errors with non-breaking spaces when validating books. The error reported is
Code:
entity 'nbsp' not found
When I examine the HTML, the error points to various entries such as:
Code:
‘But . . . I don’t—’
Which looks like valid HTML to me. While I can get rid of the "errors" by cleaning the file, I can't see why it needs cleaning.
BobC
Last edited by BobC; 02-27-2013 at 02:24 PM. Reason: Clean up error message
|
02-27-2013, 02:57 PM | #4 |
Sigil developer
Posts: 1,274
Karma: 1101600
Join Date: Jan 2011
Location: UK
Device: Kindle PW, K4 NT, K3, Kobo Touch
|
This usually indicates a problem in your header's document type. The type is defined for something that doesn't know how to deal with an &nbsp; entity. Just compare the headers of the uncleaned and the cleaned versions to see the difference.
|
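The same class of error can be reproduced outside any EPUB tool. The sketch below uses Python's standard `xml.etree.ElementTree`, which parses plain XML and has no XHTML entity definitions loaded, so it stands in for "something that doesn't know how to deal with &nbsp;". It is an illustration of the cause, not the validator used in the post; `parses` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET


def parses(fragment: str) -> bool:
    """Return True if the fragment is accepted by a plain XML parser."""
    try:
        ET.fromstring(fragment)
        return True
    except ET.ParseError:
        return False


# Named entity: unknown to XML unless a DTD declares it.
print(parses("<p>But&nbsp;.&nbsp;.&nbsp;.</p>"))    # False
# Numeric reference: always valid XML, no declaration needed.
print(parses("<p>But&#160;.&#160;.&#160;.</p>"))    # True
```

This is why the named form depends on the document type, while the numeric form works regardless of the header.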
08-28-2014, 06:58 PM | #5 |
Connoisseur
Posts: 68
Karma: 526028
Join Date: Nov 2009
Location: New York, NY
Device: iphone
|
Is there something that I can "switch off" to prevent my nonbreaking spaces from being turned into regular old spaces?
|
08-28-2014, 09:59 PM | #6 | |
Grand Sorcerer
Posts: 27,595
Karma: 193191846
Join Date: Jan 2010
Device: Nexus 7, Kindle Fire HD
|
Quote:
The character will always be converted to an entity. It can't survive the Qt Widget being used (it will be changed to a normal space), so Sigil converts it to an entity first (when opening the epub) so that it can still fulfill its non-breaking purpose.

The entity will usually stay an entity as long as you forgo any editing whatsoever in Book View. Look at Book View ... Edit in Code View. And in the Clean Source preferences, make sure "Pretty Print" is selected instead of HTML Tidy.

If you can build Sigil from source, there have been a few new patches accepted -- one of which provides a "Preserve Entities" feature that will help make sure non-breaking space entities don't get zapped when editing in Book View.

Last edited by DiapDealer; 08-28-2014 at 10:01 PM.
|
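The reasoning in the post above (convert the character to an entity before a lossy step can flatten it) can be sketched in a few lines. Everything here is a toy model under stated assumptions: `lossy_widget` merely imitates a component that turns U+00A0 into a regular space, and `protect_nbsp` is a hypothetical helper, not Sigil's actual code.

```python
NBSP = "\u00a0"


def protect_nbsp(source: str, entity: str = "&nbsp;") -> str:
    """Replace literal no-break-space characters with an entity."""
    return source.replace(NBSP, entity)


def lossy_widget(source: str) -> str:
    """Toy stand-in for a component that flattens U+00A0 to a normal space."""
    return source.replace(NBSP, " ")


raw = f"But{NBSP}.{NBSP}.{NBSP}. I don't"

# Without protection the non-breaking property is lost.
print(lossy_widget(raw))
# With protection the entity text passes through unchanged.
print(lossy_widget(protect_nbsp(raw)))
```

The entity survives only because it is ordinary ASCII text at that point; anything that later decodes it back to the character reintroduces the problem.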
08-29-2014, 06:15 AM | #7 |
mostly an observer
Posts: 1,515
Karma: 987654
Join Date: Dec 2012
Device: Kindle
|
IMHO you'd be better off shortening your ellipses to ... as at least some of the Big Five publishers are doing in their digital editions, while retaining the traditional mode in print.
I know how irritating it is when somebody answers a question by replying to a different question, but I went through this issue years ago when the Kindle was first introduced (the ellipses breaking at the end of a line), until I decided that on the digital "page" the space looked rather silly. |
08-29-2014, 05:09 PM | #8 | |
Ex-Helpdesk Junkie
Posts: 19,421
Karma: 85397180
Join Date: Nov 2012
Location: The Beaten Path, USA, Roundworld, This Side of Infinity
Device: Kindle Touch fw5.3.7 (Wifi only)
|
Quote:
|
|
08-29-2014, 09:52 PM | #9 | |
null operator (he/him)
Posts: 20,653
Karma: 26966376
Join Date: Mar 2012
Location: Sydney Australia
Device: none
|
Quote:
blah blah blah . . .! More blah blah
On paper I prefer single curly quotes for dialogue; on digital I prefer double curly quotes. Maybe ebooks could be user configurable.
BR
|
|
08-29-2014, 09:56 PM | #10 | |
Well trained by Cats
Posts: 29,944
Karma: 55705602
Join Date: Aug 2009
Location: The Central Coast of California
Device: Kobo Libra2,Kobo Aura2v1, K4NT(Fixed: New Bat.), Galaxy Tab A
|
Quote:
|
|
08-30-2014, 12:02 AM | #11 | ||
Wizard
Posts: 2,297
Karma: 12126329
Join Date: Jul 2012
Device: Kobo Forma, Nook
|
Quote:
Quote:
Some fonts also have oddities with the ellipsis character, in which the "dots" don't match your typical period, or do not have similar kerning to the default period, making it look quite odd when they are near each other.

I had a large post typed up covering my personal annoyances with ellipses in EPUBs, but then scrapped it. Here are some more resources on the topic:
https://english.stackexchange.com/qu...s-for-ellipses
http://www.thebookdesigner.com/2013/...dobe-indesign/
https://tex.stackexchange.com/questi...xetex-document

Different Style Guides and different languages also have different rules. Ultimately, ellipses are a huge pain in the bottom, and these "Smarten Punctuation" algorithms completely mangle them.

Last edited by Tex2002ans; 08-30-2014 at 12:21 AM.
||
08-30-2014, 10:56 AM | #12 |
mostly an observer
Posts: 1,515
Karma: 987654
Join Date: Dec 2012
Device: Kindle
|
Actually, I think in standard book-making, all ellipses are three characters. When a fourth is added, then it is a full stop (period) and not technically part of the ellipsis. (It could be a question mark, exclamation mark, or even a comma or semi-colon or colon instead.)
I was assuming that three or even four dots without a space between would be regarded as a single word by most or all e-book platforms. Am I wrong about that? The only case I can think of where a four-dot ellipsis would change under my e-book formula is where the omission comes at the beginning of the following sentence. In a printed book, I would go dot/space/dot/space/dot/space/dot/space, but in an e-book I would go dot/space/dot/dot/dot/space. Last edited by Notjohn; 08-30-2014 at 10:58 AM. Reason: oops! omitted a dot! |
08-30-2014, 11:32 AM | #13 |
frumious Bandersnatch
Posts: 7,516
Karma: 19000001
Join Date: Jan 2008
Location: Spaniard in Sweden
Device: Cybook Orizon, Kobo Aura
|
Probably. I've seen too many linebreaks before/after question/quote marks, with or without a hyphen, to keep any faith I initially had in the linebreaking algorithms of ebook readers.
|
08-31-2014, 07:35 PM | #14 | |||||
Wizard
Posts: 2,297
Karma: 12126329
Join Date: Jul 2012
Device: Kobo Forma, Nook
|
Quote:
Also, the typographer of an older book may have had older Style Guides that completely disagreed with the rules now given by the modern versions. You also have to keep in mind that other languages/countries may have their own typography and Style Guides. I don't have any samples of these older books on hand, although I do remember it in a handful of Archive.org scans I have digitized. Side Note: This reminds me of this fascinating article, "Why two spaces after a period isn’t wrong (or, the lies typographers tell about history)". The author goes through even these older Style Guides themselves (like new/older versions of the Chicago Manual of Style) and demolishes this "double-spacing" myth! I assume something similar could be said about ellipses: http://www.heracliteanriver.com/?p=324 Similarly, asterisks were used along the same lines in the middle of paragraphs: Quote:
In many cases, it is hard to tell exactly what the typographer was thinking, so it is very hard to "reverse" or "modernize" the decision. Is this ACTUALLY missing text, or was it a pause, was it a punctuation mark in the original text that is being quoted, was it just a design decision, ...? Quote:
If you wanted to stick with the ellipsis character, you would have to insert a non-breaking space to connect the ellipsis to the period before/after in order for them to stick together. So, "ellipsis + nbsp + period" and "period + nbsp + ellipsis" should work... although again having a space there isn't necessarily the proper way according to certain Style Guides. Quote:
As I said, a huge pain in the butt! Quote:
To fix this, you would probably have to insert something like a zero-width space, although this would create HIDEOUSLY ugly code...... and devices probably do not have very good support for that, so zero-width spaces will show up as "missing character" boxes or quotation marks. Bleh!

Also, I just thought of another thing devices might break on: SEARCH. It is much easier to search for three or four periods in a row than it is to search for text with an ellipsis character.

Last edited by Tex2002ans; 08-31-2014 at 07:55 PM.
|||||
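For the spaced style of ellipsis discussed in this thread, the "stick together" idea amounts to replacing the ordinary spaces between the dots with non-breaking ones. The following is a minimal sketch of that substitution, assuming the dots are separated by single regular spaces; `glue_spaced_ellipsis` is a hypothetical helper, and it deliberately leaves the spaces before and after the ellipsis alone, since style guides differ on those.

```python
import re

NBSP = "\u00a0"

# A period followed by one or more " ." groups, e.g. ". . ." or ". . . ."
SPACED_DOTS = re.compile(r"\.(?: \.)+")


def glue_spaced_ellipsis(text: str) -> str:
    """Join the dots of a spaced ellipsis with no-break spaces."""
    return SPACED_DOTS.sub(lambda m: m.group(0).replace(" ", NBSP), text)


glued = glue_spaced_ellipsis("But . . . I don't")
print(glued.replace(NBSP, "&nbsp;"))  # show the result with visible entities
```

Unspaced runs such as `...` are not matched, so text that already uses the three-periods form is left as it is.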
09-01-2014, 01:20 AM | #15 |
null operator (he/him)
Posts: 20,653
Karma: 26966376
Join Date: Mar 2012
Location: Sydney Australia
Device: none
|
@Tex2002ans - the author of the Heraclitean River blog and I are like-minded in that he and I would prefer 1.5 spaces between sentences rather than one or two.
I just feel more comfortable if the space between sentences exceeds the space between words. I edit to two spaces using a regular space and a non-breaking space. If I wanted 1.5 spaces, what would you suggest I use? My 'target font' is Times Roman 12 point, if that has anything to do with the price of fish.
Thanks.
BR
Last edited by BetterRed; 09-01-2014 at 01:44 AM.
|