|  02-22-2013, 09:44 AM | #1 | 
| Junior Member  Posts: 6 Karma: 10 Join Date: Oct 2008 Device: none | 
				
				[old thread] non breaking spaces (&#160; and &#xA0;) automatically removed
			 
			
			Hello, I have a really big problem with non-breaking spaces that are written not as "&nbsp;" but as "&#160;" or "&#xA0;", which, as far as I understand it, is just the decimal or hexadecimal way of writing a non-breaking space. In the 0.5.x versions, Sigil automatically replaced these with "&nbsp;", which was OK, since it is the same thing. But in the 0.6 and now the 0.7 versions, Sigil just replaces them with "normal" spaces. When you use non-breaking spaces for your layout, this is a big problem, since in HTML more than one space in a row is treated as a single space. Is there a way to disable this behaviour? I have to work with EPUB files that I did not create myself and that contain these non-breaking spaces. I tried unchecking "Automatically clean and format html source" in the preferences, but that does not work. Sigil does the replacement when I open the EPUB, so there is no way of getting around it by replacing the spaces myself with a search/replace or something like that. Does anyone have an idea? This bug (or feature?) makes it impossible for me to use newer Sigil versions, and I have to use the old 0.5.3 version :-( Thanks and bye, Artoros | 
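The claim that the three spellings are the same character can be checked with a minimal Python sketch. It uses only the standard library's `html.unescape`; the list of spellings is just an illustration and has nothing to do with Sigil's own code.

```python
import html

# Three ways of writing a non-breaking space in (X)HTML source.
spellings = ["&nbsp;", "&#160;", "&#xA0;"]

# html.unescape resolves named, decimal and hexadecimal references.
decoded = [html.unescape(s) for s in spellings]

# All three decode to the single character U+00A0 (NO-BREAK SPACE).
all_same = all(ch == "\u00a0" for ch in decoded)
print(all_same)

# A regular space is a different character (U+0020), which is why
# swapping the references for plain spaces changes the layout.
nbsp_is_plain_space = "\u00a0" == " "
print(nbsp_is_plain_space)
```

So replacing "&#160;" with "&nbsp;" loses nothing, while replacing it with a plain space does.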
|   |   | 
|  02-22-2013, 11:44 AM | #2 | 
| Sigil developer            Posts: 1,274 Karma: 1101600 Join Date: Jan 2011 Location: UK Device: Kindle PW, K4 NT, K3, Kobo Touch | 
			
			I can see the issue. &#160; is converted to a space instead of being left as &#160; or converted to &nbsp;. This is primarily an issue with nbsp, since it's special in that Book View always changes the actual nbsp character to a space, so we have code that converts the nbsp character to the &nbsp; entity. I need to check whether we need to do the same for &#160; or handle it in a different way.
		 | 
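The kind of conversion described above might look roughly like the sketch below. This is not Sigil's actual code (Sigil is written in C++); the function name `nbsp_to_entity` and the regular expression are assumptions made for illustration. The idea is to rewrite every form of the non-breaking space as the named entity before the text reaches a widget that would flatten the raw character to a plain space.

```python
import re

# Matches the literal U+00A0 character and its decimal/hexadecimal
# references, e.g. &#160;, &#0160;, &#xA0;, &#xa0;, &#X00A0;
NBSP_PATTERN = re.compile(r"\u00a0|&#0*160;|&#[xX]0*[aA]0;")


def nbsp_to_entity(source: str) -> str:
    """Rewrite every form of non-breaking space as the &nbsp; entity."""
    return NBSP_PATTERN.sub("&nbsp;", source)


sample = "a&#160;b&#xA0;c\u00a0d e"
print(nbsp_to_entity(sample))
```

Regular spaces are left alone; only the non-breaking forms are normalised.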
|   |   | 
|  02-27-2013, 04:46 AM | #3 | 
| Guru            Posts: 691 Karma: 3026110 Join Date: Dec 2008 Location: Lancashire, U.K. Device: BeBook 1, BeBook Pure, Kobo Glo, (and HD),Energy Sistem EReader Pro + | 
			
			On a related note, I am getting errors with non-breaking spaces when validating books. The error reported is:
			Code: entity 'nbsp' not found
			When I examine the HTML, the error points to various entries such as:
			Code: ‘But . . . I don’t—’
			which looks like valid HTML to me. While I can get rid of the "errors" by cleaning the file, I can't see why it needs cleaning. BobC Last edited by BobC; 02-27-2013 at 02:24 PM. Reason: Clean up error message | 
|   |   | 
|  02-27-2013, 02:57 PM | #4 | 
| Sigil developer            Posts: 1,274 Karma: 1101600 Join Date: Jan 2011 Location: UK Device: Kindle PW, K4 NT, K3, Kobo Touch | 
			
			This usually indicates a problem in your header's document type. The type is defined for something that doesn't know how to deal with an &nbsp; entity. Just compare the header from the uncleaned and the cleaned versions to see the difference.
		 | 
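To illustrate why the document type matters, here is a small Python sketch. It uses the standard library's `xml.etree.ElementTree` rather than the validator used in the thread, so the exact error message will differ. The point is only that a named entity such as `&nbsp;` must be declared by a DTD, while a numeric reference needs no declaration. The helper name `parses` and the sample fragments are made up for the example.

```python
import xml.etree.ElementTree as ET


def parses(fragment: str) -> bool:
    """Return True if a plain XML parser (no DTD loaded) accepts the fragment."""
    try:
        ET.fromstring(fragment)
        return True
    except ET.ParseError:
        return False


# Named entity: only valid if the document type declares it.
named = "<p>But&nbsp;.&nbsp;.&nbsp;.</p>"
# Numeric character reference: always valid in XML.
numeric = "<p>But&#160;.&#160;.&#160;.</p>"

print(parses(named))
print(parses(numeric))
```

With no DTD declaring `nbsp`, the first fragment is rejected and the second is accepted.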
|   |   | 
|  08-28-2014, 06:58 PM | #5 | 
| Connoisseur            Posts: 68 Karma: 526028 Join Date: Nov 2009 Location: New York, NY Device: iphone | 
			
			Is there something that I can "switch off" to prevent my nonbreaking spaces from being turned into regular old spaces?
		 | 
|   |   | 
|  08-28-2014, 09:59 PM | #6 | |
| Grand Sorcerer            Posts: 28,882 Karma: 207000000 Join Date: Jan 2010 Device: Nexus 7, Kindle Fire HD | Quote: 
 The character will always be converted to an entity. It can't survive the Qt Widget being used (it will be changed to a normal space), so Sigil converts it to an entity first (when opening the epub) so that it can still fulfill its non-breaking purpose. The entity will usually stay an entity as long as you forego any editing whatsoever in Book View. Look at Book View ... Edit in Code View. And in the Clean Source preferences, make sure "Pretty Print" is selected instead of HTML Tidy. If you can build Sigil from source, there have been a few new patches accepted, one of which provides a "Preserve Entities" feature that will help make sure non-breaking space entities don't get zapped when editing in Book View. Last edited by DiapDealer; 08-28-2014 at 10:01 PM. | |
|   |   | 
|  08-29-2014, 06:15 AM | #7 | 
| mostly an observer            Posts: 1,519 Karma: 996810 Join Date: Dec 2012 Device: Kindle | 
			
			IMHO you'd be better off shortening your ellipses to ... as at least some of the Big Five publishers are doing in their digital editions, while retaining the traditional mode in print.  I know how irritating it is when somebody answers a question by replying to a different question, but I went through this issue years ago when the Kindle was first introduced (the ellipses breaking at the end of a line), until I decided that on the digital "page" the space looked rather silly. | 
|   |   | 
|  08-29-2014, 05:09 PM | #8 | |
| Ex-Helpdesk Junkie            Posts: 19,421 Karma: 85400180 Join Date: Nov 2012 Location: The Beaten Path, USA, Roundworld, This Side of Infinity Device: Kindle Touch fw5.3.7 (Wifi only) | Quote: 
 | |
|   |   | 
|  08-29-2014, 09:52 PM | #9 | |
| null operator (he/him)            Posts: 22,010 Karma: 30277294 Join Date: Mar 2012 Location: Sydney Australia Device: none | Quote: 
 blah blah blah . . .! More blah blah On paper I prefer single curly quotes for dialogue, on digital I prefer double curly quotes. Maybe ebooks could be user configurable  BR | |
|   |   | 
|  08-29-2014, 09:56 PM | #10 | |
| Well trained by Cats            Posts: 31,250 Karma: 61360164 Join Date: Aug 2009 Location: The Central Coast of California Device: Kobo Libra2,Kobo Aura2v1, K4NT(Fixed: New Bat.), Galaxy Tab A | Quote: 
 | |
|   |   | 
|  08-30-2014, 12:02 AM | #11 | ||
| Wizard            Posts: 2,306 Karma: 13057279 Join Date: Jul 2012 Device: Kobo Forma, Nook | Quote: 
 Quote: 
 Some fonts also have oddities with the ellipsis character, in which the "dots" don't match your typical period, or do not have similar kerning to the default period, making it look quite odd when they are near each other. I had a large post typed up covering my personal annoyances with ellipses in EPUBs, but then scrapped it. Here are some more resources on the topic: https://english.stackexchange.com/qu...s-for-ellipses http://www.thebookdesigner.com/2013/...dobe-indesign/ https://tex.stackexchange.com/questi...xetex-document Different Style Guides and different languages also have different rules. Ultimately, ellipses are a huge pain in the bottom, and these "Smarten Punctuation" algorithms completely mangle them. Last edited by Tex2002ans; 08-30-2014 at 12:21 AM. | ||
|   |   | 
|  08-30-2014, 10:56 AM | #12 | 
| mostly an observer            Posts: 1,519 Karma: 996810 Join Date: Dec 2012 Device: Kindle | 
			
			Actually, I think in standard book-making, all ellipses are three characters. When a fourth is added, then it is a full stop (period) and not technically part of the ellipsis. (It could be a question mark, exclamation mark, or even a comma or semi-colon or colon instead.) I was assuming that three or even four dots without a space between would be regarded as a single word by most or all e-book platforms. Am I wrong about that? The only case I can think of where a four-dot ellipsis would change under my e-book formula is where the omission comes at the beginning of the following sentence. In a printed book, I would go dot/space/dot/space/dot/space/dot/space, but in an e-book I would go dot/space/dot/dot/dot/space. Last edited by Notjohn; 08-30-2014 at 10:58 AM. Reason: oops! omitted a dot! | 
|   |   | 
|  08-30-2014, 11:32 AM | #13 | 
| frumious Bandersnatch            Posts: 7,570 Karma: 20150435 Join Date: Jan 2008 Location: Spaniard in Sweden Device: Cybook Orizon, Kobo Aura | 
			
			Probably. I've seen too many linebreaks before/after question/quote marks, with or without a hyphen, to keep any faith I initially had in the linebreaking algorithms of ebook readers.
		 | 
|   |   | 
|  08-31-2014, 07:35 PM | #14 | |||||
| Wizard            Posts: 2,306 Karma: 13057279 Join Date: Jul 2012 Device: Kobo Forma, Nook | Quote: 
 Also, the typographer of an older book may have had older Style Guides that completely disagreed with the rules now given by the modern versions. You also have to keep in mind that other languages/countries may have their own typography and Style Guides. I don't have any samples of these older books on hand, although I do remember it in a handful of Archive.org scans I have digitized. Side Note: This reminds me of this fascinating article, "Why two spaces after a period isn’t wrong (or, the lies typographers tell about history)". The author goes through even these older Style Guides themselves (like new/older versions of the Chicago Manual of Style) and demolishes this "double-spacing" myth! I assume something similar could be said about ellipses: http://www.heracliteanriver.com/?p=324 Similarly, asterisks were used along the same lines in the middle of paragraphs: Quote: 
 
 In many cases, it is hard to tell exactly what the typographer was thinking, so it is very hard to "reverse" or "modernize" the decision. Is this ACTUALLY missing text, or was it a pause, was it a punctuation mark in the original text that is being quoted, was it just a design decision, ...? Quote: 
 If you wanted to stick with the ellipsis character, you would have to insert a non-breaking space to connect the ellipsis to the period before/after in order for them to stick together. So, "ellipsis + nbsp + period" and "period + nbsp + ellipsis" should work... although again having a space there isn't necessarily the proper way according to certain Style Guides. Quote: 
 As I said, a huge pain in the butt!  Quote: 
 To fix this, you would probably have to insert something like a zero-width space, although this would create HIDEOUSLY ugly code...... and devices probably do not have very good support for that, so zero-width spaces will show up as "missing character" boxes or question marks. Bleh! Also, I just thought of another thing devices might break on: SEARCH. It is much easier to search for three or four periods in a row than it is to search for text with an ellipsis character. Last edited by Tex2002ans; 08-31-2014 at 07:55 PM. | |||||
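On the search point, a reader device won't do this for you, but when cleaning up a book's source yourself, a single regular expression can catch the common ellipsis spellings at once. The Python sketch below is only an illustration; the pattern and the sample strings are assumptions, not taken from any tool mentioned in the thread.

```python
import re

# One pattern for the common spellings of an ellipsis:
#   - the single character U+2026
#   - three or four periods, optionally separated by a regular
#     space or a non-breaking space (U+00A0)
ELLIPSIS = re.compile(r"\u2026|\.(?:[ \u00a0]?\.){2,3}")

samples = [
    "Wait... what?",                    # three plain periods
    "Wait\u2026 what?",                 # ellipsis character
    "Wait\u00a0.\u00a0.\u00a0. what?",  # periods joined by nbsp
    "The end. . . .",                   # four spaced periods
    "No ellipsis here.",                # a lone full stop
]

found = [bool(ELLIPSIS.search(s)) for s in samples]
print(found)
```

Only the last sample, with its single full stop, is left unmatched.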
|   |   | 
|  09-01-2014, 01:20 AM | #15 | 
| null operator (he/him)            Posts: 22,010 Karma: 30277294 Join Date: Mar 2012 Location: Sydney Australia Device: none | 
			
			@Tex2002ans - the author of the Heraclitean River blog and I are like-minded in that he and I would prefer 1.5 spaces between sentences rather than one or two. I just feel more comfortable if the space between sentences exceeds the space between words. I edit to two spaces using a regular space & a non-breaking space. If I wanted 1.5 spaces, what would you suggest I use? My 'target font' is Times Roman 12 point, if that has anything to do with the price of fish. Thanks. BR Last edited by BetterRed; 09-01-2014 at 01:44 AM. | 
|   |   | 