Old 02-22-2013, 09:44 AM   #1
artoros
Junior Member
Posts: 6
Karma: 10
Join Date: Oct 2008
Device: none
[old thread] non breaking spaces (&#160; and &#xA0;) automatically removed

Hello,
I have a really big problem with non-breaking spaces that are not written as "&nbsp;" but as "&#160;" or "&#xA0;", which, as far as I understand it, are just the decimal and hexadecimal ways of writing a non-breaking space.

In the 0.5.x versions Sigil automatically replaced these with "&nbsp;", which was OK, since it is the same thing.

But in the 0.6 and now the 0.7 versions Sigil just replaces them with "normal" spaces. When you use non-breaking spaces for your layout, this is a big problem, since in HTML more than one space in a row is treated as a single space.

Is there a way to disable this behaviour? I have to work with EPUBs that were not created by me and that contain these non-breaking spaces.

I tried to uncheck the checkboxes in the preferences "Automatically clean and format html source", but that does not work.

Sigil does that replacement when I open the EPUB, so there is no way of getting around it by replacing the spaces myself with a search/replace or something like that.

Does anyone have an idea? That bug (or feature?) makes it impossible for me to use newer Sigil versions, so I have to stay on the old 0.5.3 version :-(

Thanks and bye
Artoros
Old 02-22-2013, 11:44 AM   #2
meme
Sigil developer
Posts: 1,274
Karma: 1101600
Join Date: Jan 2011
Location: UK
Device: Kindle PW, K4 NT, K3, Kobo Touch
I can see the issue. &#160; is converted to a space instead of being left as &#160; or converted to &nbsp;. This is primarily an issue with nbsp since it's special in that Book View always changes the actual nbsp character to a space, so we have code that converts the nbsp character to the &nbsp; entity. Need to check if we need to do the same for &#160; or handle it in a different way.
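
A minimal sketch of the kind of normalization described here, not Sigil's actual code: it rewrites the decimal and hexadecimal numeric references, plus the literal U+00A0 character, to the named entity before the text reaches a widget that would flatten them to ordinary spaces. The function name `preserve_nbsp` is hypothetical.

```python
import re

# Matches &#160; (decimal, leading zeros allowed) and &#xA0; (hex, any case).
NBSP_REFERENCE = re.compile(r"&#(?:0*160|[xX]0*[aA]0);")


def preserve_nbsp(source: str) -> str:
    """Rewrite every spelling of a non-breaking space as the &nbsp; entity."""
    source = NBSP_REFERENCE.sub("&nbsp;", source)
    # The literal character would otherwise be turned into a normal space.
    return source.replace("\u00a0", "&nbsp;")


print(preserve_nbsp("a&#160;b&#xA0;c\u00a0d"))  # a&nbsp;b&nbsp;c&nbsp;d
```

Whether the named entity is the right target depends on the document type of the file, since the named form is only valid where it is declared.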
Old 02-27-2013, 04:46 AM   #3
BobC
Guru
Posts: 691
Karma: 3026110
Join Date: Dec 2008
Location: Lancashire, U.K.
Device: BeBook 1, BeBook Pure, Kobo Glo, (and HD),Energy Sistem EReader Pro +
On a related note I am getting errors with non-breaking spaces when validating books - the error reported is
Code:
entity 'nbsp' not found
.

When I examine the HTML the error points to various entries such as:
Code:
‘But . . . I don’t—’
.

Which looks valid HTML to me.

While I can get rid of the "errors" by cleaning the file I can't see why it needs cleaning.

BobC

Last edited by BobC; 02-27-2013 at 02:24 PM. Reason: Clean up error message
Old 02-27-2013, 02:57 PM   #4
meme
Sigil developer
Posts: 1,274
Karma: 1101600
Join Date: Jan 2011
Location: UK
Device: Kindle PW, K4 NT, K3, Kobo Touch
This usually indicates a problem in your header's document type. The type is defined for something that doesn't know how to deal with an &nbsp; entity. Just compare the header from the uncleaned and the cleaned versions to see the difference.
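
To illustrate why a validator can reject the named entity: a plain XML parser only knows the five predefined entities, so &nbsp; is undefined unless the document's DTD declares it, whereas the numeric reference always parses. A small sketch using Python's standard-library parser, which is only a stand-in for whatever validator reported the error:

```python
import xml.etree.ElementTree as ET


def parses(fragment: str) -> bool:
    """Return True if a DTD-less XML parser accepts the fragment."""
    try:
        ET.fromstring(fragment)
        return True
    except ET.ParseError:
        return False


print(parses("<p>But&#160;.&#160;.&#160;.</p>"))  # True: numeric reference
print(parses("<p>But&nbsp;.&nbsp;.&nbsp;.</p>"))  # False: undefined entity
```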
Old 08-28-2014, 06:58 PM   #5
sjkramer
Connoisseur
Posts: 68
Karma: 526028
Join Date: Nov 2009
Location: New York, NY
Device: iphone
Is there something that I can "switch off" to prevent my nonbreaking spaces from being turned into regular old spaces?
Old 08-28-2014, 09:59 PM   #6
DiapDealer
Grand Sorcerer
Posts: 28,854
Karma: 207000000
Join Date: Jan 2010
Device: Nexus 7, Kindle Fire HD
Quote:
Originally Posted by sjkramer View Post
Is there something that I can "switch off" to prevent my nonbreaking spaces from being turned into regular old spaces?
The character or the entity?

The character will always be converted to an entity. It can't survive the Qt Widget being used (it will be changed to a normal space), so Sigil converts it to an entity first (when opening the epub) so that it can still fulfill its non-breaking purpose. The entity will usually stay an entity as long as you forgo any editing whatsoever in Book View. Look at Book View ... Edit in Code View. And in the Clean Source preferences, make sure "Pretty Print" is selected instead of HTML Tidy.

If you can build Sigil from source, there have been a few new patches accepted -- one of which provides a "Preserve Entities" feature that will help make sure non-breaking space entities don't get zapped when editing in Book View.
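
To check what actually happens to the spaces, one option is to count each spelling of the non-breaking space in the EPUB before and after opening it in Sigil. A rough sketch with hypothetical function names, assuming the content documents are UTF-8 files ending in .xhtml, .html or .htm:

```python
import re
import zipfile
from collections import Counter

# Each way a non-breaking space can be spelled in the markup.
FORMS = {
    "named entity": re.compile(r"&nbsp;"),
    "decimal reference": re.compile(r"&#0*160;"),
    "hex reference": re.compile(r"&#[xX]0*[aA]0;"),
    "literal character": re.compile("\u00a0"),
}


def count_nbsp_forms(markup: str) -> Counter:
    """Count how often each spelling occurs in one document."""
    return Counter({name: len(rx.findall(markup)) for name, rx in FORMS.items()})


def count_in_epub(path: str) -> Counter:
    """Sum the counts over every (X)HTML file inside the EPUB archive."""
    total = Counter()
    with zipfile.ZipFile(path) as book:
        for name in book.namelist():
            if name.lower().endswith((".xhtml", ".html", ".htm")):
                text = book.read(name).decode("utf-8", errors="replace")
                total.update(count_nbsp_forms(text))
    return total
```

Running `count_in_epub` on the original file and on the copy saved by Sigil shows which spellings were converted and whether any were lost.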

Last edited by DiapDealer; 08-28-2014 at 10:01 PM.
Old 08-29-2014, 06:15 AM   #7
Notjohn
mostly an observer
Posts: 1,519
Karma: 996810
Join Date: Dec 2012
Device: Kindle
Quote:
Originally Posted by BobC View Post
‘But . . . I don’t—’
IMHO you'd be better off shortening your ellipses to ... as at least some of the Big Five publishers are doing in their digital editions, while retaining the traditional mode in print.

I know how irritating it is when somebody answers a question by replying to a different question, but I went through this issue years ago when the Kindle was first introduced (the ellipses breaking at the end of a line), until I decided that on the digital "page" the space looked rather silly.
Old 08-29-2014, 05:09 PM   #8
eschwartz
Ex-Helpdesk Junkie
Posts: 19,421
Karma: 85400180
Join Date: Nov 2012
Location: The Beaten Path, USA, Roundworld, This Side of Infinity
Device: Kindle Touch fw5.3.7 (Wifi only)
Quote:
Originally Posted by Notjohn View Post
IMHO you'd be better off shortening your ellipses to ... as at least some of the Big Five publishers are doing in their digital editions, while retaining the traditional mode in print.

I know how irritating it is when somebody answers a question by replying to a different question, but I went through this issue years ago when the Kindle was first introduced (the ellipses breaking at the end of a line), until I decided that on the digital "page" the space looked rather silly.
…Or use an actual ellipsis… (Like I just did.)
Old 08-29-2014, 09:52 PM   #9
BetterRed
null operator (he/him)
Posts: 22,005
Karma: 30277294
Join Date: Mar 2012
Location: Sydney Australia
Device: none
Quote:
Originally Posted by Notjohn View Post
IMHO you'd be better off shortening your ellipses to ... as at least some of the Big Five publishers are doing in their digital editions, while retaining the traditional mode in print.
Ay, there's the rub, IMO what works well on paper - "blah blah blah . . .! More blah blah." - doesn't always work so well on digital media, especially if it spans two lines as in:

blah blah blah . .
.! More blah blah

On paper I prefer single curly quotes for dialogue, on digital I prefer double curly quotes.

Maybe ebooks could be user configurable

BR
Old 08-29-2014, 09:56 PM   #10
theducks
Well trained by Cats
Posts: 31,240
Karma: 61360164
Join Date: Aug 2009
Location: The Central Coast of California
Device: Kobo Libra2,Kobo Aura2v1, K4NT(Fixed: New Bat.), Galaxy Tab A
Quote:
Originally Posted by BetterRed View Post
Ay, there's the rub, IMO what works well on paper - "blah blah blah . . .! More blah blah." - doesn't always work so well on digital media, especially if it spans two lines as in:

blah blah blah . .
.! More blah blah

On paper I prefer single curly quotes for dialogue, on digital I prefer double curly quotes.

Maybe ebooks could be user configurable

BR
&hellip; is a single char (and is available on the omega icon tool), so no more break worries
Old 08-30-2014, 12:02 AM   #11
Tex2002ans
Wizard
Posts: 2,306
Karma: 13057279
Join Date: Jul 2012
Device: Kobo Forma, Nook
Quote:
Originally Posted by theducks View Post
&hellip; is a single char (and is available on the omega icon tool), so no more break worries
Not necessarily. There is the situation that can occur like this:

Quote:
"etc., etc.
…"
There are also many books that use "four dot ellipsis" or some older books used even more. The ellipsis character does not work well in that situation.

Some fonts also have oddities with the ellipsis character, in which the "dots" don't match your typical period, or do not have similar kerning to the default period, making it look quite odd when they are near each other.

I had a large post typed up covering my personal annoyances with ellipses in EPUBs, but then scrapped it.

Here are some more resources on the topic:

https://english.stackexchange.com/qu...s-for-ellipses
http://www.thebookdesigner.com/2013/...dobe-indesign/
https://tex.stackexchange.com/questi...xetex-document

Different Style Guides and different languages also have different rules.

Ultimately, ellipses are a huge pain in the bottom, and these "Smarten Punctuation" algorithms completely mangle them.

Last edited by Tex2002ans; 08-30-2014 at 12:21 AM.
Old 08-30-2014, 10:56 AM   #12
Notjohn
mostly an observer
Posts: 1,519
Karma: 996810
Join Date: Dec 2012
Device: Kindle
Actually, I think in standard book-making, all ellipses are three characters. When a fourth is added, then it is a full stop (period) and not technically part of the ellipsis. (It could be a question mark, exclamation mark, or even a comma or semi-colon or colon instead.)

I was assuming that three or even four dots without a space between would be regarded as a single word by most or all e-book platforms. Am I wrong about that?

The only case I can think of where a four-dot ellipsis would change under my e-book formula is where the omission comes at the beginning of the following sentence. In a printed book, I would go dot/space/dot/space/dot/space/dot/space, but in an e-book I would go dot/space/dot/dot/dot/space.

Last edited by Notjohn; 08-30-2014 at 10:58 AM. Reason: oops! omitted a dot!
Old 08-30-2014, 11:32 AM   #13
Jellby
frumious Bandersnatch
Posts: 7,570
Karma: 20150435
Join Date: Jan 2008
Location: Spaniard in Sweden
Device: Cybook Orizon, Kobo Aura
Quote:
Originally Posted by Notjohn View Post
I was assuming that three or even four dots without a space between would be regarded as a single word by most or all e-book platforms. Am I wrong about that?
Probably. I've seen too many linebreaks before/after question/quote marks, with or without a hyphen, to keep any faith I initially had in the linebreaking algorithms of ebook readers.
Old 08-31-2014, 07:35 PM   #14
Tex2002ans
Wizard
Posts: 2,306
Karma: 13057279
Join Date: Jul 2012
Device: Kobo Forma, Nook
Quote:
Originally Posted by Notjohn View Post
Actually, I think in standard book-making, all ellipses are three characters. When a fourth is added, then it is a full stop (period) and not technically part of the ellipsis.
In modern English typography... perhaps. Different Style Guides may or may not agree.

Also, the typographer of an older book may have had older Style Guides that completely disagreed with the rules now given by the modern versions. You also have to keep in mind that other languages/countries may have their own typography and Style Guides.

I don't have any samples of these older books on hand, although I do remember it in a handful of Archive.org scans I have digitized.

Side Note: This reminds me of this fascinating article, "Why two spaces after a period isn’t wrong (or, the lies typographers tell about history)". The author goes through even these older Style Guides themselves (like new/older versions of the Chicago Manual of Style) and demolishes this "double-spacing" myth! I assume something similar could be said about ellipses:

http://www.heracliteanriver.com/?p=324

Similarly, asterisks were used along the same lines in the middle of paragraphs:

Quote:
Here is an ending sample sentence. * * * * * Here are some more sentences of text. And continuing.
If I recall correctly, these were used as:
  • Rough way for certain typographers to be able to squeeze/push widows/orphans off of further pages
  • Help "square off" the bottom of pages
  • Section breaks
  • Alternative to show "missing text"

In many cases, it is hard to tell exactly what the typographer was thinking, so it is very hard to "reverse" or "modernize" the decision.

Is this ACTUALLY missing text, or was it a pause, was it a punctuation mark in the original text that is being quoted, was it just a design decision, ...?

Quote:
Originally Posted by Notjohn View Post
I was assuming that three or even four dots without a space between would be regarded as a single word by most or all e-book platforms. Am I wrong about that?
According to my testing, three or four periods in a row with NO SPACES between would be ok. Although as Jellby stated, I have seen the "ellipsis + period" or "period + ellipsis" break according to the linebreak algorithms.

If you wanted to stick with the ellipsis character, you would have to insert a non-breaking space to connect the ellipsis to the period before/after in order for them to stick together. So, "ellipsis + nbsp + period" and "period + nbsp + ellipsis" should work... although again having a space there isn't necessarily the proper way according to certain Style Guides.
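
The "ellipsis + nbsp + period" idea could be sketched roughly as below. This is only an illustration with a hypothetical helper name (`glue_ellipsis`); it inserts the literal U+00A0 character, which would still need to be written out as an entity for the Sigil versions discussed in this thread, and it ignores the Style Guide question of whether a space belongs there at all.

```python
import re

NBSP = "\u00a0"      # no-break space
ELLIPSIS = "\u2026"  # the single ellipsis character


def glue_ellipsis(text: str) -> str:
    """Join an ellipsis character to a neighbouring period with a no-break space."""
    # "period + optional ordinary space + ellipsis" -> "period + nbsp + ellipsis"
    text = re.sub(r"\. ?" + ELLIPSIS, "." + NBSP + ELLIPSIS, text)
    # "ellipsis + optional ordinary space + period" -> "ellipsis + nbsp + period"
    text = re.sub(ELLIPSIS + r" ?\.", ELLIPSIS + NBSP + ".", text)
    return text


print(repr(glue_ellipsis("The end.\u2026 Next")))
print(repr(glue_ellipsis("Wait\u2026 . Next")))
```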

Quote:
Originally Posted by Notjohn View Post
The only case I can think of where a four-dot ellipsis would change under my e-book formula is where the omission comes at the beginning of the following sentence. In a printed book, I would go dot/space/dot/space/dot/space/dot/space, but in an e-book I would go dot/space/dot/dot/dot/space.
And you also have to keep in mind all of the spacing rules for certain punctuation: how do you handle not just periods before/after, but commas, quotation marks, question marks, exclamation points, brackets, parentheses, etc.? The situation gets a lot hairier than you first expect, and many of these cases are hard and have to be decided on a case-by-case basis according to context.

As I said, a huge pain in the butt!

Quote:
Originally Posted by Jellby View Post
Probably. I've seen too many linebreaks before/after question/quote marks, with or without a hyphen, to keep any faith I initially had in the linebreaking algorithms of ebook readers.
Yep, and I believe when I first started, I saw the "ellipsis + period/question mark/exclamation point/quote mark" change into a "period + linebreak + punctuation", which is another reason why I abandoned using the ellipsis character.

To fix this, you would probably have to insert something like a zero-width space, although this would create HIDEOUSLY ugly code...... and devices probably do not have very good support for that, so zero-width spaces will show up as "missing character" boxes or quotation marks. Bleh!

Also, I just thought of another thing devices might break on: SEARCH. It is much easier to search for three or four periods in a row than it is to search for text with an ellipsis character.
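
One possible workaround when checking text yourself, sketched here as an assumption rather than something any reading device is known to do: fold the ellipsis character into three periods on both the text and the query before comparing. The helper names are hypothetical.

```python
ELLIPSIS = "\u2026"


def fold_ellipsis(text: str) -> str:
    """Replace each ellipsis character with three plain periods."""
    return text.replace(ELLIPSIS, "...")


def contains(haystack: str, needle: str) -> bool:
    """Search that treats the ellipsis character and '...' as the same thing."""
    return fold_ellipsis(needle) in fold_ellipsis(haystack)


print(contains("But\u2026 I don't", "But... I"))  # True
print("But... I" in "But\u2026 I don't")          # False: plain search misses it
```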

Last edited by Tex2002ans; 08-31-2014 at 07:55 PM.
Old 09-01-2014, 01:20 AM   #15
BetterRed
null operator (he/him)
Posts: 22,005
Karma: 30277294
Join Date: Mar 2012
Location: Sydney Australia
Device: none
@Tex2002ans - the author of the Heraclitean River blog and I are like-minded in that he and I would prefer 1.5 spaces between sentences rather than one or two.

I just feel more comfortable if the space between sentences exceeds the space between words. I edit to two spaces using a regular space & a non-breaking space. If I wanted 1.5 spaces, what would you suggest I use?

My 'target font' is Times Roman 12 point, if that has anything to do with the price of fish.

Thanks.

BR

Last edited by BetterRed; 09-01-2014 at 01:44 AM.