#1 |
Connoisseur
Posts: 75
Karma: 500000
Join Date: Oct 2011
Location: Utah
Device: iPad
|
Bug with zero-width space Unicode
Well, I wanted to officially submit an issue, but the Sigil website says I should just fix the bug myself and submit a patch. As I'm no programmer, I guess this is the next best thing!
I've got a book with a few paragraphs that are meant to have no spaces in them. I told the ebook formatting people to put zero-width spaces where the word boundaries would normally be, so the lines still wrap. Here is what the HTML looks like in a text editor:
< p>Ah& #8203;but& #8203;they& #8203;were& #8203;left& #8203;behind& #8203;It& #8203;is& #8203;obvious& #8203;from& #8203;then& #8203;ature& #8203;of& #8203;the& #8203;bond& #8203;But& #8203;where& #8203;where& #8203;where& #8203;where& #8203;Setoff& #8203;Obvious& #8203;Realization& #8203;like& #8203;a& #8203;pricity& #8203;They& #8203;are& #8203;with& #8203;the& #8203;Shin& #8203;We& #8203;must& #8203;find& #8203;one& #8203;Can& #8203;we& #8203;make & #8203;to& #8203;use& #8203;a& #8203;Truthless& #8203;Can& #8203;we& #8203;craft& #8203;a& #8203;weapon< /p>
(I put a space in each HTML entity and tag so it will display here, but there's no space in the HTML itself.)
This looks perfect in all ebook readers except amzn-mobi, where we work around the issue with a media query. The problem is how it displays in Sigil. Here is how that paragraph displays in Sigil, in the HTML code view (not WYSIWYG):
< p>AhbuttheywereleftbehindItisobviousfrom thenatureضthebondButwherewherewherewher eSetoff/¢∂ص≥Realizationlikeapricity4®•πarewiththe ShinWemustfindoneCanwemaketouseaTruthl ess#°Æwecraftaweapon< /p>
So as you can see, some of the individual words get changed to garbage characters. (Also, the forum software here is adding some spaces.)
However, this is only the way it displays. The underlying text is fine: Sigil is converting all of the zero-width space HTML entities to actual zero-width space Unicode characters. And the intervening characters that look like garbage above are not actually garbage: if you save the file and look at it in a text editor, those characters look fine. And if I look at it in a hex editor, each zero-width space is E2808B, exactly as expected for the UTF-8 encoding of a zero-width space.
But this is not very useful in Sigil: it looks buggy, as I mentioned above, and it runs all the text together as if the zero-width spaces aren't there. (Running together is how the text should look to the end user, but not in the HTML code view, which should give some indication that the zero-width spaces are there so an editor can do something with them.) |
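For anyone who wants to repeat the hex-editor check without a hex editor, here is a minimal Python sketch. It only confirms that U+200B encodes to the bytes E2 80 8B in UTF-8 and counts how many zero-width spaces a chunk of saved text contains. The function name `zwsp_report` and the `|` marker are made up for illustration; nothing here comes from Sigil itself.

```python
ZWSP = "\u200b"  # ZERO WIDTH SPACE, the character that the entity &#8203; decodes to


def zwsp_report(data: bytes) -> dict:
    """Count zero-width spaces in UTF-8 bytes and make them visible.

    Hypothetical helper: the name and the '|' marker are illustrative choices.
    """
    text = data.decode("utf-8")
    return {
        "count": text.count(ZWSP),
        # Swap each invisible character for a visible marker so a human can see it.
        "visible": text.replace(ZWSP, "|"),
    }


# The UTF-8 encoding of U+200B, as hex digits.
zwsp_hex = ZWSP.encode("utf-8").hex()

# A short stand-in for the saved paragraph.
sample = "Ah\u200bbut\u200bthey".encode("utf-8")
report = zwsp_report(sample)
```

On a real book you would read the bytes with something like `open(path, "rb").read()`, where `path` is whichever XHTML file you saved out of Sigil.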
#2 |
Wizard
Posts: 4,520
Karma: 121692313
Join Date: Oct 2009
Location: Heemskerk, NL
Device: PRS-T1, Kobo Touch, Kobo Aura
|
This is most probably a bug in Qt, the underlying framework. If so, there is not much you can do. How does it look in the Calibre viewer?
|
#3 |
Connoisseur
Posts: 75
Karma: 500000
Join Date: Oct 2011
Location: Utah
Device: iPad
|
Even if there is a bug in Qt, I think a better default behavior would be for & #8203; to be left as-is and not converted to straight Unicode, the same way & #160; is left as-is.
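As a rough workaround sketch (not something Sigil does, and not a patch for it), a small Python script could turn the literal characters back into numeric entities after saving, so they are visible again in any code view. The set `INVISIBLE` and the function name `preserve_entities` are assumptions made up for this example; extend the set with whatever characters you want kept as entities.

```python
import html

# Characters to write back out as numeric entities (an assumed, illustrative list):
# U+200B ZERO WIDTH SPACE and U+00A0 NO-BREAK SPACE.
INVISIBLE = {"\u200b", "\u00a0"}


def preserve_entities(text: str) -> str:
    """Replace each listed character with its decimal numeric entity.

    Hypothetical helper: every other character is passed through untouched.
    """
    return "".join(f"&#{ord(ch)};" if ch in INVISIBLE else ch for ch in text)


# Text as it would look after the entities were converted to raw characters.
converted = "Ah\u200bbut\u00a0they"
restored = preserve_entities(converted)

# Decoding the entities again gives back the same characters, so nothing is lost.
round_trip = html.unescape(restored)
```

Because the entity and the raw character mean the same thing to a reader app, this changes only what the source looks like, not how the book renders.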
|
#4 |
Grand Sorcerer
Posts: 28,358
Karma: 203720150
Join Date: Jan 2010
Device: Nexus 7, Kindle Fire HD
|
I don't believe there are going to be any modifications to Sigil anytime soon. It's at a very "if-it-doesn't-work-for-you-as-is-(and-you-can't-change-it-yourself)-you're-probably-going-to-want-to-use-something-else" stage in its development.
|
#5 | |
Bookmaker & Cat Slave
Posts: 11,503
Karma: 158448243
Join Date: Apr 2010
Location: Phoenix, AZ
Device: K2, iPad, KFire, PPW, Voyage, NookColor. 2 Droid, Oasis, Boox Note2
|
Quote:
Hitch
Last edited by Hitch; 02-27-2014 at 02:19 AM. Reason: Snipped for some brevity. |
|
#6 |
Color me gone
Posts: 2,089
Karma: 1445295
Join Date: Apr 2008
Location: Central Oregon Coast
Device: PRS-300
|
Just before they gave up, the developers of Sigil seesawed back and forth on how to do this, and I don't think anyone was happy with any iteration of it. If it works for one group, it doesn't for another, particularly if they are dumping web page junk into it.
|
#7 |
Connoisseur
Posts: 75
Karma: 500000
Join Date: Oct 2011
Location: Utah
Device: iPad
|
"Just before they gave up"? Oh. I wasn't aware it's essentially abandoned. Well, it does work quite well for what it does. I guess if I need zero-width spaces in the future, I'll use BBEdit for those bits.
|
#8 | |
Ex-Helpdesk Junkie
Posts: 19,421
Karma: 85400180
Join Date: Nov 2012
Location: The Beaten Path, USA, Roundworld, This Side of Infinity
Device: Kindle Touch fw5.3.7 (Wifi only)
|
Making epub happen: Sigil's Spiritual Successor
Quote:
Last edited by eschwartz; 02-28-2014 at 01:20 AM. |
|
#9 | |
Bookmaker & Cat Slave
Posts: 11,503
Karma: 158448243
Join Date: Apr 2010
Location: Phoenix, AZ
Device: K2, iPad, KFire, PPW, Voyage, NookColor. 2 Droid, Oasis, Boox Note2
|
Quote:
Hitch |
|