Old 01-18-2019, 07:53 PM   #29
Tex2002ans
Quote:
Originally Posted by Psymon
..and as soon as I do it comes out exactly as I wanted, the lig doesn't render out.

But then when I save my book, it disappears!

Or, rather, it doesn't disappear, but the "visible" character entity disappears, and yet it's still there, still preventing my lig to render out. It's like that character entity gets converted into an actual zero-width joiner.
That's exactly what happens: Sigil/Calibre convert most "named entities" to the actual Unicode character. And since the zero-width joiner is "invisible" (zero-width)... that makes it hard to find. :P

In Sigil, you could go under Edit > Preferences > Preserve Entities, and add:

Code:
&zwj;
to the list. That should preserve it on open/save.

In EPUB3, HTML "named entities" are not allowed (apart from the five that XML itself defines, such as &amp;), so you have to use the numeric form instead. In decimal:

Code:
&#8205;
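To make the difference between the forms concrete, here is a small Python sketch. It is not part of Sigil or Calibre, and the helper name to_decimal_refs is made up for illustration. It shows the named entity resolving to the invisible character, and one way to rewrite that character as the decimal reference that EPUB3 accepts.

```python
import html

ZWJ = "\u200d"  # ZERO WIDTH JOINER, decimal 8205


def to_decimal_refs(text, chars=(ZWJ,)):
    """Replace each listed character with its decimal numeric reference.

    Hypothetical helper, only for illustration.
    """
    for ch in chars:
        text = text.replace(ch, "&#{};".format(ord(ch)))
    return text


# The named entity resolves to the actual (invisible) character,
# which is what an editor does when it "converts" the entity.
resolved = html.unescape("&zwj;")
print(resolved == ZWJ)

# Rewriting the invisible character makes it visible in the source again.
sample = "\u017f" + ZWJ + "t"  # long s, joiner, t
print(to_decimal_refs(sample))
```

Running something like this over the XHTML before opening it in an editor only helps if the editor is also told to preserve the reference; otherwise it may be converted back on save.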

Quote:
Originally Posted by Psymon
Definitely not so in my epub, though -- as I said, in there it's literally invisible, with no space or anything in-between the long-s and t characters.
Yeah, it becomes a pain to find invisible characters.

You can see it exists if you do a Tools > Reports > Characters in HTML Files.

Or you can do a Regex search and use this:

Search: \x{200D}

That's the hex code point (U+200D) for a zero-width joiner.

See Regular-Expressions.info's article on "Non-Printable Characters".
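The same check can be scripted outside Sigil. Below is a rough Python sketch, assuming you already have a chapter's text in a string; the function name find_zwj is invented here. It searches for the literal U+200D character (in Python source that is written "\u200d" rather than the \x{200D} form used in Sigil's search box) and swaps in a visible marker so you can see where it sits.

```python
import re

# Pattern matching the literal zero-width joiner character (U+200D).
ZWJ_PATTERN = re.compile("\u200d")


def find_zwj(text, context=5):
    """Return (offset, surrounding text) for every zero-width joiner.

    Hypothetical helper, only for illustration.
    """
    hits = []
    for m in ZWJ_PATTERN.finditer(text):
        start = max(0, m.start() - context)
        snippet = text[start:m.end() + context]
        # Swap the invisible character for a visible marker in the snippet.
        hits.append((m.start(), snippet.replace("\u200d", "<ZWJ>")))
    return hits


sample = "the la\u017f\u200dt word"  # long s + joiner + t
for offset, snippet in find_zwj(sample):
    print(offset, snippet)
```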

Quote:
Originally Posted by Psymon
Oh, how weird. When I copy/paste that bit of code here into this forum post, and then preview my post (and only after I preview my post, not before), that zero-width joiner gets converted to an asterisk!
Yeah, who knows what strange ways these characters will interact with MobileRead's forum posts.

Usually you can slap a noparse tag:

Code:
[noparse][/noparse]
around stuff to stop the forum from adding smilies (and other crap) in specific locations.

Side Note: A while back I had an issue with obscure Unicode characters disappearing when using "Quick Reply" or when editing my posts. Using "Advanced Reply" preserved a lot of the characters.

This Unicode 7.0 post I wrote was the culprit.

I doubt that bug has been fixed... I doubt many people are writing these characters in their posts, so the priority gets pushed down to the bottom. :P

* * *

Side Note: Those extremely interested in ligatures may also want to check out this speech, "Selective Ligature Suppression", given at TUG 2018.

It's based on a LaTeX package that was created, but a lot of the discussion applies to ligatures/typography more broadly.

While ligatures are fine in most cases, there are rare exceptions where they should be suppressed along "morpheme boundaries". Examples:
  • "shelfful" should have the "ff" ligature suppressed
  • "clifflike" should have "ffl" suppressed, but allow "ff"
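In an ebook, one way to mark such a boundary is to put a zero-width character between the two letters. The sketch below only illustrates the idea and is not what the LaTeX package from the talk does: the function name suppress_ligature and the "|" boundary marker are made up, and it uses the zero-width non-joiner (U+200C), which Unicode describes as the character for breaking ligatures. Whether a given reader honors it depends on the font and renderer.

```python
ZWNJ = "\u200c"  # ZERO WIDTH NON-JOINER


def suppress_ligature(marked_word, marker="|"):
    """Turn a hand-marked morpheme boundary into a zero-width non-joiner.

    Hypothetical helper: "shelf|ful" puts the break between the two f's,
    while "cliff|like" puts it after "ff", so only "ffl" is blocked.
    """
    return marked_word.replace(marker, ZWNJ)


for marked in ("shelf|ful", "cliff|like"):
    word = suppress_ligature(marked)
    # Show the code points so the invisible character can be seen.
    print(marked, "->", " ".join("U+{:04X}".format(ord(c)) for c in word))
```

The boundaries are marked by hand here; finding them automatically would need a word list or morphological rules, which is the hard part the talk deals with.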

(Also another blog post I've been planning to write. Summaries of every talk I watch. :P)

Side Note to Side Note: And honestly, the actual title on Youtube is "TUG 2018 - Conference - Mico Loretan"... makes it absolutely impossible to find via search, and you have zero clue wtf the speech is unless you watched it.

That issue seriously plagues a lot of these obscure talks, so I hope to make them much more findable by promoting/summarizing them. :P

Last edited by Tex2002ans; 01-18-2019 at 08:17 PM.