07-13-2021, 02:09 PM   #11
Tex2002ans
Quote:
Originally Posted by arakish
What I know how to do is write the Unicode characters in this format:
Where/How are you initially writing these documents?

Sigil automatically handles the UTF-16 -> UTF-8 conversion upon opening.

... but it would probably be better to keep your source documents in UTF-8 in the first place.
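If you ever need to do that conversion outside of Sigil, here is a minimal Python sketch. It assumes the source bytes start with a UTF-16 byte order mark; the function name utf16_to_utf8 is just made up for illustration.

```python
import re


def utf16_to_utf8(data: bytes) -> bytes:
    """Re-encode a UTF-16 document (assumed to start with a BOM) as UTF-8."""
    # The BOM tells Python the byte order and is dropped from the decoded text.
    text = data.decode("utf-16")
    # Update the XML declaration so it matches the new encoding.
    text = re.sub(
        r"""encoding=(["'])utf-16\1""",
        r"encoding=\g<1>UTF-8\g<1>",
        text,
        count=1,
        flags=re.IGNORECASE,
    )
    return text.encode("utf-8")
```

To convert a file, read it in binary mode, pass the bytes through the function, and write the result back out in binary mode.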

Quote:
Originally Posted by arakish
I do it using the &#[number]; format for HTML web documents. Tried it in Sigil. Worked until saving the file with UTF-16.

Using the UTF-8, and Sigil will not even show the characters.

Thus, next question: Is there software that would convert a Unicode number such as "&#127766;" (Waning Gibbous Moon) into the UTF-8 equivalent?
Sigil handles/displays all those characters perfectly fine.

If you typed the HTML Entities in your original code:
  • &#127764; = WAXING GIBBOUS MOON
  • &#127766; = WANING GIBBOUS MOON

Sigil helpfully converts everything into the actual, human-readable characters:
  • 🌔 (U+1F314) = WAXING GIBBOUS MOON
  • 🌖 (U+1F316) = WANING GIBBOUS MOON
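The same kind of conversion can be reproduced outside Sigil with Python's standard library. This is only a sketch of the idea, not what Sigil does internally: html.unescape turns a numeric reference into the character, and encoding that character gives the UTF-8 bytes.

```python
import html

# Decimal and hex references decode to the same character.
moon = html.unescape("&#127766;")
same_moon = html.unescape("&#x1F316;")

# Print the code point and the UTF-8 bytes that end up in the file.
print(f"U+{ord(moon):04X}", moon.encode("utf-8"))
# prints: U+1F316 b'\xf0\x9f\x8c\x96'
```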

Everything is converted to its actual character except:
  • &gt; = Greater Than
  • &lt; = Less Than
  • &amp; = Ampersand
  • &nbsp; or &#160; = Non-Breaking Spaces
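A rough Python sketch of that "convert everything except the special ones" rule, in case you want to clean up files in bulk. The KEEP set and the expand_numeric_refs name are my own assumptions for illustration; named entities are simply left untouched because the pattern only matches numeric references.

```python
import re

# Code points left escaped (assumption mirroring the list above):
# ampersand, less-than, greater-than, non-breaking space.
KEEP = {0x26, 0x3C, 0x3E, 0xA0}

# Matches &#xHEX; or &#DECIMAL;
NUMERIC_REF = re.compile(r"&#(?:[xX]([0-9A-Fa-f]+)|([0-9]+));")


def expand_numeric_refs(markup: str) -> str:
    """Replace numeric character references with the characters themselves."""

    def repl(match):
        hex_digits, dec_digits = match.groups()
        cp = int(hex_digits, 16) if hex_digits else int(dec_digits)
        is_surrogate = 0xD800 <= cp <= 0xDFFF
        if cp in KEEP or cp == 0 or cp > 0x10FFFF or is_surrogate:
            return match.group(0)  # leave the reference exactly as written
        return chr(cp)

    return NUMERIC_REF.sub(repl, markup)
```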

Quote:
Originally Posted by arakish
⅜ is the Fractional Three-Eighths character, but Sigil will only show them if I use UTF-16 or UTF-32 in the XML tag.
Not a good idea to use Vulgar Fractions.

See my post in 2019: "I'm assuming it's the font's fault, but just in case ..."

Quote:
Originally Posted by arakish
No not any asian script language. I want to use the characters on this Code Chart or this one for example. There are other Code Charts I want to use, but it seems Sigil will only show such with decimal numbers below 1024, perhaps 2048.
You can enter the hex or decimal form, and Sigil will automatically convert to the characters for you...
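The hex and decimal forms are just two spellings of the same code point, and values far above 1024 or 2048 are fine. A small Python sketch to get both spellings for any character (numeric_refs is a made-up helper name):

```python
def numeric_refs(ch: str):
    """Return the (hex, decimal) numeric character references for one character."""
    cp = ord(ch)
    return f"&#x{cp:X};", f"&#{cp};"


# U+1F316 WANING GIBBOUS MOON, written as an escape so the source stays ASCII.
hex_ref, dec_ref = numeric_refs("\U0001F316")
print(hex_ref, dec_ref)
# prints: &#x1F316; &#127766;
```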

Or even better:

You can insert the character directly using your OS's Character Map (or a similar program). Personally, on Windows, I like to use BabelMap.

Or copy/paste characters from Fileformat.info's Unicode Search. For example, here was my search for "Gibbous Moon".

Quote:
Originally Posted by KevinH
Using rarely seen unicode characters in an epub will almost always require embedding a font that supports it so that readers can show it properly.


I can guarantee a symbol like:

🜊 (U+1F70A) = ALCHEMICAL SYMBOL FOR VINEGAR

doesn't exist in ereaders' fonts.

Follow coding practices similar to the ones I showed in the Japanese font thread. Do something like:

Code:
Vinegar <span class="alchemy">🜊</span> is an acidic thing.
then embed a font specifically for those symbols.

Symbola is a font that contains many of those obscure symbols.
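If the book has many such symbols, wrapping each one by hand gets tedious. Here is a rough Python sketch, assuming every character in the Alchemical Symbols block (U+1F700 to U+1F77F) should get the "alchemy" class. wrap_alchemy is a made-up helper, and it is naive: run it on plain text only, not on markup that is already wrapped.

```python
import re

# One or more consecutive characters from the Alchemical Symbols block.
ALCHEMICAL = re.compile("[\U0001F700-\U0001F77F]+")


def wrap_alchemy(text: str) -> str:
    """Wrap each run of alchemical symbols in <span class="alchemy">."""
    return ALCHEMICAL.sub(
        lambda m: f'<span class="alchemy">{m.group(0)}</span>', text
    )
```

The CSS for the "alchemy" class would then point at the embedded font.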

Last edited by Tex2002ans; 07-13-2021 at 03:12 PM.