Automatic change of coded HTML Entities into special chars in Sigil 0.9.9 upon saving
Hi all!
I am now dealing with a very special book that uses a lot of math characters. Most of them are embedded in the fonts we use, like ¬, ∈, ∀ or ∃. But some are not, e.g. ℕ (the symbol for the natural numbers), so they are not visible in readers like ADE or Calibre, or on devices. I have now used an HTML entity code for this in Sigil: ℕ[semicolon] (I am writing it like this because you would see the symbol otherwise). Sigil presents it as the symbol given above, but after I save the file, Sigil automatically converts the code into a special character such as this: Attachment 167848 (alt="bold, empty N"). This character is not visible in ADE, and I do not know how to force Sigil not to convert the code into special characters. Can this be done at all in this version? Thanks, Bart |
Have already found the answer here:
https://www.mobileread.com/forums/sh...d.php?t=277821 Thanks and sorry to bother you. Bart |
No problem. But keep in mind that there's really no difference between the Unicode character and the HTML entity in terms of what ADE is inherently capable of displaying. The glyphs are either included in the available fonts on the reader (or in the app), or they are not. If you're not using embedded fonts for certain characters, using the HTML entity for those characters will not make them magically show up where the same Unicode character fails to display. If the glyph is not included in ADE's default fonts, embedding is the only way you can guarantee that it displays properly.
In short: Sigil changing your HTML entity to a Unicode character is not the reason the character doesn't display in different rendering engines. The characters don't display because the font sets in those engines don't include the glyph for the character you want to show. You need to embed a font that includes all the special characters you require for your content. |
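To make the point above concrete, here is a minimal Python sketch (not from Sigil or ADE; the helper names `decode` and `same_character` are made up for illustration). It uses the standard library's `html.unescape` to show that a decimal reference, a hex reference and the literal character all decode to the same code point, so a renderer ends up looking for the same glyph either way.

```python
import html

# Three spellings of one character: U+2115 DOUBLE-STRUCK CAPITAL N.
DECIMAL_ENTITY = "&#8469;"
HEX_ENTITY = "&#x2115;"
LITERAL_CHAR = "\u2115"


def decode(markup: str) -> str:
    """Resolve character references roughly the way an HTML parser would."""
    return html.unescape(markup)


def same_character(*spellings: str) -> bool:
    """Return True if every spelling decodes to one identical string."""
    return len({decode(s) for s in spellings}) == 1


if __name__ == "__main__":
    # All three spellings collapse to the same code point after parsing,
    # so the font lookup that follows is identical in every case.
    print(same_character(DECIMAL_ENTITY, HEX_ENTITY, LITERAL_CHAR))
    print(hex(ord(decode(DECIMAL_ENTITY))))
```

Whether the glyph then shows up depends only on the fonts available to the renderer, which is why embedding is the fix.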
Quote:
https://stixfonts.org/ They include every obscure math symbol you'll ever need. Last year someone had issues with the Alef ℵ in their math document, and I gave examples of some code + font embedding: https://www.mobileread.com/forums/sh...44#post3575244 Just make sure all your math/variables are properly marked up with classes. |
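Before picking a font to embed, it can help to know exactly which special characters the book uses. The following is a rough Python sketch under my own assumptions (the function names are hypothetical, and it simply treats everything outside ASCII as "special"); it only lists the characters, it does not check any font.

```python
import unicodedata
from collections import Counter


def special_characters(text: str) -> Counter:
    """Count every character outside the ASCII range in the given text."""
    return Counter(ch for ch in text if ord(ch) > 127)


def report(text: str) -> list[str]:
    """Return one line per special character: code point, name, count."""
    lines = []
    for ch, count in sorted(special_characters(text).items()):
        name = unicodedata.name(ch, "UNNAMED")
        lines.append(f"U+{ord(ch):04X} {name} x{count}")
    return lines


if __name__ == "__main__":
    # In practice you would pass in the contents of each XHTML file;
    # markup tags are plain ASCII, so they are ignored automatically.
    sample = "<p>\u2200n \u2208 \u2115: \u00ac\u2203m \u2208 \u2115</p>"
    for line in report(sample):
        print(line)
```

The resulting list is what the embedded font needs to cover.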
I understand that automatically converting all HTML entities to their Unicode characters makes the source more readable.
However, if we have written an HTML entity (and more specifically a Unicode entity), it is because we want it as it is. So seeing them disappear automatically is a bad surprise. This is especially troublesome for alternative space characters ("Narrow No-Break Space", for example) and technical characters. It would be nice to have an option to disable the automatic conversion of entities completely (or partially: only Unicode entities are kept, the others are converted). For me, requiring the user to enter the exceptions manually, with no other alternative, is not a good feature. |
It is what it is. There will be no change in this regard. The manual entry of the entities you wish to preserve is the ONLY way we can do it. We didn't decide willy-nilly to convert all entities to characters. The manual method to preserve them WAS the compromise. The only alternative is having no choice whatsoever.
|
Then add them to your preserve entities setting and all will be fine. Use only numeric entities for EPUB 3, as named entities are no longer allowed under HTML5.
Nothing is lost, as the conversion to and from entities is a one-to-one mapping. Your settings will be respected.
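As an illustration of the one-to-one mapping, here is a small Python sketch (my own, not Sigil's code; `named_to_numeric` and `char_to_numeric` are hypothetical helpers). It rewrites named entities as decimal numeric ones using the standard `html.entities.name2codepoint` table, which only covers the HTML 4 names, and it leaves the five XML-predefined names alone on the assumption that they stay valid in XHTML.

```python
import html
import re
from html.entities import name2codepoint

# Names that XML itself predefines; assumed safe to keep as they are.
XML_PREDEFINED = {"amp", "lt", "gt", "quot", "apos"}

_NAMED_ENTITY = re.compile(r"&([A-Za-z][A-Za-z0-9]*);")


def named_to_numeric(markup: str) -> str:
    """Rewrite known named entities (e.g. &nbsp;) as decimal references."""
    def swap(match: re.Match) -> str:
        name = match.group(1)
        if name in XML_PREDEFINED or name not in name2codepoint:
            return match.group(0)  # leave unknown or predefined names untouched
        return f"&#{name2codepoint[name]};"
    return _NAMED_ENTITY.sub(swap, markup)


def char_to_numeric(text: str, chars: set[str]) -> str:
    """Replace each listed literal character with its decimal reference."""
    return "".join(f"&#{ord(c)};" if c in chars else c for c in text)


if __name__ == "__main__":
    print(named_to_numeric("a&nbsp;b &amp; &forall;x"))
    encoded = char_to_numeric("10\u202fkm", {"\u202f"})  # Narrow No-Break Space
    print(encoded)
    # Decoding gives back the original text, so nothing is lost either way.
    print(html.unescape(encoded) == "10\u202fkm")
```

The round trip at the end is the reason the conversion is harmless: the entity and the character carry exactly the same information.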
|
If I may be permitted to digress for a moment...
How are people writing up math? I initially thought MathML would be an option, but after trying a few tests with that and SVG, I ended up using MS Word and then creating a png image. I find anything more complicated than a single, in-one-line equation with some sub- or superscripts is way beyond my HTML coding capabilities. |
There was a nice GUI equation editor that was part of OpenOffice/LibreOffice that allowed the user to build the equation and then save the final result to MathML. I think there was a standalone Java (not JavaScript) program that did something similar.
Perhaps one or both still exist. They make creating SVG, PNG, and MathML versions of an equation quite simple. Sorry, I can't remember what they were called, but a Google search should help.
|
Quote:
That'll get you started. Albert |
Quote:
That used LibreOffice Math -> PDF -> PNG. Nowadays, I use LaTeX -> PDF -> PNG (I explained those methods further in the topic). Doing it this way allows you to easily generate proper vector + bitmap images directly from the source. And in the future, as MathML support gets better, you have your equations sitting in a nice source format and can (easily) convert to MathML.
Note: MathML currently only works in certain readers + newer devices. You have to keep in mind all the old devices out there (and if you want Kindle, that's a no go). So you have to create these bitmap fallback images anyway.
Note #2: At ebookcraft 2018, Peter Krautzberger also gave a workshop, "Equations in ebooks" (slides are available, but no audio/video is online). His presentation came to pretty similar conclusions (needing all the fallbacks because of all the different/ancient devices out there). He discusses slightly different methods/tools, and compatibility tests with different readers/renderers. |
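For anyone who wants to script a LaTeX -> PDF -> PNG chain, here is one possible Python sketch. It is not the exact setup described in the post above: the `standalone` document class, the use of `pdflatex` and Poppler's `pdftoppm`, the output folder and all function names are my own assumptions, and both tools have to be installed and on the PATH for `render` to work.

```python
import shutil
import subprocess
from pathlib import Path

# Assumed wrapper: the 'standalone' class crops the page to the equation.
TEMPLATE = r"""\documentclass[preview,border=2pt]{standalone}
\usepackage{amsmath,amssymb}
\begin{document}
$\displaystyle %s$
\end{document}
"""


def standalone_source(latex: str) -> str:
    """Wrap one equation in a minimal, self-cropping LaTeX document."""
    return TEMPLATE % latex


def build_commands(name: str, out_dir: str, dpi: int = 300) -> list[list[str]]:
    """Return the two external commands: .tex -> .pdf, then .pdf -> .png."""
    out = Path(out_dir)
    return [
        ["pdflatex", "-interaction=nonstopmode",
         f"-output-directory={out}", str(out / f"{name}.tex")],
        ["pdftoppm", "-png", "-r", str(dpi), "-singlefile",
         str(out / f"{name}.pdf"), str(out / name)],
    ]


def render(name: str, latex: str, out_dir: str = "equations", dpi: int = 300) -> Path:
    """Write the source file, run both tools, and return the PNG path."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    (out / f"{name}.tex").write_text(standalone_source(latex), encoding="utf-8")
    for cmd in build_commands(name, out_dir, dpi):
        if shutil.which(cmd[0]) is None:
            raise RuntimeError(f"{cmd[0]} was not found on PATH")
        subprocess.run(cmd, check=True)
    return out / f"{name}.png"


if __name__ == "__main__":
    # Only prints the planned commands; call render() to actually run them.
    for cmd in build_commands("quadratic-formula", "equations"):
        print(" ".join(cmd))
```

Keeping the LaTeX string as the single source means the PDF and PNG can be regenerated at any resolution later, and the same source is still there if a MathML conversion becomes worthwhile.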
Quote:
1. When you open your DOCX with equations and use Save As HTML, Word exports tiny PNGs. (As far as I know, you don't even have control over the final image size, etc.) Toxaris can probably pop in and explain the details... (I think automated MathML export is only available via APIs and can't be done directly within Word, but don't quote me on that. See "How to parse mathML in output of WordOpenXML?" on Stack Exchange.)
Note: I highly recommend reading all the articles by MurrayS3 on Microsoft's blog about OfficeMath. He's one of the chief engineers behind adding/enhancing the Equation Editor over the decades, and he has all the technical details.
2. Toxaris's EPUBTools can also export MathML+SVG for you. This is the easiest/best way I know of currently.
The subpar thing about those two workflows is that all the equations will be numbered sequentially:
When you fully control the entire workflow from the start, you can give each equation human-readable names (VERY important when going to edit/add/change books in the future), and can control ALL the variables separately:
This allows you to easily regenerate whatever materials you need. Quote:
It's extremely hard to even get real info from Amazon about MathML (What are the EXACT books that use this? And how is it done?). Here's what the Digital Reader said about it in April: Quote:
From what I've gathered over the months, it only works on the Kindle for PC with NVDA installed. And it uses a horrendously outdated/buggy/crippled MathJax. And there's STILL no way to limit the files to only modern devices (or even Kindle for PC only). You'll STILL have to code all the fallbacks for normal KF8+KF7. Overall, I agree with Peter's quote above. To say "it's supported" is a joke. |