MobileRead Forums, Sigil forum
Thread: Why, Sigil, Why? And can I change you? (https://www.mobileread.com/forums/showthread.php?t=341176)

hobnail 08-23-2021 06:12 PM

Quote:

Originally Posted by Sarmat89 (Post 4148542)
H tags are useless for headings: they were designed for technical documentation "2.1.1 Blah blah" and not for books.

Right, but I thought they were also for making text bold? :D

Binchen 08-23-2021 06:15 PM

Quote:

Originally Posted by Hitch (Post 4148665)
Took the words right out of my mouth.

Now I have a catchy tune :rolleyes:

Tex2002ans 08-23-2021 08:04 PM

Quote:

Originally Posted by Notjohn (Post 4148600)
Most writers emphasize with italics, and reserve bold for the most minor sort of breakhead.

... in English.

That's another reason why <em> was introduced:

English (and most other Latin-alphabet languages) tends to use italics for emphasis.

But many Asian scripts (Chinese and Japanese, for example) don't have such a thing as italics. They emphasize using dots (and other symbols).

As the internet spread across the world, there were more and more cases where <i> and <b> failed.
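
A rough sketch of what that buys you, assuming a reading system that supports the CSS text-emphasis property (support varies a lot between readers, so treat this as illustrative only). The sample sentences are made up for the example. The same <em> gets italics in English and emphasis dots in Japanese:

```html
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8"/>
<title>em across languages</title>
<style>
/* Default (Latin-script) emphasis: italics. */
em { font-style: italic; }

/* Japanese text: no italics, emphasis dots instead. */
em:lang(ja) {
  font-style: normal;
  -webkit-text-emphasis: filled dot; /* prefixed form for older WebKit-based readers */
  text-emphasis: filled dot;
}
</style>
</head>
<body>
<p>I <em>really</em> mean it.</p>
<p lang="ja">これは<em>重要</em>です。</p>
</body>
</html>
```

The markup says "this is emphasized" and the stylesheet decides, per language, what emphasis looks like.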

Side Note #1: For emphasis, along with italics, there's been lots of different display types:
  • bold
  • letter-spacing
  • highlight
  • colors
  • [...]

For a little more info, see Wikipedia: "Emphasis (Typography)".

But you can clearly see how E M P H A S I S is a distinct category from plain old italics.

Side Note #2: A lot of the multi-lingual stuff was being discussed at the end of the very last thread:

"<i>, <em> or <span> for italics ?" (Post #151+)

There was a TUG talk given in 2020: "Typographical expression of emotions in a variety of alphabet systems"... I should re-contact them and see if they ever posted the lecture online. Sadly, I signed up for the conference, but missed the livestream of that talk.

Quote:

Originally Posted by Notjohn (Post 4148600)
To me, it's a distinction without a difference. I prefer i and b because they do what they say, while em and strong are ambiguous.

<em> is not ambiguous.

But I agree, HTML5's explanation of <strong> is a little... ehhhh...

Quote:

Originally Posted by Notjohn (Post 4148600)
Lots of html is nutty. For example, the claimed distinction between an apostrophe and a single close-quote. One shrugs one's shoulders and gets on with it.

Again, multi-lingual.

See the fantastic article me+Toxaris (and others) always point to:

Wikipedia: "Quotation mark" > Summary Table

Around the world, there's every single combination under the sun for quote marks.

Just so happens to be, in English, the "apostrophe" + "right single quote" (single "close quote") settled on using the same-looking symbol.

Just like it just so happens to be, in English, emphasis settled on italics. :D

Quote:

Originally Posted by AlanHK (Post 4148558)
If you're doing scientific articles in epub, all respect to you, but that's a pretty esoteric market.

Not necessarily. The academic + scientific market is pretty huge. Minority, yes, but large nonetheless.

And it's an example where lots of proper markup is used. Just because you haven't seen it doesn't mean it's not out there (or that it shouldn't be strived towards).

Quote:

Originally Posted by AlanHK (Post 4148558)
I meant, they have never discussed repurposing the text (except from print to ebook, or for webpages, which is trivial) or converting to audio automatically -- for audiobooks, they use a human reader, which is by far the best result.

They just care what it looks like.

And when their crappily-designed, not-following-the-standards, "I just care about the looks" ebook breaks a few years down the line? You'll be back to cleaning it up. :D

A lot of my work is cleaning up 10+-year-old EPUBs that were poorly converted, bringing them up to date + following the latest standards.

Another large chunk is compilations: taking chapters from multiple books and combining them into a single book. (When you combine books, you typically have to normalize the texts so they are consistent with each other. [Like all UK spelling/quotes -> US spelling/quotes.])

Or releasing a single, larger chapter as a standalone.

(My latest 3 projects were 1 in each of these categories!)

Quote:

Originally Posted by AlanHK (Post 4148558)
And I have never, in 30 years, had an author who was able to use Word styles usefully. Trying to educate them about XML is just unthinkable.

Well, that's your issue. I've successfully converted quite a few authors into Styles after teaching them how much better it is.

(Seriously, it only takes ~30 minutes to watch a few of those Styles videos... and it'll save any author TONS of time.)

Yes, you'll still have 99% of the crowd mindlessly clicking the BOLD + CENTER + FONT SIZE buttons (hundreds of times)... but once you unleash the power of Styles, wow.

See my most recent Styles post on Reddit, and my Styles posts from 2019/2020.

If it's people who will be working with me over the years... Styles just saved them+me hundreds of hours of work. And the more they plan on writing, the more and more time Styles will save. :)

There's absolutely no reason NOT to use Styles!

And if I can take it down from 99% to 98% people, wow... take one down, pass it around, 98% who don't use Styles on the wall! :D

Quote:

Originally Posted by AlanHK (Post 4148558)
I learnt HTML back in the 90s, did it for years before I learned any CSS, so I use all those as appropriate. And I much prefer e.g. just <h2> for chapter heads than a p or div class; not least because it more or less works with no CSS and lets me generate a TOC in Sigil.

And another aim of proper markup is progressive enhancement.

For example, I always converted all tables into actual <table>, but I had no idea about <thead> until recently.

<thead> allows long tables (or tables that get split across pages) to carry headings over + allows the computer to understand main headings (instead of trying to guess).

Accessibility Ranking

So, let's say we were going from worst -> best.

An image of a table is worst:

Code:

<img alt="" src="../Table1.png"/>

Adding alt text is better:

Code:

<img alt="Table 1" src="../Table1.png"/>

Converting to <table> is much better:

Code:

<table>
  <tr>
    <th>Name</th>
    <th>Age</th>
  </tr>
  <tr>
    <td>Tex</td>
    <td>999</td>
  </tr>
  <tr>
    <td>Example</td>
    <td>123</td>
  </tr>
</table>

and this is best:

Code:

<table>
  <thead>
    <tr>
      <th scope="col">Name</th>
      <th scope="col">Age</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>Tex</td>
      <td>999</td>
    </tr>
    <tr>
      <td>Example</td>
      <td>123</td>
    </tr>
  </tbody>
</table>

If we use a % Accessibility score, you might have:
  • 0%: <img>
    • A blind person (and/or search engine) will have zero clue what this image is.
  • 25%: <img> with alt
  • 90%: <table>
    • Text-to-Speech can actually read all the data.
    • A blind reader can navigate all the data, going forward/back if needed.
    • You can make the font as large as you need.
  • 100%: <table> with <thead> + scope
    • Everything will be read out loud correctly (like a human would)
    • + it works across split pages (very common on small cellphone screens)!

So, if we look at the ebook as a whole, the items in that little list I gave in Post #21 are the major fixes, and probably get you 75% of the way there:
  • Headings allow users to easily jump around the book
  • Tables allow all data to be read
  • lang allows spelling/dictionary-lookup/hyphenation/text-to-speech to be correct.
  • [...]

Then all of those little Accessibility/markup enhancements slowly add up... taking care of edge cases + alternate ways of reading.

The code in the ebook is much more than just surface visuals.
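
To make the lang item from that list concrete, here is a minimal sketch of an XHTML chapter file, assuming an English book with one French phrase. The title, heading, and sentence are made up for the example. The book-wide language goes on <html>, and any foreign phrase gets its own lang:

```html
<html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en">
<head>
<title>Chapter 1</title>
</head>
<body>
<h2>Chapter 1</h2>
<!-- The French phrase is tagged so it is not spell-checked, hyphenated, or read aloud as English. -->
<p>She smiled and said <i lang="fr" xml:lang="fr">à bientôt</i> before leaving.</p>
</body>
</html>
```

Whether a given device actually switches its dictionary or text-to-speech voice on that inner lang depends on the device, but without the markup it has nothing to go on.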

Quote:

Originally Posted by Turtle91 (Post 4148654)
I would suggest that shows a complete lack of awareness about people with accessibility issues. I was there not too long ago, then I was forced to become aware… and now I care. Don’t feel lonely though - there are a lot of people that are unaware and don’t care.

The distinction is that the reading devices/apps can, and do, treat them (b/i, em/strong) differently. So, as a professional, it would be incumbent upon us to do things the right way, rather than not.

:thumbsup::thumbsup:

Sarmat89 08-23-2021 11:17 PM

Quote:

Originally Posted by DiapDealer (Post 4148563)
Hogwash.

Nope. A heading in fiction is a block of several paragraphs. A heading in HTML is a single paragraph. They just don't mix.

The only possible use is creating a fake H# element with a manually generated title. (Yes, in fiction, you cannot derive the TOC text from the header automatically).

Hitch 08-23-2021 11:21 PM

Quote:

Originally Posted by Sarmat89 (Post 4148735)
Nope. A heading in fiction is a block of several paragraphs. A heading in HTML is a single paragraph. They just don't mix.

That's an epigraph or epigram. And there's no law that says a heading in HTML has to be a single line or paragraph, either.

Quote:

The only possible use is creating a fake H# element with a manually generated title. (Yes, in fiction, you cannot derive the TOC text from the header automatically).

That's ridiculous. Of course you can. WHAT are you talking about? What is this about "a heading in fiction is a block of several paragraphs" stuff? What fiction are you talking about?

Hitch

Sarmat89 08-24-2021 12:00 AM

Quote:

Originally Posted by Hitch (Post 4148738)
there's no law that says a heading in HTML has to be a single line or paragraph, either.

Block model says so. H# elements can only contain runs, not paragraphs or other blocks.

A heading in fiction has several block parts forming an indivisible unit: the chapter text (without a final punctuation), like "Chapter VII", the chapter illustration, the chapter name (with the final punctuation in some languages), and the chapter synopsis (which can sometimes be present in the TOC). The TOC text is different from the actual header: it can be a simple list (like in The Lord of the Rings), or the numbering/name may be different, the punctuation will usually be present, and the chapter name punctuation will be absent.

Text editors or macro languages like LaTeX can handle that easily, but not HTML, a dumb language for technical documentation.

Tex2002ans 08-24-2021 01:00 AM

Quote:

Originally Posted by Sarmat89 (Post 4148746)
[...] Text editors or macro languages like LaTeX can handle that easily, but not HTML, a dumb language for technical documentation.

Sigil can generate different TOC text from the heading's title attribute:

Code:

<h2 title="II. The Storm">CHAPTER II<br/>THE STORM</h2>

Quote:

Originally Posted by Sarmat89 (Post 4148746)
[...] but not HTML, a dumb language for technical documentation.

Ahh yes, that "dumb language" that controls the entire interwebs... but not Fiction books though! :rofl:

Binchen 08-24-2021 02:44 AM

Quote:

Originally Posted by Tex2002ans (Post 4148695)
An image of a table is worst:

Code:

<img alt="" src="../Table1.png"/>

Adding alt text is better:

Code:

<img alt="Table 1" src="../Table1.png"/>

I agree on all points, but the example above is not a good one. The alt text should contain something meaningful for when the image is not displayed, which is the case with text-to-speech. Good software would read the alt text here, and "Table 1" is about as meaningless as nothing at all. At least with "Table 1" you can tell that something is there, but that's it.

You can see it especially well with initials, where the first letter of the chapter is represented by a picture. I have seen it many times: the alt text then reads "Letter A" (assuming the image shows the letter A). This is nonsense. The correct alt text is "A", not "Letter A" or "Initial A" or anything else. If I have the publisher's logo on the cover, then the correct alt text is "Random House", "Penguin Books" or whatever the publisher is called; neither the file name "publisher-logo.png" nor "logo" nor anything like that makes sense.

In your case, "The e-book creator did not care about disabled people" belongs in the title attribute, with further hints in the longdesc attribute (which may only contain a URI). E-readers certainly don't evaluate it; whether screen-reading software does, I don't know.

For descriptive content the alt attribute is the wrong place.
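
A small sketch of those two cases, as a fragment meant to sit inside the body of an XHTML file. The file names and the sample sentence are placeholders I made up:

```html
<!-- Drop-cap image: the alt text is the letter itself, so the word reads as "All". -->
<p><img alt="A" src="../Images/initial-a.png"/>ll happy families are alike.</p>

<!-- Publisher logo: the alt text is the publisher's name, not the file name. -->
<p><img alt="Penguin Books" src="../Images/publisher-logo.png"/></p>
```

Read aloud, the first paragraph should come out as a normal sentence, which is the whole point: the alt text stands in for the image, it doesn't describe it.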

Sarmat89 08-24-2021 02:52 AM

Quote:

Originally Posted by Tex2002ans (Post 4148750)
Sigil can generate different TOC text

The USER can generate different TOC text. H# elements do not help here.

Quote:

Ahh yes, that "dumb language" that controls the entire interwebs... but not Fiction books though!

This page, for example, contains exactly 0 H# elements, which says a lot about their usefulness.

Binchen 08-24-2021 03:49 AM

Quote:

Originally Posted by Sarmat89 (Post 4148762)
This page, for example, contains exactly 0 H# elements, which says a lot about their usefulness.

Mistakes made by others are no excuse for your own mistakes. Bad examples are not arguments for one's own opinion. References to the mistakes of others are distracting whataboutism.

Turtle91 08-24-2021 09:19 AM

Sarmat, I'm not sure if you honestly believe what you are saying, have little/no experience with ebooks, or are just yanking our chain???

I regularly use a heading tag for the "block groups" you are talking about:

Code:

h3 {font-weight:bold; text-align:center; font-size:1.3em; font-family:serif}
h3 img {display:block; margin:1em auto; width:50%; max-width:600px}
h3 span {display:block; font-variant:small-caps; font-size:1.1em}

<h3 title="1 - Having a Little Fun">Chapter 1
<img alt="Gryffindor Crest" src="../Images/ChHd.png"/>
<span>Having a Little Fun</span></h3>

That gives a title for the auto-generated ToC, a "Chapter #" on the first line, an image for a chapter header on the second line, and a chapter name/subtitle on the third line.

It allows proper hierarchy for the different sections of the book:

Code:

<h1> - Cover/Title
  <h2> - Part, Section, or Book name
      <h3> - Chapter, Appendix, or glossary
        <h4> - Subchapter, Appendix/Glossary sections, etc.

It also follows accessibility guidance...

For you to say that heading tags are not used, or are useless, is simply incorrect (hogwash). You may say that YOU don't use them but that is about all...

Hitch 08-24-2021 09:48 AM

Quote:

Originally Posted by Sarmat89 (Post 4148746)
Block model says so. H# elements can only contain runs, not paragraphs or other blocks.

A heading in fiction has several block parts forming an indivisible unit: the chapter text (without a final punctuation), like "Chapter VII", the chapter illustration, the chapter name (with the final punctuation in some languages), and the chapter synopsis (which can sometimes be present in the TOC). The TOC text is different from the actual header: it can be a simple list (like in The Lord of the Rings), or the numbering/name may be different, the punctuation will usually be present, and the chapter name punctuation will be absent.

Text editors or macro languages like LaTeX can handle that easily, but not HTML, a dumb language for technical documentation.

Well, it's nice to know that accessibility is a bridge too far for you.

YOU are deciding that "a heading in fiction consists of several block parts forming an indivisible unit." That's YOUR choice. That isn't a rule and it's not law and it's not anything but your opinion.

If you wish to hamper yourself thus, that's your problem. The rest of us manage to muddle along quite nicely with HTML, and other uses (like pretty much any word processor on the planet, which also wouldn't recognize 3-4-5 "block units" as "a" heading, either...)

Hitch

DiapDealer 08-24-2021 09:58 AM

Sarmat89 is quite often the sole voice of authority on whatever topic they deign to comment on.

Engagement is futile. "Hogwash" suffices. Let's move on.

Hitch 08-24-2021 12:10 PM

Quote:

Originally Posted by DiapDealer (Post 4148842)
Sarmat89 is quite often the sole voice of authority on whatever topic they deign to comment on.

Engagement is futile. "Hogwash" suffices. Let's move on.

Ah, okay. I don't believe that I'd previously noticed that. Thanks for the enlightenment. I was, for a short moment there, befuddled by the Proclamation From On High.

Hitch

Sarmat89 08-24-2021 01:03 PM

Quote:

Originally Posted by Hitch (Post 4148839)
YOU are deciding that "a heading in fiction consists of several block parts forming an indivisible unit." That's YOUR choice.

It is a universal printing tradition, found in every book (and ebook) out there.

Quote:

like pretty much any word processor on the planet, which also wouldn't recognize 3-4-5 "block units" as "a" heading

It doesn't have to: word processors are not limited to HTML headings, can use any sequence of styles, and have sophisticated means to generate a TOC.


All times are GMT -4.