But authors write the content, and most never edit HTML.
How do you know the author's semantic intent?
Authors put text in italics or bold for many reasons (some now obsolete), and style guides cover them. There are also two common ways each for an author to indicate italic and bold on a typewriter (from the Victorian era until authors could afford word processing in the 1970s, and still used today in plain-text editors).
Perhaps this is a heretical position, but almost no author will differentiate between <i> and <em>.
Reasons writers indicate or use italics (the plain-text markup is _ or / surrounding the word):
1. Obsolete: Italics for a foreign word.
2. Obsolete: thoughts. (Using quote marks for thoughts is against many style guides too)
3. Letters quoted
4. Obsolete: Quotes, now usually a more indented style
5. Telepathic conversation: Almost universal in SF & F for over 50 years.
6. Very rarely, just an emphatic word in dialogue; almost never as emphasis in narration.
So if you don't know the author's intent, <i> is mostly correct. In the thousands of printed novels I own, and the larger number I've read, I've hardly seen italics used for emphasis, and then only in dialogue. I'm sure anyone can cherry-pick examples.
Bold vs Strong
Again, no author considers this. From typewriter days to now, bold is indicated in plain text by *this is bold*, sometimes by _this_, or by underlining on a typewriter. I don't remember any font face called Strong.
Most style guides suggest boldface is used only for titles and headings. It has sometimes been used to indicate shouting, but that's very rare. ALL CAPS is now preferred for shouting, and not usually SMALL CAPS, as they are specialist.
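The plain-text conventions above can be turned into markup mechanically. Here is a minimal Python sketch, assuming one of the conventions mentioned (_word_ or /word/ for italics, *word* for bold). The function name plain_to_html is hypothetical, and the patterns are deliberately simple rather than a complete parser. It emits <i> and <b> because the markers say nothing about the author's semantic intent.

```python
import html
import re

# Assumed convention (one of several): _word_ or /word/ is italic, *word* is bold.
# A marker only counts if it hugs the text and is not inside a word,
# so snake_case and 1/2 are left alone.
ITALIC = re.compile(r"(?<!\w)([_/])(?=\S)(.+?)(?<=\S)\1(?!\w)")
BOLD = re.compile(r"(?<!\w)\*(?=\S)(.+?)(?<=\S)\*(?!\w)")


def plain_to_html(text: str) -> str:
    """Convert typewriter-style markers to <i> and <b> (hypothetical helper)."""
    # Escape first so literal <, > and & in the author's text stay literal.
    escaped = html.escape(text, quote=False)
    # Italics before bold: the "/" in a generated closing tag must never be
    # mistaken for an italic marker.
    with_italics = ITALIC.sub(r"<i>\2</i>", escaped)
    return BOLD.sub(r"<b>\1</b>", with_italics)


print(plain_to_html("He said _no_ and *meant* it /really/."))
```

Nothing in the input distinguishes emphasis from a foreign word or telepathic speech, so a converter like this has no basis for choosing <em> over <i>.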
People formulating the HTML specs came up with <em> and <strong> and deprecated <i> and <b>. Then they said you can use <i> and <b>, but the difference is semantic. Did they ask newspapers, publishers, authors, or the creators of style guides?
Underline
There is a <u> tag.
Originally underlining indicated bold or a heading on typewriters, or on CRTs with no bold (bold was shown as brighter text on older terminals). Since it is obsolete once you have bold in print or on a display, it was chosen to indicate a hyperlink even before HTML was invented. Hypertext and hyperlinks were proposed in the 1960s and were in use about 5 years before the 1989 HTML draft. Style guides are now against underlining and suggest using bold.
Strikeout
<s>, <strike> or <del>?
Originally the hyphen on a typewriter served as both minus and hyphen. Two hyphens stood for an en dash and three for an em dash. UK and US style guides differ: the UK uses a spaced – en dash – for an aside (when it's more appropriate than commas or true parentheses), while the US tends to use an unspaced em dash—for an aside—in text. Both use the em dash to indicate cut-off or interrupted speech:
“I don’t think—”
“No! You never do!”
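The typewriter substitution described above (two hyphens for an en dash, three for an em dash) is easy to sketch in Python. The function name typewriter_dashes is hypothetical, and leaving runs of four or more hyphens untouched is my assumption, since such runs are usually rules rather than dashes.

```python
import re

EN_DASH = "\u2013"
EM_DASH = "\u2014"


def typewriter_dashes(text: str) -> str:
    """Replace typewriter hyphen runs with real dashes (hypothetical helper)."""
    # Handle the longer run first so "---" is not read as "--" plus "-".
    # The lookarounds make sure the run is exactly three (or two) hyphens.
    text = re.sub(r"(?<!-)---(?!-)", EM_DASH, text)
    return re.sub(r"(?<!-)--(?!-)", EN_DASH, text)


print(typewriter_dashes("I don't think---"))
print(typewriter_dashes("pages 10--12, a well-known range"))
```

Whether the en dash is then spaced or the em dash unspaced remains a style-guide decision; the sketch only swaps the characters.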
Originally, struck-out text never appeared in published material; it was only markup on a typescript telling the typesetter, publisher, or copy typist to leave out (delete) the word. Style guides suggest it should be used in publishing only when an insert mimics typewritten text. It is sometimes used to show what the writer first thought of. Since the invention of correction systems on typewriters and on-screen editing, strikeout is used only decoratively, to show an abandoned word or idea.
Quote:
HTML <strike> Tag
Not Supported in HTML5.
The <strike> tag was used in HTML 4 to define strikethrough text.
What to use instead?
Example
Use the <del> tag to define deleted text:
<p>My favorite color is <del>blue</del> <ins>red</ins>!</p>
Example
Use the <s> tag to mark up text that is no longer correct:
<p><s>My car is blue.</s></p>
The distinction between <del> and <s> is thus nearly useless. Real authors, writers, and journalists don't usually edit HTML. Only extremely rarely does an author want a struck-out word, to indicate a change of mind.
The Real World
Most writing is done with a word processor. If the author chooses bold or italic for whatever reason, the automatic conversion to an ebook or web page will use <b> and <i>, and bold or italic font faces are rendered. Usually, though, bold is only part of a heading style.
Underline is applied automatically by the word processor to links, which in the HTML are plain links with no decoration style.
Strikeout is incredibly rare.
If an entire paragraph has a bold or italic style, the automatic conversion doesn't use <b> or <i> but font-weight: bold; and font-style: italic; in the CSS. If the bold or italic is only part of a normal paragraph, then <b> and <i> are used.
The <s> vs <strike> vs <del> question is moot. Whether it's an entire styled paragraph or a single word, only a <p style="whatever"> or <span style="something"> is used. The CSS then has text-decoration: line-through;
Underline generates text-decoration: underline; in the CSS, never any inline HTML tags; only a <p style="whatever"> or <span style="something"> is used, depending on whether it's a paragraph or a word.
<em>, <strong>, <u>, <s>, <strike> and <del> don't seem ever to be used in any automated conversion from a word processor to ebook or HTML if the conversion is using CSS.
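As a rough illustration of the behaviour described above, here is a toy Python model. It is not the code of Calibre or any other real converter; the Run class and convert_paragraph function are hypothetical names, and inline style attributes stand in for whatever classes a real tool would generate. Bold or italic on part of a paragraph becomes <b> or <i>, bold or italic on the whole paragraph becomes CSS, and underline or strikeout is always CSS.

```python
from dataclasses import dataclass
from html import escape


@dataclass
class Run:
    """A stretch of text with uniform formatting (hypothetical model)."""
    text: str
    bold: bool = False
    italic: bool = False
    underline: bool = False
    strike: bool = False


def _decoration(underline: bool, strike: bool) -> str:
    # Underline and strikeout only ever become a CSS declaration.
    values = [name for name, on in (("underline", underline),
                                    ("line-through", strike)) if on]
    return f"text-decoration: {' '.join(values)};" if values else ""


def convert_paragraph(runs: list[Run]) -> str:
    def whole(attr: str) -> bool:
        # True when every run in the paragraph carries the formatting.
        return bool(runs) and all(getattr(r, attr) for r in runs)

    p_bold, p_italic = whole("bold"), whole("italic")
    p_under, p_strike = whole("underline"), whole("strike")

    # Whole-paragraph formatting goes into the paragraph's CSS.
    p_css = []
    if p_bold:
        p_css.append("font-weight: bold;")
    if p_italic:
        p_css.append("font-style: italic;")
    if _decoration(p_under, p_strike):
        p_css.append(_decoration(p_under, p_strike))

    parts = []
    for r in runs:
        piece = escape(r.text, quote=False)
        # Partial bold/italic becomes <i> and <b>, never <em> or <strong>.
        if r.italic and not p_italic:
            piece = f"<i>{piece}</i>"
        if r.bold and not p_bold:
            piece = f"<b>{piece}</b>"
        span_css = _decoration(r.underline and not p_under,
                               r.strike and not p_strike)
        if span_css:
            piece = f'<span style="{span_css}">{piece}</span>'
        parts.append(piece)

    style = " ".join(p_css)
    open_tag = f'<p style="{style}">' if style else "<p>"
    return open_tag + "".join(parts) + "</p>"


print(convert_paragraph([Run("My car is "), Run("blue", strike=True), Run(".")]))
print(convert_paragraph([Run("Chapter One", bold=True)]))
```

Note that the model never has to choose between <s>, <strike> and <del>, because the strikeout only ever exists as a text-decoration value.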
So in the real world, for novelists, journalists, report writers, etc., the arguments about semantics and which tags to use are irrelevant.
Few people creating actual written content hand-craft HTML. It's produced by imports, a CMS, InDesign, Calibre, Sigil, etc., or by the modern equivalent of typesetters editing HTML & CSS. But how do they know the author's intention?
Most TTS is on phones or tablets. Is it going to make much difference if <i> vs <em> or <b> vs <strong> is used?
The <s> vs <del> argument seems irrelevant.
Underlines should only be automatic and for links. They originated due to lack of visible bold on a screen or typewriter.