06-21-2020, 09:31 PM   #96
Tex2002ans
Wizard
Posts: 2,306
Karma: 13057279
Join Date: Jul 2012
Device: Kobo Forma, Nook
Quote:
Originally Posted by Quoth
No, I do get what the difference between <i> and <em> is. I think I'd not use <em> much, but unless I can do wordprocessor source that causes <em>some text</em> to be created, there will only be <i>. Similarly with <b> and <strong>.
Word/LibreOffice as Input (Character Styles)

If you used a Character Style for emphasis, you might get something closer to my example above:

Code:
In <i>Book Title</i>, the character said: "Not in <span class="emphasis">my</span> house."
Not ideal, but a step in the right direction.

In Fiction, Styles are also important when marking inner thoughts or "telepathic speaking":

Code:
<span class="innerthought">Wow, I did <em>not</em> do good at all.</span>
Note: Italics within italics = not italic (Normal/Roman).

What usually happens is you get this in your Word->HTML:

Code:
<i>Wow, I did</i> not <i>do good at all.</i>
Ultimately, you would want to aim towards HTML like this:

Code:
<i class="innerthought">Wow, I did <em>not</em> do good at all.</i>
plus this tiny line in your CSS:

Code:
i em { font-style: normal; }
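If you're stuck with the split-up Word output, here's a rough Python sketch of stitching it back together. The function name and the regex are my own assumptions, and it only handles the simplest case (exactly one roman word sitting between two italic runs), so only point it at paragraphs you already know are inner thoughts and check the result by eye:

```python
import re

# Heuristic pattern: <i>A</i> word <i>B</i>
# Assumes a single non-italic word sits between the two italic runs.
SPLIT_ITALICS = re.compile(r"<i>([^<]*)</i>(\s+)(\w+)(\s+)<i>([^<]*)</i>")


def merge_inner_thought(html: str, css_class: str = "innerthought") -> str:
    """Rejoin italic runs that were split around one emphasized word."""
    return SPLIT_ITALICS.sub(
        rf'<i class="{css_class}">\1\2<em>\3</em>\4\5</i>', html
    )


print(merge_inner_thought("<i>Wow, I did</i> not <i>do good at all.</i>"))
```

Run blindly over a whole book, this would also mangle things like two book titles joined by "and", which is why it's a helper for marked-up passages and not a one-button fix.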
Quote:
Originally Posted by Quoth
In an absolute sense true. But if the aim is to automatically go from Wordprocessor source to ebook via a conversion tool (such as Calibre rather than Indesign or Sigil or Amazon something which are more like DTP for ebooks), then what you want is styles and characters that consistently convert to the same sensible XHTML+CSS.
With Save As, general conversion tools, or "automatic" one-button-press "solutions"... you're probably going to get anything italic:
  • turned into 100% <i> or 100% <em>
    • See many CMSes/tools.
  • output as unreadable-by-human names + different every book
    • <span class="CharOverride-2"> + <span class="calibre123">
      • (I'm looking at you, InDesign, Calibre, [...])
    • Also see hideous code like JSWolf's "Really Bad CSS" topic.
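
To get a feel for how bad a given file is, a quick Python sketch like this can count the machine-generated class names. The two patterns are assumptions based only on the two examples above (real tools produce other names too), and the function name is made up:

```python
import re
from collections import Counter

CLASS_ATTR = re.compile(r'class="([^"]*)"')
# Assumed patterns, taken only from the two example names in this post.
AUTO_NAME = re.compile(r"^(calibre\d+|CharOverride-\d+)$")


def auto_class_counts(html: str) -> dict:
    """Count how often each auto-generated-looking class name is used."""
    counts = Counter(
        name
        for value in CLASS_ATTR.findall(html)
        for name in value.split()
    )
    return {name: n for name, n in counts.items() if AUTO_NAME.match(name)}
```

Anything that shows up in that report is a class you'd have to decode by hand before you could tell <i> from <em>.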

If you use Styles though, you can take advantage of things like Style Mapping (see InDesign tutorial + help file showing it off). This allows you to directly say:

Change my "Heading-2" style -> <h2> and give it class="XYZ" in the EPUB.

Side Note: There are also tools that do this for Word->InDesign. I definitely wish more tools (especially Calibre) had Style Mapping. Then you could see a list and help nudge Word italics -> <i> + an "emphasis" Style -> <em>.
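
The idea behind Style Mapping fits in a few lines of Python. This is only a sketch of the concept, not how InDesign or any other tool implements it; the table contents and the render function are hypothetical:

```python
from html import escape

# Hypothetical mapping: Word style name -> (HTML tag, CSS class or None).
STYLE_MAP = {
    "Heading-2": ("h2", "XYZ"),
    "emphasis": ("em", None),
    "innerthought": ("i", "innerthought"),
}


def render(style: str, text: str) -> str:
    """Wrap text in the tag mapped to a style; unmapped styles fall back to a span."""
    tag, css_class = STYLE_MAP.get(style, ("span", style))
    attr = f' class="{escape(css_class, quote=True)}"' if css_class else ""
    return f"<{tag}{attr}>{escape(text)}</{tag}>"


print(render("Heading-2", "Chapter One"))
print(render("emphasis", "my"))
```

The point is that the mapping is a list you can read and edit, so the same Style comes out as the same HTML in every book.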

If you start getting into more specialized conversion tools:

Toxaris's EPUB Tools can export Word Styles -> HTML Classes. It also allows you to choose between <i> or <em> for italics on export. An option like this at least gives some wiggle room.

Mammoth can be used by those who come up with consistent Styling. This is a much more specialized workflow, but those who (professionally) create many documents can apply more rigid standards.

Quote:
Originally Posted by Quoth
Unless I outsource final ebook creation to someone else. Then I suppose I can use a character style for <em> and one for <strong> and tell the ebook making guru what those mean.
Again, this is why me+Hitch have been stressing Styles as the #1 thing authors can learn to do.

If you have a clean, well-maintained source document, everyone's life becomes easier/faster. It's at the very core of any future steps.

Quote:
Originally Posted by Hitch
AND, if that happens, if someone who actually uses Word correctly creates a named style, it will emerge from formatting as a p style or a span, and there endeth the problem.
Now there's the real unicorn. As you say, you could count on one hand the number of authors you've run across (at BookNook) actually using Styles properly.

Luckily, I've been having slightly better results with my "Styles training".

Quote:
Originally Posted by Quoth
Of course the Interrobang has never caught on. I've only seen it ever on the Internet in articles discussing it and in ONE book covering punctuation. I've not seen it in any ebook, or any printed novel ever.
Check out those articles+podcast I linked in the Reddit post. It's good stuff, good stuff.

Also, you probably wouldn't get along in Physics or Maths... there's a ton of weird, obscure punctuation and symbol usage there.

Quote:
Originally Posted by Hitch
Yes, exactly. Speaking as a formatter, how the hell would I know which emphasized "i" text from Word is meant to be italics, and which is meant to be "emphasized" as in em? Jesus wept!
Hmmm... this may be the one case where Non-Fiction seems much easier. Fiction... maybe not so much. :P

But recently I've been thinking of ways to mark things up, mostly by messing around with my "Non-Linear Editing" ways.

Quicker and More Accurate Tagging

For example, ripping every single <i> out and sorting into an alphabetical list:

Code:
<i>Enciclopedia Italiana</i>
<i>New York Times</i>
<i>Volksgemeinschaft</i>
<i>Wall Street Journal</i>
<i>Washington Post</i>
<i>individual</i>
<i>laissez-faire</i>
<i>negative</i>
At a glance, you can usually tell which ones are meant to be <i> (newspapers, book titles, foreign words/terms) and which ones are <em> (individual words).
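
A sorted list like the one above can be pulled with a few lines of Python. This is a minimal sketch, not the exact tooling I use: the function name is made up, and the regex assumes plain, non-nested <i> tags:

```python
import re

# Matches <i> or <i class="..."> but not <img>; assumes <i> tags are not nested.
ITALIC = re.compile(r"<i\b[^>]*>(.*?)</i>", re.DOTALL)


def sorted_italics(html: str) -> list[str]:
    """Return the distinct text of every <i>...</i> run, sorted."""
    return sorted(set(ITALIC.findall(html)))
```

Python's default string sort puts capitals before lowercase, which is handy here: titles and proper nouns bunch up at the top and the lowercase single words collect at the bottom.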

You could also make pretty decent assumptions like:
  • In Non-Fiction, emphasis is probably not going to be in a Bibliography/References page.
  • In Fiction, a single word is probably <em> + an entire sentence is more likely <i>.
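
The Fiction rule of thumb is simple enough to sketch in Python as a first-pass guess. The function name is hypothetical, and this is a starting point for human review, nothing more:

```python
def guess_tag(italic_text: str) -> str:
    """Fiction heuristic: a single word is probably <em>, anything longer <i>."""
    return "em" if len(italic_text.split()) == 1 else "i"


print(guess_tag("not"))
print(guess_tag("Wow, I did not do good at all."))
```

It will guess wrong on single foreign words (Volksgemeinschaft would come out as "em"), so in Non-Fiction you'd still want to eyeball the list.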

Real-Life Applications: Over the past two years, I've used sorted "italic lists" to catch hundreds of typos/inconsistencies.

In the latest journal I've been working on, I actually marked up proper HTML lang attributes, similar to the "find all foreign words" method I discussed last year in "Export list of words in spellcheck". Benefits were fantastic (multi-language spellchecking + far fewer red squigglies).
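
As a rough idea of what the lang markup step can look like in Python: once you have a hand-checked list of foreign terms, tagging them is mechanical. The lookup table and function name below are hypothetical, and the regex assumes plain <i> tags with no attributes:

```python
import re

# Hypothetical lookup built by hand from a sorted italic list:
# italic term -> language tag.
FOREIGN_TERMS = {
    "Volksgemeinschaft": "de",
    "laissez-faire": "fr",
    "Enciclopedia Italiana": "it",
}

ITALIC = re.compile(r"<i>([^<]*)</i>")


def add_lang(html: str) -> str:
    """Add a lang attribute to italic runs found in FOREIGN_TERMS; leave the rest alone."""
    def repl(match: re.Match) -> str:
        lang = FOREIGN_TERMS.get(match.group(1))
        if lang is None:
            return match.group(0)
        return f'<i lang="{lang}">{match.group(1)}</i>'

    return ITALIC.sub(repl, html)
```

Anything not in the table passes through untouched, so emphasis like <i>negative</i> stays as it was.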

And last year, I used a similar non-linear method to clean up an entire book's citations. The citations were a mix of many different Style Guides (think copied/pasted exactly as is out of dozens of history books). I pulled all the citations, then converted them into a giant spreadsheet (Author/Title/Year/[...]). I imported this into a Citation Management program, and was able to remove duplicates + re-export consistently-styled citations.
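
The duplicate-removal part is the easy bit to sketch in Python. This is not what the Citation Management program does internally; it's just an illustration of the idea, assuming each spreadsheet row is a dict with "author", "title", and "year" keys (hypothetical column names):

```python
def dedupe_citations(rows: list[dict]) -> list[dict]:
    """Drop rows whose author/title/year match after lowercasing and whitespace cleanup."""
    seen = set()
    unique = []
    for row in rows:
        # Normalize so "Title  One" and "title one" count as the same entry.
        key = tuple(
            " ".join(row[field].lower().split())
            for field in ("author", "title", "year")
        )
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique
```

Rows read with csv.DictReader from the exported spreadsheet would work as input. It only catches duplicates that differ in case or spacing; differently abbreviated author names or titles still need a human.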

Proper markup is key, since information gets ordered and formatted differently depending on whether it's a book, newspaper, journal article, etc.

... Further information will probably be supplied in future blog posts.

Quote:
Originally Posted by Hitch
And...in which fonts would we find that, she wondered‽ In my world, lads and ladies, if the character doesn't exist in fonts, it might as well be a Unicorn.
If it exists in Unicode and on Android, in the hands of billions of people, I'm for it. Who's with me‽

PS. A few months back I actually searched for the "middle finger" character across all of MobileRead... I was the only one who had ever used one. So just because a character's "never been seen" doesn't mean it shouldn't be used!

Last edited by Tex2002ans; 06-21-2020 at 11:52 PM.