Quote:
Originally Posted by lisashea
Thank you all so much for the suggestions! I will definitely install the plugin and work with that.
|
Quote:
Originally Posted by lisashea
I am a strong proponent of using styles and do use them for everything.
|
Absolutely fantastic news. Welcome to the 1%! (Hardly anyone even uses Styles.)
Quote:
Originally Posted by lisashea
In the instances I'm running up against, even with my styles, I'm not sure that there's any way for me to undo the way Word creates a filtered HTML file to have it stop doing those specific things.
|
Yep, like exaltedwombat said, that bad code is a Word "Filtered HTML" problem.
So your current method is this:
- DOCX -> Save As "Filtered HTML" -> Calibre (convert to EPUB)
What is more effective is going directly from DOCX to EPUB, skipping Word's crappy HTML code!
- - -
The tool I personally use all the time is:
Toxaris's "EPUB Tools"
It exports extremely clean HTML: basic <p>, <i>, <h1>, [...]:
Code:
<h2>The Beginning</h2>
<p>It was a dark and stormy night...</p>
So all that clear="all" and other Word junk won't even make it into your EPUB file!
You can even set EPUB Tools to carry over your Styles -> CSS.
So if your chapters use a special "chaptertitle" style, it'll appear in your EPUB as:
Code:
<h2 class="chaptertitle">The Beginning</h2>
Quote:
Originally Posted by lisashea
On the third set of challenges, the "Picture 1" and "Picture 2" and so on default tags for images, sure, I could go through every single image in the entire document and give them ALT settings. I'm just not up for that task. I have hundreds of books. Some of my books have hundreds of images for various reasons and it would take more time than it's worth. It's easier just to remove that space.
|
Well, definitely think about assigning proper alt text for current/future books.
It's extremely important for Accessibility, especially in ebooks (Text-to-Speech is just one use-case).
Proper Alt Tags
If the alt is useless gibberish:
Code:
<img alt="Picture 123" src="../Images/bullfrog.jpg" />
<img alt="img1234" src="../Images/img1234.jpg" />
it's better to strip it blank:
Code:
<img alt="" src="../Images/bullfrog.jpg" />
<img alt="" src="../Images/img1234.jpg" />
- - -
Side Note: A helpful Regex to do this in Sigil is:
Search: alt="[^"]+"
Replace: alt=""
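If you'd rather batch this outside Sigil, the same search/replace can be sketched in Python. This is only a minimal sketch, assuming the XHTML is already loaded as a string; the helper name blank_alt_text is my own invention, not part of Sigil or any library.

```python
import re

# Same pattern as the Sigil search above: alt="..." with at least one character inside.
ALT_PATTERN = re.compile(r'alt="[^"]+"')


def blank_alt_text(xhtml: str) -> str:
    """Replace every non-empty alt="..." with alt="" (hypothetical helper)."""
    return ALT_PATTERN.sub('alt=""', xhtml)


sample = '<img alt="Picture 123" src="../Images/bullfrog.jpg" />'
print(blank_alt_text(sample))
```

Careful: just like the Sigil regex, this blanks ALL alt text, including any good descriptions you already wrote. Only run it on books where every alt is gibberish.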
- - -
But it's even better to write useful text (and filenames!) in the first place:
Code:
<img alt="Bullfrog jumping out of a pond." src="../Images/jumping.bullfrog.jpg" />
<img alt="A beautiful lemon meringue pie with a cherry on top." src="../Images/lemon.meringue.pie.jpg" />
This means Text-to-Speech will actually tell a blind reader WHAT'S in the photo:
"A beautiful lemon meringue pie with a cherry on top."
Where the original version would tell them:
"img1234"
Creating Alt Text
I think the newest versions of Word 365 have also made it easier to assign alt text to your images:
And here is an accessibility site also explaining how/why to create good alt text:
Checking/Fixing Accessibility
Another fantastic Sigil plugin is Access-Aide, by KevinH.
This helps create more accessible books by:
- Doing a lot of boring gruntwork for you
- Listing all the alt tags in the book
- [...].
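To give a rough idea of what "listing all the alt tags" means, here is a small Python sketch using only the standard library. To be clear, this is NOT how Access-Aide works internally (I haven't checked its code); the names AltCollector and list_alt_text are hypothetical, and it assumes you feed it one XHTML file's text at a time.

```python
from html.parser import HTMLParser


class AltCollector(HTMLParser):
    """Collect (src, alt) pairs for every <img> tag (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.images = []

    def handle_starttag(self, tag, attrs):
        # Self-closing <img ... /> tags also arrive here by default.
        if tag == "img":
            attr_map = dict(attrs)
            # alt comes back as None when the attribute is missing entirely.
            self.images.append((attr_map.get("src"), attr_map.get("alt")))


def list_alt_text(xhtml: str):
    """Return a list of (src, alt) tuples found in the given markup."""
    parser = AltCollector()
    parser.feed(xhtml)
    parser.close()
    return parser.images


page = '<p><img alt="" src="../Images/bullfrog.jpg" /><img src="../Images/img1234.jpg" /></p>'
for src, alt in list_alt_text(page):
    print(src, repr(alt))
```

An empty string means the alt was deliberately blanked; None means there is no alt attribute at all, which is the case you most want to catch.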
- - -
I've also written extensively about "Accessibility in ebooks" over the years.
Here's one example from 2018 where I explained why it's important to mark the book's language properly + create good <title>s in your HTML:
Post #2+ in "Two Questions"
Sure, sure, with physical/printed books: "Who cares if my English book is accidentally 'French', nobody will know!"
But open that ebook on your phone, put it down while you're cooking, and all of a sudden Text-to-Speech is speaking everything with funny accents!
Then your hands are full of flour, you're trying to wash them as quickly as possible to turn the thing off... you turn around, and there's crème brûlée and croissants exploding out of the oven!
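If you want to catch the "accidentally French" problem before your readers do, a quick check can be sketched in Python. This is a rough sketch under some assumptions: it only looks at the lang / xml:lang attribute on the <html> element and at the <title> text of a single file, and the names LangTitleChecker and check_page are hypothetical. (EPUBs also declare the language in the OPF metadata, which this doesn't look at.)

```python
from html.parser import HTMLParser


class LangTitleChecker(HTMLParser):
    """Record the <html> language and the <title> text (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.lang = None
        self.title = ""
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        if tag == "html":
            # Prefer xml:lang, fall back to plain lang.
            self.lang = attr_map.get("xml:lang") or attr_map.get("lang")
        elif tag == "title":
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def check_page(xhtml: str, expected_lang: str = "en"):
    """Return a list of human-readable problems; an empty list means OK."""
    checker = LangTitleChecker()
    checker.feed(xhtml)
    checker.close()
    problems = []
    if checker.lang is None:
        problems.append("no language set on <html>")
    elif not checker.lang.lower().startswith(expected_lang.lower()):
        problems.append(f"language is {checker.lang!r}, expected {expected_lang!r}")
    if not checker.title.strip():
        problems.append("empty or missing <title>")
    return problems


bad = '<html lang="fr"><head><title></title></head><body></body></html>'
print(check_page(bad))
```

The prefix match means "en-US" and "en-GB" both pass when you expect "en". That is a deliberate simplification, not a full language-tag comparison.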