MobileRead Forums - View Single Post

Mister L · 06-10-2020, 11:47 PM

Quote:

Originally Posted by jackie_w

Which programs/apps are you referring to with "all of them worth mentioning"?

I've been using TTS on Windows and Android for several years. As far as I can tell none of the TTS progs/apps pay any attention to which tags are used (<h1>-<h6>, etc.) in the epub HTML.

What little "context intelligence" does exist in TTS seems to me to be dependent on the Voice you use to do the speaking and I'm not aware that any of the Voices I've used changed depending on HTML tag.

If you know different, I'm all ears, because I find the lack of progress over the last decade in TTS for Joe Public's own ebooks to be depressing. Too much money to be made selling Audible subscriptions using real voice artistes.

I see no evidence that your average TTS app for ebooks doesn't do exactly that, i.e. convert to txt before speaking.

I have little to no experience with general-public TTS apps; I'm thinking more about the apps made specifically for blind people who cannot use visual interfaces (JAWS, for example). I'm not an expert by any means, but I've been learning more about accessibility lately. From what I understand, those applications modify the audio presentation of the text (not necessarily with inflexion, but sometimes by announcing "title" or similar before reading the text) and, possibly more importantly for blind users, they rely on the tags for navigation when reading HTML. For instance, if a file has several levels of headings and the HTML is semantically formatted, you can jump from heading to heading to find the section you're interested in, the same way that a sighted reader can skim a document looking for larger, bold, centred text to get an overview. You can't do that if it's only p tags styled with CSS. Here is some information: https://www.boia.org/blog/how-does-j...read-web-pages
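To make the navigation point concrete, here is a rough Python sketch of the kind of heading list a tool can build when it only looks at tag names. It is not how JAWS or any real screen reader is implemented; the class name `HeadingOutline`, the function `heading_outline`, and the CSS class names in the sample markup are all made up for illustration.

```python
from html.parser import HTMLParser

HEADING_TAGS = {"h1", "h2", "h3", "h4", "h5", "h6"}


class HeadingOutline(HTMLParser):
    """Collect (level, text) for every h1-h6 element; CSS classes are ignored."""

    def __init__(self):
        super().__init__()
        self.outline = []    # list of (level, heading text)
        self._level = None   # level of the heading currently open, if any
        self._text = []      # text fragments seen inside that heading

    def handle_starttag(self, tag, attrs):
        if tag in HEADING_TAGS:
            self._level = int(tag[1])
            self._text = []

    def handle_data(self, data):
        if self._level is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag in HEADING_TAGS and self._level is not None:
            self.outline.append((self._level, "".join(self._text).strip()))
            self._level = None


def heading_outline(html):
    """Return the list of headings a tag-based tool could jump between."""
    parser = HeadingOutline()
    parser.feed(html)
    parser.close()
    return parser.outline


# Semantic markup: the structure is in the tags themselves.
semantic = """
<h1>Part One</h1>
<p>Some text.</p>
<h2>Chapter 1</h2>
<p>More text.</p>
"""

# Same visual result (with the right stylesheet), but only styled p tags.
styled_only = """
<p class="title-big">Part One</p>
<p>Some text.</p>
<p class="title-medium">Chapter 1</p>
<p>More text.</p>
"""

print(heading_outline(semantic))     # [(1, 'Part One'), (2, 'Chapter 1')]
print(heading_outline(styled_only))  # []
```

The second file may look identical on screen, but a tool that navigates by tags finds nothing to jump to in it.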

I've also encountered some YouTube "news" videos (no examples come to mind, unfortunately) where the voice-over is clearly done by a machine but does a fairly decent job of modulating inflexions to convey meaning (for a machine; I don't expect TTS to be nearly as subtle or expressive as a human). I have no idea which apps are used for those, though, and I don't know what text format they read.

Quote:

Originally Posted by jackie_w

My original point is more relevant when trying to detect a voice difference when text is wrapped in <i>, <em>, <b>, <strong> tags. I can't hear any difference. Perhaps this comment would have been better suited to that other recent thread discussing the merits of if/when you should use <em> rather than <i>, <strong> rather than <b>.

In those specific cases I would not necessarily expect an audible difference between em and i or between strong and b, but there probably would be one between em / i and a styled span, or between strong / b and a styled span, because the spans are meaningless to the app. Or do you mean there is no difference between plain text and text with em / strong tags? In that case I guess it's just down to how well the app can synthesize speech; I really don't know how well emphasis is handled by speech synthesizers. I don't know of any specific TTS apps which are better than others, but if you use Twitter you could ask on the #eprdctn discussion; there are people there who can certainly tell you whether any stand out amongst the general-public apps. As I said, I am approaching the question more from the perspective of accessibility for readers with visual impairments.

Quote:

Originally Posted by JSWolf

I did your experiment and I was correct. Removing the CSS makes it unreadable. Why would you want to remove the CSS anyway?

Did you do the experiment fully, comparing a book *with* semantic HTML to a book with just styled p tags? If you did, then you saw the dramatic difference between the two, and you should hopefully understand my point now. If you didn't, and only looked at a book with styled p tags, then you only confirmed that that particular book was badly made. On the other hand, if by "unreadable" you simply mean "not pretty in my opinion", then we are talking about two different things, and I would disagree that "not aesthetically pleasing" is the same as "unreadable".

There are plenty of reasons to disable CSS if it doesn't work for your particular needs / preferences. I have personally disabled the CSS (until I could fix the file properly) in books where the side margins were defined in ems, for example, because they became too large and didn't leave enough room for the text. You might also want to disable the CSS if the person who made the book used low-contrast colours on the text and you are reading on a black-and-white device, or are colour blind and cannot see those colours. Maybe the person who made the book chose unreadable (or just really ugly) fonts, or made the headings so much bigger than the body text that they won't fit on the screen, or put all the notes or captions in 0.6em size... those are all examples I've personally encountered in various commercial ebooks. Like I said, there are plenty of reasons, and many that would never occur to me but are dealbreakers for someone else.
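If your reading app has no switch for turning off publisher styles, the experiment can also be run on the file contents themselves. What follows is only a minimal Python sketch of that idea, not a recommended tool: `CssStripper` and `strip_css` are names invented here, and it assumes you have already pulled an (X)HTML file out of the epub as a string. It drops stylesheet links, style blocks and inline style attributes, and leaves everything else (including class attributes, which do nothing without the stylesheet) as it was. It does not handle a stylesheet link written with a separate closing tag.

```python
from html import escape
from html.parser import HTMLParser


class CssStripper(HTMLParser):
    """Re-emit markup, minus stylesheet links, <style> blocks and style attributes."""

    def __init__(self):
        # convert_charrefs=False keeps entities such as &amp; as they were written
        super().__init__(convert_charrefs=False)
        self.out = []
        self._in_style = False

    @staticmethod
    def _is_stylesheet_link(tag, attrs):
        rel = (dict(attrs).get("rel") or "").lower().split()
        return tag == "link" and "stylesheet" in rel

    def _emit_tag(self, tag, attrs, closing=""):
        parts = [tag]
        for name, value in attrs:
            if name == "style":
                continue  # drop inline styling
            parts.append(name if value is None
                         else f'{name}="{escape(value, quote=True)}"')
        self.out.append("<" + " ".join(parts) + closing + ">")

    def handle_starttag(self, tag, attrs):
        if tag == "style":
            self._in_style = True
        elif not self._is_stylesheet_link(tag, attrs):
            self._emit_tag(tag, attrs)

    def handle_startendtag(self, tag, attrs):
        if tag != "style" and not self._is_stylesheet_link(tag, attrs):
            self._emit_tag(tag, attrs, closing=" /")

    def handle_endtag(self, tag):
        if tag == "style":
            self._in_style = False
        else:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if not self._in_style:  # skip the CSS rules inside <style>
            self.out.append(data)

    def handle_entityref(self, name):
        self.out.append(f"&{name};")

    def handle_charref(self, name):
        self.out.append(f"&#{name};")

    def handle_comment(self, data):
        self.out.append(f"<!--{data}-->")

    def handle_decl(self, decl):
        self.out.append(f"<!{decl}>")

    def handle_pi(self, data):
        self.out.append(f"<?{data}>")


def strip_css(html):
    """Return the markup with all CSS hooks removed (hypothetical helper)."""
    parser = CssStripper()
    parser.feed(html)
    parser.close()
    return "".join(parser.out)


page = (
    '<html><head><link rel="stylesheet" href="book.css"/>'
    '<style>p { margin: 0 5em; }</style></head>'
    '<body><h1 style="color: #ccc">Title</h1>'
    '<p class="note">Tom &amp; Jerry</p></body></html>'
)
print(strip_css(page))
```

In a book with semantic HTML, the h1 above still shows up as a heading with the reader's default styling. In a book built from styled p tags, everything collapses into identical paragraphs, which is the difference the experiment is meant to show.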