MobileRead Forums - View Single Post - Understandability Text-to-speech

ezdiy · 11-30-2019, 10:19 PM

Quote:

Originally Posted by Tarana

I use the text-to-speech both with my Kindle Keyboards and Alexa. Takes about 3 chapters to get into the cadence, but probably 2-3 books before it took no more effort to listen than with a live speaker. The text-to-speech on the Echo is better than what is on the Fire (which is a marked improvement over the Kindle Keyboard). It may also depend on what you listen to. Fantasy doesn't work so well due to all the weird names. Murder mysteries and cozies work pretty well.

If I remember correctly, the Echo now uses Polly. It's not available on Kindles because part of the synthesis runs on Amazon's servers, and a Kindle is offline most of the time to save battery. Note that Polly is not particularly suitable for books: it is mainly designed for "robot newscaster" use, where the text itself is machine-produced, including the SSML tags for prosody, breaths, etc.
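To make the "machine-produced text" point concrete, here is a rough sketch of how a program might emit its own SSML, with the prosody and pauses decided by code rather than by a human narrator. The function names and the sample sentences are made up for illustration; only the `<speak>`, `<prosody>` and `<break>` tags come from standard SSML, and nothing here actually calls Polly.

```python
from xml.sax.saxutils import escape, quoteattr


def sentence_to_ssml(text, rate="medium", pitch="medium", pause_ms=300):
    """Wrap one sentence in a prosody tag and follow it with a pause.

    Hypothetical helper: the generating program, not the TTS engine,
    chooses the rate, pitch and pause length.
    """
    return (
        f"<prosody rate={quoteattr(rate)} pitch={quoteattr(pitch)}>"
        f"{escape(text)}</prosody>"
        f'<break time="{int(pause_ms)}ms"/>'
    )


def build_ssml(sentences):
    """Build one SSML document from (text, rate, pitch, pause_ms) tuples."""
    body = "".join(sentence_to_ssml(*s) for s in sentences)
    return f"<speak>{body}</speak>"


# Invented sample input, in the "robot newscaster" style.
ssml = build_ssml([
    ("Good evening.", "slow", "low", 500),
    ("Markets closed higher today.", "medium", "medium", 300),
])
print(ssml)
```

A novel has none of this markup, so someone (or something) would have to guess it for every sentence, which is why this style of engine is a poor fit for books.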

Tacotron, on the other hand, is a "black box" algorithm. For it to read a certain genre well, you feed it audiobooks as source material, and it learns prosody on its own. Even when fed a neutral, generalist corpus devoid of the "personality" typical of audiobook performers, the results are extremely lifelike - https://google.github.io/tacotron/pu...ion/index.html