MobileRead Forums - View Single Post - Understandability Text-to-speech

ezdiy · 11-30-2019, 03:59 PM

Your complaint is EXTREMELY COMMON with PRIMITIVE synthesis like IVONA. Yes, it's the SAME GIRL who voices Alexa. She sounds SO PRETTY, but there is BAD NEWS. It turns out Ivona is CLUELESS, because she never understood PROSODY. Probably because it was never MARKED EXPLICITLY in books, the way I'm doing it RIGHT NOW.
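To make "marking prosody explicitly" concrete, here is a minimal sketch that turns capitalised emphasis like the above into SSML, the W3C speech markup that engines such as Polly accept. The function name `caps_to_ssml` and the all-caps heuristic are my own assumptions for illustration; whether any PocketBook voice understands SSML is not something I know.

```python
import re
from xml.sax.saxutils import escape

# Heuristic (an assumption): a run of one or more words made of two or more
# capital letters counts as "shouted" emphasis. Acronyms such as TTS match too.
SHOUTED = re.compile(r"\b[A-Z]{2,}(?:\s+[A-Z]{2,})*\b")


def caps_to_ssml(text: str) -> str:
    """Wrap ALL-CAPS runs in SSML <emphasis> tags and lowercase them."""
    parts = []
    pos = 0
    for match in SHOUTED.finditer(text):
        # Plain text between emphasised runs must be XML-escaped.
        parts.append(escape(text[pos:match.start()]))
        parts.append(
            '<emphasis level="strong">' + match.group(0).lower() + "</emphasis>"
        )
        pos = match.end()
    parts.append(escape(text[pos:]))
    return "<speak>" + "".join(parts) + "</speak>"


if __name__ == "__main__":
    print(caps_to_ssml("it was never MARKED EXPLICITLY in books"))
```

A real book would need a smarter rule than "all caps means stress", since the lowercasing step also mangles genuine acronyms and names like IVONA.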

There are far cleverer TTS systems; recent ones include Polly and Tacotron. Tacotron has open-source implementations, including (reasonably useful) pretrained models; I think Polly is cloud-only or something. TTS on PocketBook is pluggable: each installed voice provides its own libttsengine.so exposing the ABI declared in https://github.com/blchinezu/pocketb...de/ttsengine.h

The reader then calls into that library when reading with that installed voice. If you were to go and implement a state-of-the-art TTS like Tacotron, you'd need to write this wrapper library to glue it together. Currently, the WaveNet synthesizer is research grade (you need to run the whole of TensorFlow to evaluate the model), so I'm not sure the PB would have enough horsepower to run it.
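Roughly, the glue layer has this shape. This is only a sketch of the idea: the real interface is the C ABI in ttsengine.h, which I am not reproducing here, so every name below (`Synthesizer`, `ToneBackend`, `EngineWrapper`, `speak`) and the sample rate are hypothetical. A tone generator stands in for the actual model.

```python
import math
import re
import struct
from typing import Protocol

SAMPLE_RATE = 16000  # Hz; an assumed value, not taken from ttsengine.h


class Synthesizer(Protocol):
    """Anything that turns text into 16-bit little-endian mono PCM."""

    def synthesize(self, text: str) -> bytes: ...


class ToneBackend:
    """Stand-in for a real model: 50 ms of a 440 Hz tone per word."""

    def synthesize(self, text: str) -> bytes:
        n_samples = (SAMPLE_RATE // 20) * len(text.split())
        return b"".join(
            struct.pack(
                "<h", int(12000 * math.sin(2 * math.pi * 440 * i / SAMPLE_RATE))
            )
            for i in range(n_samples)
        )


class EngineWrapper:
    """Hypothetical glue: the reader hands over text and gets PCM back."""

    def __init__(self, backend: Synthesizer) -> None:
        self.backend = backend

    def speak(self, text: str) -> bytes:
        # Feed the backend one sentence at a time so each call stays small.
        sentences = re.split(r"(?<=[.!?])\s+", text.strip())
        return b"".join(self.backend.synthesize(s) for s in sentences if s)


if __name__ == "__main__":
    pcm = EngineWrapper(ToneBackend()).speak("Hello there. Bye.")
    print(len(pcm), "bytes of PCM")
```

Swapping `ToneBackend` for something that actually evaluates a Tacotron model is where the horsepower question comes in; the wrapper itself is the easy part.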
