Old 12-01-2019, 10:23 AM   #5
Markismus
Guru
Posts: 897
Karma: 149877
Join Date: Jul 2013
Location: Netherlands
Device: Cracked HiSenseA5ProCC, Cracked OnyxNotePro, Note5, Kobo Glo, Aura
@ezdiy Listening to the audio samples and especially the failures of Tacotron2, I do realize that there is rather a lot of room for improvement!

It seems NVIDIA published a Tacotron2 version without WaveNet. Would it be possible to couple it to a less computationally intensive synthesizer? They apparently have tensor cores dedicated to their WaveGlow synthesizer, so it seems infeasible to try to implement that on the PocketBook. Another possibility could be Mamah's implementation.

What about Polly? It seems Amazon asks for a subscription fee to use that. Are there ways around that? Or alternatives? How about using your own NAS as a server for the sound processing?

@Tarana Good to hear that there is a reasonably small learning curve. Too bad fantasy is harder. I already have problems with understanding names in real life (no context, just an unintelligible sound), so I'll probably never understand the TTS system.

Last edited by Markismus; 12-01-2019 at 03:23 PM.