No idea about the commercial side. As mentioned in Post #6, there is the yearly Interspeech conference.
That's where a lot of the bleeding-edge audio generation research gets discussed.
And so much of the higher-quality TTS has shifted towards cloud-based services, which then charge users per word.
(I believe tools like Balabolka exploit the free "demo" sections on Amazon Polly [IBM, Microsoft, etc.], by sending small snippets of text. No idea if you get rate-limited or what when feeding entire books in there. Usually those demos limit you to a few hundred characters at a time.)
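For what it's worth, the "small snippets" part is easy to picture. Below is a rough Python sketch of how a long text could be cut into pieces that stay under a character limit, breaking at sentence ends where possible. This is only my guess at the general idea, not how Balabolka or any particular service actually does it: the function name `chunk_text` and the default of 300 characters are made up for illustration, and nothing here talks to any TTS service.

```python
import re


def chunk_text(text, limit=300):
    """Split text into chunks of at most `limit` characters.

    Hypothetical helper: prefers sentence boundaries, and falls back to
    word boundaries (or a hard cut) for sentences longer than the limit.
    """
    # Split after ., ! or ? when followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for sentence in sentences:
        # A single sentence longer than the limit gets cut on its own.
        while len(sentence) > limit:
            cut = sentence.rfind(" ", 0, limit + 1)
            if cut <= 0:
                cut = limit  # no usable space: hard cut
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:cut].rstrip())
            sentence = sentence[cut:].lstrip()
        if not sentence:
            continue
        candidate = f"{current} {sentence}" if current else sentence
        if len(candidate) <= limit:
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks


if __name__ == "__main__":
    sample = "One two three. Four five six! Seven eight nine?"
    for piece in chunk_text(sample, limit=20):
        print(len(piece), piece)
```

Whether a demo page would accept a whole book's worth of such pieces is exactly the part I don't know.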