Thread: Audiobooks & AI
Old 04-25-2025, 07:41 AM   #11
salamanderjuice
Guru
Posts: 981
Karma: 13558066
Join Date: Jul 2017
Device: Boox Nova 2
Quote:
Originally Posted by Quoth
The "modern stuff" is not really a lot better than the best 20 to 30 years ago. Sometimes a little better. Degrades quickly if not standard American English.

The good TTS engine was bundled by Huawei. An SCL-L01 apparently for the Polish market but sold as NOS in Ireland. Pop-in battery, 3.5mm jack socket, SD-Card & SIM slots accessible when battery (cell) popped out. The 2200 mAh cell dated Dec. 2016 (in ISO format). 720 x 1280 pixels. Android 5.1.1.

The "flaw" of the mid 1980s TTS I had was it worked best with a custom text file. You could spell phonetically and also add voice modifiers.


I wonder what exactly they mean by AI? Just more real speech sampled? I've an early IC that works with ASCII (and commands) on a simple micro-controller. It's certainly poor, but with a suitable text file not much worse than the DXG or Kindle gen3 Keyboard, or USB audio stuck on a Paperwhite 3. There were better PC TTS on XP 8 years before DXG. XP was decent by 2002. About the same time as Apple Mac OS 9 was replaced by much better OS X (based on NeXTSTEP, based on BSD).

The current Google "AI" effort is only better than PW3 TTS when using standard USA English texts. Picking a different voice doesn't fix non-USA English.
It's AI because they use neural networks rather than just stringing together phonemes like past approaches (e.g., https://arxiv.org/pdf/1809.08895).

And it's a hard disagree from me. The old TTS of 20-30 years ago is awful in comparison. Listen to the Microsoft SAM SAPI5 example on this Wikipedia page: https://en.wikipedia.org/wiki/Micros...-speech_voices. That's what XP was doing. It's not good compared to the more modern Google Android TTS, and way, way worse than the "AI" approaches.