Thread: Audiobooks & AI
05-02-2025, 05:54 PM   #33
DNSB
Bibliophagist
Posts: 46,473
Karma: 169115146
Join Date: Jul 2010
Location: Vancouver
Device: Kobo Sage, Libra Colour, Lenovo M8 FHD, Paperwhite 4, Tolino epos
Quote:
Originally Posted by salamanderjuice
Code:
description = "Laura clamly reads a book to her child. The recording is of very high quality, with the speaker's voice sounding clear and very close up."
The description is specifying a specific voice and not much else.
How do you clamly read?

Quote:
Originally Posted by salamanderjuice
I've heard much worse from real people.
And so have I. OTOH, professional voice actors definitely do a better job compared to AI.

One author I know looked into using AI for audiobooks as a cheaper alternative to hiring professionals. What they found was that getting a decent result from AI pretty much required rewriting the book to add the pitch, tone, pace, etc. indicators that a voice actor does not need. Then there was the issue of getting invented words pronounced correctly. Human narrators were simply told the pronunciation. For AI, the entire book was converted to IPA. Most of that work was automated, with about 5% of the words needing manual conversion (basically, look for words with * around them, which marked them as not found in the dictionary). That turned out to be a small group of distinct words, mostly character and place names, so the manual work wasn't that bad.
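For anyone curious what that automated pass might look like, here is a rough Python sketch. It is not the author's actual tooling: the tiny `IPA_DICT` lexicon and the `to_ipa` helper are made up for illustration, and a real run would load a full pronunciation dictionary.

```python
import re

# Hypothetical mini lexicon for illustration only; a real run would load
# a full pronunciation dictionary instead.
IPA_DICT = {
    "the": "ðə",
    "current": "ˈkʌrᵊnt",
    "state": "steɪt",
    "of": "ɒv",
}

WORD_RE = re.compile(r"[A-Za-z']+")


def to_ipa(text, lexicon):
    """Swap known words for their IPA form and wrap unknown words in asterisks.

    Returns the converted text plus a sorted list of the words that were
    not found, so they can be transcribed by hand.
    """
    missing = set()

    def convert(match):
        word = match.group(0)
        ipa = lexicon.get(word.lower())
        if ipa is None:
            missing.add(word)
            return f"*{word}*"
        return ipa

    return WORD_RE.sub(convert, text), sorted(missing)


converted, missing = to_ipa("The current state of Zorblax", IPA_DICT)
print(converted)
print(missing)
```

The list of missing words is the useful part: invented character and place names repeat a lot, so each one only has to be transcribed by hand once and added to the lexicon.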

They also looked at using SSML and pronunciation lexicons, but that was a lot more work since it required multiple spans and you need to add the IPA bits to the header, though it did give an ebook that could still be read in English.

<p><span ssml:alphabet="ipa" ssml:ph="ðə ˈkʌrᵊnt steɪt ɒv ði ɑːt ɪz nɒt ʌp tuː ðæt ˈlɛvᵊl ɒv səˌfɪstɪˈkeɪʃᵊn.">The current state of the art is not up to that level of sophistication.</span></p>

A sample paragraph using SSML and IPA.
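A span like that could be generated rather than typed by hand. The sketch below assumes the EPUB 3 attribute names `ssml:alphabet` and `ssml:ph`; the `ssml_span` helper is hypothetical, and the XHTML file would also need the `ssml` prefix declared on its root element (`xmlns:ssml="http://www.w3.org/2001/10/synthesis"`).

```python
from xml.sax.saxutils import escape, quoteattr


def ssml_span(text, ipa):
    """Wrap readable text in a span carrying its IPA pronunciation.

    quoteattr() quotes and escapes the attribute value, and escape()
    protects &, < and > in the visible text.
    """
    return f'<span ssml:alphabet="ipa" ssml:ph={quoteattr(ipa)}>{escape(text)}</span>'


print(ssml_span("art", "ɑːt"))
```

The visible text stays plain English, which is why this route gives an ebook people can still read, at the cost of one span per annotated word or phrase.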