Technology
An artificial intelligence trained on YouTube and podcast recordings generates remarkably natural-sounding speech from text prompts
By Alex Wilkins
9 March 2023
Generating speech with different rhythms and pauses makes it sound more human-like, according to an assessment of an artificial intelligence trained on speech taken from YouTube and podcasts.
Most artificial intelligence text-to-speech systems are trained on data sets of acted speech, which can make the output sound stilted and one-dimensional. More natural speech often displays a wide range of rhythms and patterns to convey different meanings and emotions.
Now, Alexander Rudnicky at Carnegie Mellon University in Pittsburgh, Pennsylvania, …