AI trained on YouTube and podcasts speaks with ums and ahs

An artificial intelligence that has been trained on YouTube and podcast recordings generates speech from text prompts that sounds remarkably natural

By Alex Wilkins

9 March 2023

An AI can generate more natural-sounding synthetic speech by including pauses

Shutterstock/PrinceOfLove

Generating speech with different rhythms and pauses makes it sound more human-like, according to an assessment of an artificial intelligence trained on speech taken from YouTube and podcasts.

Most artificial intelligence text-to-speech systems are trained on data sets of acted speech, which can lead to the output sounding stilted and one-dimensional. More natural speech often displays a wide range of rhythms and patterns to convey different meanings and emotions.

Now, Alexander Rudnicky at Carnegie Mellon University in Pittsburgh, Pennsylvania, …
