ChatGPT seems to be trained on copyrighted books like Harry Potter



A test to see whether ChatGPT has memorised the contents of copyrighted material suggests it was trained on passages from Harry Potter, Game of Thrones and many other novels

By Chris Stokel-Walker

Harry Potter and the Philosopher’s Stone was published in 1997

Zety Akhzar/Shutterstock

ChatGPT and its successor GPT-4 appear to have memorised details from vast numbers of copyrighted books, raising questions about the legality of how these large language models (LLMs) are created.

Both artificial intelligences were developed by private firm OpenAI and trained on huge amounts of data, but exactly which texts make up this training data is unknown. To find out more, David Bamman at the University of California, Berkeley, and his colleagues looked at whether the AIs were able to …

