How ChatGPT and Other LLMs Work—and Where They Could Go Next

1 year ago 178

AI-powered chatbots such as ChatGPT and Google Bard are certainly having a moment—the next generation of conversational software tools promise to do everything from taking over our web searches to producing an endless supply of creative literature to remembering all the world's knowledge so we don't have to.

ChatGPT, Google Bard, and other bots like them, are examples of large language models, or LLMs, and it's worth digging into how they work. It means you'll be able to better make use of them, and have a better appreciation of what they're good at (and what they really shouldn't be trusted with).

Like a lot of artificial intelligence systems—like the ones designed to recognize your voice or generate cat pictures—LLMs are trained on huge amounts of data. The companies behind them have been rather circumspect when it comes to revealing where exactly that data comes from, but there are certain clues we can look at.

For example, the research paper introducing the LaMDA (Language Model for Dialogue Applications) model, which Bard is built on, mentions Wikipedia, “public forums,” and “code documents from sites related to programming like Q&A sites, tutorials, etc.” Meanwhile, Reddit wants to start charging for access to its 18 years of text conversations, and StackOverflow just announced plans to start charging as well. The implication here is that LLMs have been making extensive use of both sites up until this point as sources, entirely for free and on the backs of the people who built and used those resources. It's clear that a lot of what's publicly available on the web has been scraped and analyzed by LLMs.

LLMs use a combination of machine learning and human input.

OpenAI via David Nield

All of this text data, wherever it comes from, is processed through a neural network, a commonly used type of AI engine made up of multiple nodes and layers. These networks continually adjust the way they interpret and make sense of data based on a host of factors, including the results of previous trial and error. Most LLMs use a specific neural network architecture called a transformer, which has some tricks particularly suited to language processing. (That GPT after Chat stands for Generative Pretrained Transformer.)

Specifically, a transformer can read vast amounts of text, spot patterns in how words and phrases relate to each other, and then make predictions about what words should come next. You may have heard LLMs being compared to supercharged autocorrect engines, and that's actually not too far off the mark: ChatGPT and Bard don't really “know” anything, but they are very good at figuring out which word follows another, which starts to look like real thought and creativity when it gets to an advanced enough stage.

One of the key innovations of these transformers is the self-attention mechanism. It's difficult to explain in a paragraph, but in essence it means words in a sentence aren't considered in isolation, but also in relation to each other in a variety of sophisticated ways. It allows for a greater level of comprehension than would otherwise be possible.

There is some randomness and variation built into the code, which is why you won't get the same response from a transformer chatbot every time. This autocorrect idea also explains how errors can creep in. On a fundamental level, ChatGPT and Google Bard don't know what's accurate and what isn't. They're looking for responses that seem plausible and natural, and that match up with the data they've been trained on.

Read Original