When you ask an AI chatbot like ChatGPT, Claude, Copilot or Gemini to do something, it may seem like you’re interacting with a person.
But you’re not. These chatbots don’t actually understand the meaning of words the way we do. Instead, they’re the interface we use to interact with large language models, or LLMs. This underlying technology is trained to recognize how words are used and which words frequently appear together, so it can predict future words, sentences or paragraphs.
Generative AI tools are constantly refining their understanding of words to make better predictions. Some, including Google’s Lumiere and OpenAI’s Sora, are even learning to generate images, video and audio.
It’s all part of a constant flux of one-upmanship kicked off by ChatGPT’s introduction in late 2022, followed by the arrival of Microsoft’s AI-enhanced Bing search and Google’s Bard (now Gemini). Over the ensuing months, Microsoft introduced Copilot, Meta updated Llama, OpenAI released Dall-E 3 and GPT-4 Turbo, Google announced Gemini Ultra 1.0 and teased Gemini 1.5 Pro, while Anthropic debuted Claude 3. Google and Adobe have released peeks at tools that can generate virtual games and music to show consumers where the technology is headed.
Cutting-edge technology like this has arguably never been so accessible. And the companies developing it are eager to lure you into their ecosystems and to stake their claims in a market projected to be worth $1.3 trillion by 2032.
If you’re wondering what LLMs have to do with AI, this explainer is for you. (And be sure to check out our new AI Atlas guide for hands-on product reviews, as well as news, tips, video and more.)
What is a language model?
You can think of a language model as a soothsayer for words.
“A language model is something that tries to predict what language looks like that humans produce,” said Mark Riedl, professor in the Georgia Tech School of Interactive Computing and associate director of the Georgia Tech Machine Learning Center. “What makes something a language model is whether it can predict future words given previous words.”
This is the basis of autocomplete functionality when you’re texting, as well as AI chatbots.
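To make "predict future words given previous words" concrete, here is a minimal sketch of the simplest possible language model, one that only counts which word tends to follow which. The toy corpus and the function names are invented for illustration. Real autocomplete systems and chatbots are far more sophisticated.

```python
from collections import Counter, defaultdict

def train_bigram_model(text):
    """Count, for every word, which words follow it in the text."""
    words = text.lower().split()
    following = defaultdict(Counter)
    for current_word, next_word in zip(words, words[1:]):
        following[current_word][next_word] += 1
    return following

def predict_next(model, word):
    """Return the most frequent follower of `word`, or None if the word is unseen."""
    followers = model.get(word.lower())
    if not followers:
        return None
    return followers.most_common(1)[0][0]

# Toy corpus, invented for illustration only.
corpus = "the cat sat on the mat . the cat ate the fish . the cat slept"
model = train_bigram_model(corpus)
print(predict_next(model, "the"))  # "cat" follows "the" most often in this corpus
```

A model like this looks back only one word, which is why it is a poor conversationalist. The principle of guessing the next word from what came before is the same.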
What is a large language model?
A large language model is, by definition, a big language model.
How big?
These models are measured in what is known as “parameters.”
What’s a parameter?
Well, LLMs use neural networks, which are machine learning models that take an input and perform mathematical calculations to produce an output. The variables in these computations are called parameters, and a model’s size is the number of them. A large language model can have 1 billion parameters or more.
“We know that they’re large when they produce a full paragraph of coherent fluid text,” Riedl said.
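As a rough illustration of what counting parameters means, the sketch below tallies the adjustable numbers in a tiny, fully connected neural network. The layer sizes are hypothetical, and actual LLMs use more elaborate architectures, so treat this as arithmetic practice rather than a description of any real model.

```python
def count_parameters(layer_sizes):
    """Count weights and biases in a fully connected network.

    layer_sizes lists the number of units in each layer, input layer first.
    """
    total = 0
    for inputs, outputs in zip(layer_sizes, layer_sizes[1:]):
        weights = inputs * outputs  # one weight per input-to-output connection
        biases = outputs            # one bias per output unit
        total += weights + biases
    return total

# Hypothetical toy network: 4 inputs, one hidden layer of 8 units, 2 outputs.
print(count_parameters([4, 8, 2]))  # (4*8 + 8) + (8*2 + 2) = 58
```

Even this toy network has 58 parameters. A model with a billion of them is the same idea at a vastly larger scale.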
Is there such a thing as a small language model?
Yes. Tech companies like Microsoft are rolling out smaller models, designed specifically for phones and PCs, that don’t require the same computing resources as an LLM but nevertheless help users tap into the power of generative AI.
What is under the hood of a large language model?
When Anthropic mapped the “mind” of its Claude 3.0 Sonnet large language model, it found each internal state, or “what the model is ‘thinking’ before writing its response,” is made by combining features, or patterns of neuron activations. (The artificial neurons in neural networks mimic the behavior of the neurons in our brains.)
By extracting these neuron activations from Claude 3.0 Sonnet, Anthropic was able to see a map of its internal states as it generates answers. The AI startup found patterns of neuron activations were focused on cities, people, atomic elements, scientific fields and programming syntax, as well as more abstract concepts like bugs in computer code, gender bias at work and conversations about keeping secrets.
In the end, Anthropic said “the internal organization of concepts in the AI model corresponds, at least somewhat, to our human notions of similarity.”
How do large language models learn?
LLMs learn via a process called deep learning.
“It’s a lot like when you teach a child — you show a lot of examples,” said Jason Alan Snyder, global CTO of ad agency Momentum Worldwide.
In other words, you feed the LLM a library of content (what’s known as training data) such as books, articles, code and social media posts to help it understand how words are used in different contexts — and even the more subtle nuances of language.
During this process, the model digests far more than a person could ever read in their lifetime — something on the order of trillions of tokens.
Tokens help AI models break down and process text. You can think of an AI model as a reader who needs help. The model breaks down a sentence into smaller pieces, or tokens (each roughly equivalent to four characters in English, or about three-quarters of a word), so it can understand each piece and then the overall meaning.
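The rule of thumb above can be turned into a quick back-of-the-envelope estimate. This sketch does not run a real tokenizer. It only applies the two approximations from the text, and the function names are made up for the example.

```python
import math

CHARS_PER_TOKEN = 4     # rule of thumb for English text
WORDS_PER_TOKEN = 0.75  # "about three-quarters of a word"

def estimate_tokens_from_chars(text):
    """Estimate the token count from the number of characters."""
    return math.ceil(len(text) / CHARS_PER_TOKEN)

def estimate_tokens_from_words(text):
    """Estimate the token count from the number of whitespace-separated words."""
    return math.ceil(len(text.split()) / WORDS_PER_TOKEN)

sentence = "I went sailing on the deep blue sea"
print(estimate_tokens_from_chars(sentence))  # 35 characters / 4, rounded up: 9
print(estimate_tokens_from_words(sentence))  # 8 words / 0.75, rounded up: 11
```

The two estimates disagree slightly, which is expected. Both are approximations, and an actual tokenizer may split the same sentence differently.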
From there, the LLM can analyze how words connect and determine which words often appear together.
“It’s like building this giant map of word relationships,” Snyder said. “And then it starts to be able to do this really fun, cool thing, and it predicts what the next word is … and it compares the prediction to the actual word in the data and adjusts the internal map based on its accuracy.”
This prediction and adjustment happens billions of times, so the LLM is constantly refining its understanding of language and getting better at identifying patterns and predicting future words. It can even learn concepts and facts from the data to answer questions, generate creative text formats and translate languages. But LLMs don’t understand the meaning of words like we do. They capture only the statistical relationships.
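The predict, compare and adjust cycle Snyder describes can be sketched in a few lines. The example below is a deliberately tiny stand-in, assuming a model whose whole "map" is one adjustable score per pair of words. The training sentence, learning rate and function names are illustrative choices, not how any production LLM is built.

```python
import math

def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    highest = max(scores)
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def train(words, epochs=200, learning_rate=0.5):
    vocab = sorted(set(words))
    index = {word: i for i, word in enumerate(vocab)}
    # One adjustable score per (previous word, next word) pair: the "internal map".
    scores = [[0.0] * len(vocab) for _ in vocab]
    pairs = [(index[a], index[b]) for a, b in zip(words, words[1:])]
    for _ in range(epochs):
        for prev, actual in pairs:
            predicted = softmax(scores[prev])         # predict the next word
            for j in range(len(vocab)):
                target = 1.0 if j == actual else 0.0  # compare with the actual word
                # Adjust: nudge each score against its prediction error.
                scores[prev][j] -= learning_rate * (predicted[j] - target)
    return vocab, index, scores

def most_likely_next(word, vocab, index, scores):
    probabilities = softmax(scores[index[word]])
    return vocab[probabilities.index(max(probabilities))]

words = "i went sailing on the deep blue sea".split()
vocab, index, scores = train(words)
print(most_likely_next("blue", vocab, index, scores))  # "sea"
```

Each pass through the data shrinks the gap between the predicted and actual next word. An LLM runs the same kind of loop over trillions of tokens and billions of parameters.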
LLMs also learn to improve their responses through reinforcement learning from human feedback.
“You get a judgment or a preference from humans on which response was better given the input that it was given,” said Maarten Sap, assistant professor at the Language Technologies Institute at Carnegie Mellon. “And then you can teach the model to improve its responses.”
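One common way to turn such human preferences into a training signal is to fit a "reward model" that scores the preferred response above the rejected one. The sketch below assumes each response has already been reduced to two hypothetical features (whether it answers the question and whether it is rude). Those features, the data and the function names are invented for illustration. In a full pipeline, the learned reward would then be used to fine-tune the language model itself, a step not shown here.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def reward(weights, features):
    """Score a response as a weighted sum of its features."""
    return sum(w * f for w, f in zip(weights, features))

def train_reward_model(preferences, n_features, epochs=100, learning_rate=0.1):
    """Fit weights so preferred responses score higher than rejected ones."""
    weights = [0.0] * n_features
    for _ in range(epochs):
        for preferred, rejected in preferences:
            margin = reward(weights, preferred) - reward(weights, rejected)
            # Gradient step on the pairwise loss -log(sigmoid(margin)).
            step = learning_rate * (1.0 - sigmoid(margin))
            for i in range(n_features):
                weights[i] += step * (preferred[i] - rejected[i])
    return weights

# Hypothetical features per response: [answers_the_question, is_rude].
# Each pair is (response the human preferred, response the human rejected).
preferences = [
    ([1, 0], [0, 0]),  # a helpful answer beat an unhelpful one
    ([1, 0], [1, 1]),  # a polite answer beat a rude one
    ([0, 0], [0, 1]),  # even when unhelpful, polite beat rude
]
weights = train_reward_model(preferences, n_features=2)
print(weights)  # first weight ends up positive, second negative
```

After training, the model rewards answering the question and penalizes rudeness, mirroring the judgments it was shown.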
What do large language models do?
Given a series of input words, an LLM can predict the next word.
For example, consider the phrase, “I went sailing on the deep blue…”
Most people would probably guess “sea” because sailing, deep and blue are all words we associate with the sea. In other words, each word sets up context for what should come next.
“These large language models, because they have a lot of parameters, they can store a lot of patterns,” Riedl said. “They are very good at being able to pick out these clues and make really, really good guesses at what comes next.”
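To see how each word "sets up context," consider this small sketch. It scores candidate next words by how often they share a sentence with the words in the prompt. Co-occurrence counting is a crude stand-in for the patterns an LLM stores, and the sentences and function name are invented for the example.

```python
from collections import Counter

# Invented toy sentences standing in for training data.
sentences = [
    "we went sailing on the deep blue sea",
    "the boat drifted across the deep sea",
    "birds flew across the clear blue sky",
    "she painted the wall a pale blue color",
    "clouds filled the blue sky",
]

def score_candidates(context, candidates, sentences):
    """Score each candidate by how many context words share a sentence with it."""
    context_words = set(context.lower().split())
    scores = Counter()
    for sentence in sentences:
        words = set(sentence.split())
        for candidate in candidates:
            if candidate in words:
                scores[candidate] += len(context_words & words)
    return scores

candidates = ["sea", "sky", "color"]

short = score_candidates("blue", candidates, sentences)
print(short.most_common(1)[0][0])  # "sky": with only "blue" to go on, sky wins

full = score_candidates("i went sailing on the deep blue", candidates, sentences)
print(full.most_common(1)[0][0])   # "sea": "sailing" and "deep" tip the balance
```

With one word of context the guess is "sky." Add "sailing" and "deep" and the guess shifts to "sea," which is the effect of extra clues that Riedl describes.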
What do large language models do really well?
LLMs are very good at figuring out the connection between words and producing text that sounds natural.
“They take an input, which can often be a set of instructions, like, ‘Do this for me’ or ‘Tell me about this’ or ‘Summarize this’ and are able to extract those patterns out of the input and produce a long string of fluid response,” Riedl said.
Where do large language models struggle?
LLMs have several weaknesses.
First, they’re not good at telling the truth. In fact, they sometimes just make stuff up that sounds true, like when ChatGPT cited six fake court cases in a legal brief or when Bard mistakenly credited the James Webb Space Telescope with taking the first pictures of a planet outside of our own solar system. Those are known as hallucinations.
“They are extremely unreliable in the sense that they confabulate and make up things a lot,” Sap said. “They’re not trained or designed by any means to spit out anything truthful.”
They also struggle with queries that are fundamentally different from anything they’ve encountered before. That’s because they’re focused on finding and responding to patterns.
A good example is a math problem with a unique set of numbers.
“It may not be able to do that calculation correctly because it’s not really solving math,” Riedl said. “It is trying to relate your math question to previous examples of math questions that it has seen before.”
And while they excel at predicting words, they’re not good at predicting the future, which includes planning and decision making.
“The idea of doing planning in the way that humans do it with … thinking about the different contingencies and alternatives and making choices, this seems to be a really hard roadblock for our current large language models right now,” Riedl said.
Finally, they struggle with current events because their training data typically only goes up to a certain point and anything that happens after that isn’t part of their knowledge base. And because they don’t have the capacity to distinguish between what is factually true and what is likely, they can confidently provide incorrect information about current events.
They also don’t interact with the world the way we do.
“This makes it difficult for them to grasp the nuances and complexities of current events that often require an understanding of context, social dynamics and real-world consequences,” Snyder said.
How will large language models evolve?
We’re already starting to see generative AI companies like OpenAI and Adobe debut multimodal models, which are trained not just on text but on images, video and audio.
We’ll also likely see LLMs improve at not just translating from English but also understanding and conversing in additional languages.
We may also see retrieval capabilities evolve beyond what the models have been trained on. That could include leveraging search engines like Google so the models can conduct web searches and then feed those results into the LLM.
If LLMs were connected to search engines, they could process real-time information far beyond their training data. This means they could better understand queries and provide more accurate, up-to-date responses.
“This helps our language models stay current and up to date because they can actually look at new information on the internet and bring that in,” Riedl said.
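The idea of searching first and then feeding the results into the LLM can be sketched as follows. This is a minimal illustration under heavy assumptions: a short in-memory list of documents stands in for the web, simple word overlap stands in for a search engine, and the code stops at building the prompt instead of calling any model. All names and documents are hypothetical.

```python
def retrieve(query, documents, top_k=2):
    """Rank documents by how many query words they contain (a crude stand-in for web search)."""
    query_words = set(query.lower().split())
    scored = []
    for doc in documents:
        overlap = len(query_words & set(doc.lower().split()))
        if overlap:
            scored.append((overlap, doc))
    # Highest overlap first; ties keep their original order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored[:top_k]]

def build_prompt(query, documents):
    """Place the retrieved text ahead of the question so a model could draw on it."""
    results = retrieve(query, documents)
    context = "\n".join(f"- {doc}" for doc in results)
    return f"Use only these search results to answer.\n{context}\nQuestion: {query}"

# Hypothetical documents standing in for fresh web pages.
documents = [
    "the rover landed on mars in february",
    "the bakery opens at nine",
    "the rover mission team published new photos",
]
print(build_prompt("where did the rover land", documents))
```

The prompt now carries information the model was never trained on. Whether the answer is accurate still depends on the quality of what the search returned.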
There are a few catches. Web search could make hallucinations worse without adequate fact-checking mechanisms in place. And LLMs would need to learn how to assess the reliability of web sources before citing them. Plus, it would require a lot of (expensive) computing power to process web search results on demand.
AI-powered Bing, which Microsoft announced in February 2023, is a similar concept. However, instead of tapping into search engines to enhance its responses, Bing is using AI to make its own search engine better. It does that in part by better understanding the true meaning behind consumer queries and better ranking the results for those queries.
Editors’ note: CNET is using an AI engine to help create some stories. For more, see this post.