OpenAI Releases Its Highly Anticipated o1 Model

OpenAI today released a preview of its next-generation large language models, which the company says perform better than its previous models but come with a few caveats.

In its announcement for the new model, o1-preview, OpenAI touted its performance on a variety of tasks designed for humans. The model scored in the 89th percentile in programming competitions held by Codeforces and solved 83 percent of the problems on a qualifying exam for the International Mathematics Olympiad, compared with GPT-4o’s 14 percent.

Sam Altman, OpenAI’s CEO, said the o1-preview and o1-mini models were the “beginning of a new paradigm: AI that can do general-purpose complex reasoning.” But he added that “o1 is still flawed, still limited, and it still seems more impressive on first use than it does after you spend more time with it.”

When asked a question, the new models use chain-of-thought techniques that mimic how humans think, and how many generative AI users have learned to use the technology: by repeatedly prompting the model with corrections and new directions until it arrives at the desired answer. In the o1 models, versions of those processes happen behind the scenes, without additional prompting. “It learns to recognize and correct its mistakes. It learns to break down tricky steps into simpler ones. It learns to try a different approach when the current one isn’t working,” the company said.
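To make that distinction concrete, here is a minimal sketch using OpenAI’s Python SDK. The prompt and the manual-correction loop are illustrative assumptions; only the model names come from the announcement. With earlier models, users approximated the reasoning loop themselves by re-prompting; with o1-preview, a single request triggers the model’s internal chain of thought.

```python
# Minimal sketch using the OpenAI Python SDK (pip install openai).
# Assumes an OPENAI_API_KEY environment variable; the prompt is hypothetical.
from openai import OpenAI

client = OpenAI()
question = "How many prime numbers are there below 100?"

# With earlier models, users often drove the reasoning loop manually:
# prompt, inspect the answer, then re-prompt with a correction.
draft = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": question}],
)
revised = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "user", "content": question},
        {"role": "assistant", "content": draft.choices[0].message.content},
        {"role": "user", "content": "Double-check by listing the primes before counting."},
    ],
)

# With o1-preview, a single request suffices: the model runs its
# chain-of-thought reasoning internally before returning a final answer.
answer = client.chat.completions.create(
    model="o1-preview",
    messages=[{"role": "user", "content": question}],
)
print(answer.choices[0].message.content)
```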

While these techniques improve the models’ performances on various benchmarks, OpenAI found that in a small subset of cases, they also result in o1 models intentionally deceiving users. In a test of 100,000 ChatGPT conversations powered by o1-preview, the company found that about 800 answers the model supplied were incorrect. And for roughly a third of those incorrect responses, the model’s chain of thought showed that it knew the answer was incorrect but provided it anyway.
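Stated as arithmetic, the reported figures work out as follows; this is a back-of-the-envelope restatement of the numbers above, not additional data:

```python
# Back-of-the-envelope restatement of the figures reported above.
conversations = 100_000
incorrect = 800                        # answers flagged as incorrect
incorrect_rate = incorrect / conversations
intentional = incorrect / 3            # roughly a third were knowingly wrong

print(f"Incorrect-answer rate: {incorrect_rate:.1%}")  # -> 0.8%
print(f"Knowingly incorrect: ~{round(intentional)}")   # -> ~267 responses
```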

“Intentional hallucinations primarily happen when o1-preview is asked to provide references to articles, websites, books, or similar sources that it cannot easily verify without access to internet search, causing o1-preview to make up plausible examples instead,” the company wrote in its model system card.

Overall, the new models outperformed GPT-4o, OpenAI’s previous state-of-the-art model, on a range of the company’s safety benchmarks, which measure how easily the models can be jailbroken, how often they give incorrect responses, and how often they display bias regarding age, gender, and race. However, the company found that o1-preview was significantly more likely than GPT-4o to venture an answer to an ambiguous question when the model should have responded that it didn’t know.

OpenAI did not release much information about the data used to train its new models, saying only that they were trained on a combination of publicly available data and proprietary data obtained through partnerships.


