OpenAI Releases o1, a Model With Reasoning Abilities
- By Paul Mah
- September 25, 2024
OpenAI has released a new model called o1, the first in a series of “reasoning” models it says can answer more complex questions. This is the much-hyped Strawberry model, and it arrives alongside o1-mini, a smaller, cheaper version.
According to OpenAI, o1 significantly outperforms GPT-4o on the vast majority of reasoning-heavy tasks. It ranks in the 89th percentile on competitive programming questions (Codeforces) and exceeds human PhD-level accuracy on a benchmark of physics, biology, and chemistry problems (GPQA).
Works differently
It is worth noting that o1 is currently a text-only model: it accepts input solely through a text interface. And unlike other AI models that respond almost immediately, it can take up to 30 seconds to “think” before producing a response.
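For developers curious about that latency, the sketch below simply times a single request. It assumes the model is reachable through OpenAI’s standard Chat Completions endpoint in the official Python SDK under an identifier such as "o1-preview"; the exact model name is an assumption, not something confirmed by this article.

```python
# Minimal sketch: time how long the reasoning model spends "thinking" on one request.
# Assumes OPENAI_API_KEY is set in the environment and that the model is exposed
# via the Chat Completions API under the identifier "o1-preview" (assumption).
import time

from openai import OpenAI

client = OpenAI()

prompt = (
    "A bat and a ball cost $1.10 in total. The bat costs $1.00 more than the ball. "
    "How much does the ball cost?"
)

start = time.perf_counter()
response = client.chat.completions.create(
    model="o1-preview",  # assumed identifier for the reasoning model
    messages=[{"role": "user", "content": prompt}],
)
elapsed = time.perf_counter() - start

print(f"Answered in {elapsed:.1f} s")
print(response.choices[0].message.content)
```

At launch the reasoning models also reportedly omit some familiar request parameters (such as system messages and temperature), so requests are best kept to a plain user message as above.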
This mimics how a human may pause and think before answering a difficult question. Under the hood, o1 uses an improved chain-of-thought process that lets it break tricky problems into simpler steps, recognize its mistakes, and try a different approach when one isn’t working.
In a dramatic example of reasoning given by OpenAI in its blog post, o1 successfully decoded a ciphertext using just one example. In the same test, GPT-4o stumbled and eventually gave up, asking for additional decoding rules for the cipher.
But while the o1 model hallucinates less, the problem persists, according to OpenAI research lead Jerry Tworek, who said: “We can’t say we solved hallucinations.”
A lot more expensive
The new o1 and o1-mini models are currently available to ChatGPT Plus and Team users. Developer access to o1 is far more expensive than GPT-4o, however, costing roughly three times as much for input tokens and four times as much for output tokens.
Specifically, o1 is USD 15 per 1 million input tokens and USD 60 per 1 million output tokens. In comparison, GPT-4o costs just USD 5 per 1 million input and USD 15 per 1 million output tokens.
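To make the gap concrete, here is a quick back-of-the-envelope calculation using the per-million-token rates quoted above; the 10,000-input / 2,000-output token request is an invented example for illustration only.

```python
# Illustrative cost arithmetic using the per-million-token prices quoted above.
PRICES_USD_PER_MILLION = {
    "o1":     {"input": 15.00, "output": 60.00},
    "gpt-4o": {"input":  5.00, "output": 15.00},
}

def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Cost in USD for one request at the rates above."""
    rates = PRICES_USD_PER_MILLION[model]
    return (input_tokens * rates["input"] + output_tokens * rates["output"]) / 1_000_000

# Hypothetical request: 10,000 input tokens and 2,000 output tokens.
for model in PRICES_USD_PER_MILLION:
    print(f"{model}: ${request_cost(model, 10_000, 2_000):.2f}")
# o1:     $0.27
# gpt-4o: $0.08
```

OpenAI has also said that the model’s hidden reasoning tokens are billed as output tokens, so actual bills can run higher than a count of the visible output alone would suggest.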
OpenAI says it eventually plans to bring o1-mini access to all free users of ChatGPT.
A thinking AI model
According to OpenAI, it trained the o1 models using a completely new optimization algorithm and a new training dataset specifically tailored for them, though details are scant.
In addition, OpenAI is keeping the precise thinking process of its o1 model under wraps. The company claims it might be unsafe to show users the raw chain of thought, so it is piped through another model that generates a summary. Of course, detractors will argue OpenAI is likely also concerned about the process being leaked.
For now, there are ethical questions. For one, OpenAI’s own statements suggest that o1 might generate unsavory content as part of its thought process. OpenAI also gives o1 a “medium” rating for nuclear, biological, and chemical weapon risk, and says it will not release a model rated higher than medium risk.
But if o1 is already bumping up against that risk ceiling, can OpenAI develop its next model without breaching the limit?
For now, OpenAI plans to release improved versions of the o1 model. It also expects these new reasoning capabilities to improve its ability to align models with human values and principles, and to eventually unlock many new use cases for AI in science, coding, math, and related fields.
Image credit: iStock/Sasiistock
Paul Mah
Paul Mah is the editor of DSAITrends, where he reports on the latest developments in data science and AI. A former system administrator, programmer, and IT lecturer, he enjoys writing both code and prose.