    National News Brief

    OpenAI’s o3 model aced a test of AI reasoning – but it’s still not AGI

    By Team_NationalNewsBrief | December 21, 2024 | Science | 5 Mins Read


    OpenAI announced a breakthrough achievement for its new o3 AI model

    Rokas Tenys / Alamy

    OpenAI’s new o3 artificial intelligence model has achieved a breakthrough high score on a prestigious AI reasoning test called the ARC Challenge, inspiring some AI fans to speculate that o3 has achieved artificial general intelligence (AGI). But even as ARC Challenge organisers described o3’s achievement as a major milestone, they also cautioned that it has not won the competition’s grand prize – and it is only one step on the path towards AGI, a term for hypothetical future AI with human-like intelligence.

    The o3 model is the latest in a line of AI releases that follow on from the large language models powering ChatGPT. “This is a surprising and important step-function increase in AI capabilities, showing novel task adaptation ability never seen before in the GPT-family models,” said François Chollet, an engineer at Google and the main creator of the ARC Challenge, in a blog post.

    What did OpenAI’s o3 model actually do?

    Chollet designed the Abstraction and Reasoning Corpus (ARC) Challenge in 2019 to test how well AIs can find correct patterns linking pairs of coloured grids. Such visual puzzles are intended to make AIs demonstrate a form of general intelligence with basic reasoning capabilities. But throwing enough computing power at the puzzles could let even a non-reasoning program simply solve them through brute force. To prevent this, the competition also requires official score submissions to meet certain limits on computing power.
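    To make the puzzle format concrete, here is a toy sketch in Python. It assumes each grid is a small list of rows of integers, with each integer standing for a colour, and it tries a handful of hypothetical candidate rules against example input/output pairs. The rule names, the example grids and the `consistent_rules` helper are illustrative assumptions, not part of the competition. Real ARC tasks are far more varied than this, and checking a fixed list of rules one by one is exactly the kind of brute-force search the paragraph above describes.

```python
from typing import Callable, Dict, List, Tuple

Grid = List[List[int]]  # each integer stands for one colour


def identity(g: Grid) -> Grid:
    return [list(row) for row in g]


def flip_horizontal(g: Grid) -> Grid:
    # Mirror every row left-to-right.
    return [list(reversed(row)) for row in g]


def flip_vertical(g: Grid) -> Grid:
    # Reverse the order of the rows.
    return [list(row) for row in reversed(g)]


def transpose(g: Grid) -> Grid:
    # Swap rows and columns.
    return [list(col) for col in zip(*g)]


# Hypothetical, deliberately tiny rule set (an assumption for illustration).
CANDIDATES: Dict[str, Callable[[Grid], Grid]] = {
    "identity": identity,
    "flip_horizontal": flip_horizontal,
    "flip_vertical": flip_vertical,
    "transpose": transpose,
}


def consistent_rules(pairs: List[Tuple[Grid, Grid]]) -> List[str]:
    """Return the names of candidate rules that explain every example pair."""
    return [
        name
        for name, rule in CANDIDATES.items()
        if all(rule(inp) == out for inp, out in pairs)
    ]


# Made-up example pairs: the hidden rule is a left-to-right mirror.
TRAIN_PAIRS: List[Tuple[Grid, Grid]] = [
    ([[1, 0], [2, 0]], [[0, 1], [0, 2]]),
    ([[3, 3, 0], [0, 4, 0]], [[0, 3, 3], [0, 4, 0]]),
]

if __name__ == "__main__":
    print(consistent_rules(TRAIN_PAIRS))
    print(CANDIDATES["flip_horizontal"]([[5, 0, 0]]))
```

    A solver like this only works when the hidden rule happens to be in its list, which is why the benchmark is meant to reward systems that adapt to rules they have not seen before.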

    OpenAI’s newly announced o3 model – which is scheduled for release in early 2025 – achieved its official breakthrough score of 75.7 per cent on the ARC Challenge’s “semi-private” test, which is used for ranking competitors on a public leaderboard. The computing cost of its achievement was approximately $20 for each visual puzzle task, meeting the competition’s limit of less than $10,000 total. However, the harder “private” test that is used to determine grand prize winners has an even more stringent computing power limit, equivalent to spending just 10 cents on each task, which OpenAI did not meet.

    The o3 model also achieved an unofficial score of 87.5 per cent by applying approximately 172 times more computing power than it did on the official score. For comparison, the typical human score is 84 per cent, and an 85 per cent score is enough to win the ARC Challenge’s $600,000 grand prize – if the model can also keep its computing costs within the required limits.

    But to reach its unofficial score, o3’s cost soared to thousands of dollars spent solving each task. OpenAI requested that the challenge organisers not publish the exact computing costs.
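    The figures quoted above can be sanity-checked with a few lines of Python. This is a back-of-envelope sketch, not a published figure: it assumes cost scales linearly with computing power, and the function names are hypothetical. The eligibility check encodes only the two thresholds mentioned in this article (an 85 per cent score and roughly 10 cents per task) and ignores which test set a score was measured on.

```python
# Figures taken from the article text.
OFFICIAL_COST_PER_TASK_USD = 20.0       # approximate cost of the official run
COMPUTE_MULTIPLIER = 172                # extra compute used for the unofficial run
GRAND_PRIZE_SCORE_PCT = 85.0            # score needed for the grand prize
PRIVATE_COST_LIMIT_PER_TASK_USD = 0.10  # compute limit on the private test


def estimated_high_compute_cost_per_task(
    base_cost_usd: float = OFFICIAL_COST_PER_TASK_USD,
    multiplier: float = COMPUTE_MULTIPLIER,
) -> float:
    """Rough per-task cost of the high-compute run.

    ASSUMPTION: cost grows in direct proportion to compute.
    The exact costs were not published.
    """
    return base_cost_usd * multiplier


def qualifies_for_grand_prize(score_pct: float, cost_per_task_usd: float) -> bool:
    """Check a result against the score and cost thresholds only."""
    return (
        score_pct >= GRAND_PRIZE_SCORE_PCT
        and cost_per_task_usd <= PRIVATE_COST_LIMIT_PER_TASK_USD
    )


if __name__ == "__main__":
    high_cost = estimated_high_compute_cost_per_task()
    print(f"Estimated high-compute cost per task: ${high_cost:,.0f}")
    print("Official run qualifies:", qualifies_for_grand_prize(75.7, 20.0))
    print("Unofficial run qualifies:", qualifies_for_grand_prize(87.5, high_cost))
```

    Under the linear assumption the estimate comes to about $3,440 per task, which is consistent with the "thousands of dollars" described above. Neither run clears both thresholds: the official score falls short of 85 per cent, and both runs cost far more than 10 cents per task.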

    Does this o3 achievement show that AGI has been reached?

    No. The ARC Challenge organisers have specifically said they do not consider beating this competition benchmark to be an indicator of having achieved AGI.

    The o3 model also failed to solve more than 100 visual puzzle tasks, even when OpenAI applied a very large amount of computing power toward the unofficial score, said Mike Knoop, an ARC Challenge organiser at software company Zapier, in a social media post on X.

    In a social media post on Bluesky, Melanie Mitchell at the Santa Fe Institute in New Mexico said the following about o3’s progress on the ARC benchmark: “I think solving these tasks by brute-force compute defeats the original purpose”.

    “While the new model is very impressive and represents a big milestone on the way towards AGI, I don’t believe this is AGI – there’s still a fair number of very easy [ARC Challenge] tasks that o3 can’t solve,” said Chollet in another X post.

    However, Chollet described how we might know when human-level intelligence has been demonstrated by some form of AGI. “You’ll know AGI is here when the exercise of creating tasks that are easy for regular humans but hard for AI becomes simply impossible,” he said in the blog post.

    Thomas Dietterich at Oregon State University suggests another way to recognise AGI. “Those architectures claim to include all of the functional components required for human cognition,” he says. “By this measure, the commercial AI systems are missing episodic memory, planning, logical reasoning and, most importantly, meta-cognition.”

    So what does o3’s high score really mean?

    The o3 model’s high score comes as the tech industry and AI researchers have been reckoning with a slower pace of progress in the latest AI models for 2024, compared with the initial explosive developments of 2023.

    Although it did not win the ARC Challenge, o3’s high score indicates that AI models could beat the competition benchmark in the near future. Beyond its unofficial high score, Chollet says many official low-compute submissions have already scored above 81 per cent on the private evaluation test set.

    Dietterich also thinks that “this is a very impressive leap in performance”. However, he cautions that, without knowing more about how OpenAI’s o1 and o3 models work, it is impossible to evaluate just how impressive the high score is. For instance, if o3 was able to practise the ARC problems in advance, then that would make its achievement easier. “We will need to await an open-source replication to understand the full significance of this,” says Dietterich.

    The ARC Challenge organisers are already looking to launch a second and more difficult set of benchmark tests sometime in 2025. They will also keep the ARC Prize 2025 challenge running until someone achieves the grand prize and open-sources their solution.

