    National News Brief

    AI chatbots fail to diagnose patients by talking with them

    By Team_NationalNewsBrief | January 2, 2025 | Science

    Don’t call your favourite AI “doctor” just yet

    Just_Super/Getty Images

    Advanced artificial intelligence models score well on professional medical exams but still flunk one of the most crucial physician tasks: talking with patients to gather relevant medical information and deliver an accurate diagnosis.

    “While large language models show impressive results on multiple-choice tests, their accuracy drops significantly in dynamic conversations,” says Pranav Rajpurkar at Harvard University. “The models particularly struggle with open-ended diagnostic reasoning.”

    That became evident when researchers developed a method for evaluating a clinical AI model’s reasoning capabilities based on simulated doctor-patient conversations. The “patients” were based on 2000 medical cases primarily drawn from professional US medical board exams.

    “Simulating patient interactions enables the evaluation of medical history-taking skills, a critical component of clinical practice that cannot be assessed using case vignettes,” says Shreya Johri, also at Harvard University. The new evaluation benchmark, called CRAFT-MD, also “mirrors real-life scenarios, where patients may not know which details are crucial to share and may only disclose important information when prompted by specific questions”, she says.

    The CRAFT-MD benchmark itself relies on AI. OpenAI’s GPT-4 model played the role of a “patient AI” in conversation with the “clinical AI” being tested. GPT-4 also helped grade the results by comparing the clinical AI’s diagnosis with the correct answer for each case. Human medical experts double-checked these evaluations. They also reviewed the conversations to check the patient AI’s accuracy and see if the clinical AI managed to gather the relevant medical information.

    Multiple experiments showed that four leading large language models – OpenAI’s GPT-3.5 and GPT-4 models, Meta’s Llama-2-7b model and Mistral AI’s Mistral-v2-7b model – performed considerably worse on the conversation-based benchmark than they did when making diagnoses based on written summaries of the cases. OpenAI, Meta and Mistral AI did not respond to requests for comment.

    For example, GPT-4’s diagnostic accuracy was an impressive 82 per cent when it was presented with structured case summaries and allowed to select the diagnosis from a multiple-choice list of answers, falling to just under 49 per cent when it did not have the multiple-choice options. When it had to make diagnoses from simulated patient conversations, however, its accuracy dropped to just 26 per cent.

    And GPT-4 was the best-performing AI model tested in the study, with GPT-3.5 often coming in second, the Mistral AI model sometimes coming in second or third and Meta’s Llama model generally scoring lowest.

    The AI models also failed to gather complete medical histories a significant proportion of the time, with leading model GPT-4 only doing so in 71 per cent of simulated patient conversations. Even when the AI models did gather a patient’s relevant medical history, they did not always produce the correct diagnoses.

    Such simulated patient conversations represent a “far more useful” way to evaluate AI clinical reasoning capabilities than medical exams, says Eric Topol at the Scripps Research Translational Institute in California.

    If an AI model eventually passes this benchmark, consistently making accurate diagnoses based on simulated patient conversations, this would not necessarily make it superior to human physicians, says Rajpurkar. He points out that medical practice in the real world is “messier” than in simulations. It involves managing multiple patients, coordinating with healthcare teams, performing physical exams and understanding “complex social and systemic factors” in local healthcare situations.

    “Strong performance on our benchmark would suggest AI could be a powerful tool for supporting clinical work – but not necessarily a replacement for the holistic judgement of experienced physicians,” says Rajpurkar.


