    National News Brief

    Newest Google and Nvidia Chips Speed AI Training

    By Team_NationalNewsBrief | November 13, 2024 | Technology



    Nvidia, Oracle, Google, Dell and 13 other companies reported how long it takes their computers to train the key neural networks in use today. Among those results were the first glimpses of Nvidia’s next-generation GPU, the B200, and Google’s upcoming accelerator, called Trillium. The B200 posted a doubling of performance on some tests versus today’s workhorse Nvidia chip, the H100. And Trillium delivered nearly a four-fold boost over the chip Google tested in 2023.

    The benchmark tests, called MLPerf v4.1, consist of six tasks: recommendation, the pre-training of the large language models (LLMs) GPT-3 and BERT-large, the fine-tuning of the Llama 2 70B large language model, object detection, graph node classification, and image generation.

    Training GPT-3 is such a mammoth task that it would be impractical to do the whole thing just to deliver a benchmark. Instead, the test trains it to a checkpoint from which experts have determined it would likely reach the goal if training continued. For Llama 2 70B, the goal is not to train the LLM from scratch, but to take an already trained model and fine-tune it so it is specialized in a particular expertise: in this case, government documents. Graph node classification is a type of machine learning used in fraud detection and drug discovery.

    As what’s important in AI has evolved, mostly toward using generative AI, the set of tests has changed. This latest version of MLPerf marks a complete changeover in what’s being tested since the benchmark effort began. “At this point all of the original benchmarks have been phased out,” says David Kanter, who leads the benchmark effort at MLCommons. In the previous round it was taking mere seconds to perform some of the benchmarks.

    Performance of the best machine learning systems on various benchmarks has outpaced what would be expected if gains were solely from Moore’s Law [blue line]. Solid lines represent current benchmarks. Dashed lines represent benchmarks that have now been retired because they are no longer industrially relevant. (Credit: MLCommons)

    According to MLPerf’s calculations, AI training on the new suite of benchmarks is improving at about twice the rate one would expect from Moore’s Law. As the years have gone on, results have plateaued more quickly than they did at the start of MLPerf’s reign. Kanter attributes this mostly to the fact that companies have figured out how to do the benchmark tests on very large systems. Over time, Nvidia, Google, and others have developed software and network technology that allows for near-linear scaling: doubling the processors cuts training time roughly in half.

    First Nvidia Blackwell training results

    This round marked the first training tests for Nvidia’s next GPU architecture, called Blackwell. For the GPT-3 training and LLM fine-tuning, the Blackwell (B200) roughly doubled the performance of the H100 on a per-GPU basis. The gains were a little less robust but still substantial for recommender systems and image generation—64 percent and 62 percent, respectively.

    The Blackwell architecture, embodied in the Nvidia B200 GPU, continues an ongoing trend toward using less and less precise numbers to speed up AI. For certain parts of transformer neural networks such as ChatGPT, Llama 2, and Stable Diffusion, the Nvidia H100 and H200 use 8-bit floating point numbers. The B200 brings that down to just 4 bits.
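
To make the cost of dropping to 4 bits concrete, the sketch below enumerates every value a 4-bit float can hold and rounds a few numbers onto that grid. It assumes an E2M1 layout (1 sign bit, 2 exponent bits, 1 mantissa bit, exponent bias of 1, no infinities or NaNs). The article does not say which 4-bit format the B200 uses, so the layout and the function names `fp4_e2m1_values` and `quantize` are illustrative assumptions, not a description of Nvidia's hardware.

```python
import itertools


def fp4_e2m1_values():
    """Enumerate all values of an assumed 4-bit float: 1 sign, 2 exponent, 1 mantissa bit."""
    values = set()
    for sign, exp, man in itertools.product((0, 1), range(4), (0, 1)):
        if exp == 0:
            # Subnormal range: no implicit leading 1, so the magnitude is 0 or 0.5.
            magnitude = man * 0.5
        else:
            # Normal range: implicit leading 1, exponent bias of 1.
            magnitude = (1 + man * 0.5) * 2.0 ** (exp - 1)
        values.add(-magnitude if sign else magnitude)
    return sorted(values)


def quantize(x, grid):
    """Round x to the nearest representable value in grid."""
    return min(grid, key=lambda v: abs(v - x))


if __name__ == "__main__":
    grid = fp4_e2m1_values()
    print(len(grid), "distinct values:", grid)
    for x in (0.2, 2.4, 5.2):
        print(x, "->", quantize(x, grid))
```

Under this assumed layout there are only 15 distinct values, the largest being 6. Formats this coarse are generally paired with scaling factors so that a tensor's values land inside the representable range; that machinery is omitted here.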

    Google debuts 6th gen hardware

    Google showed the first results for its 6th generation of TPU, called Trillium—which it unveiled only last month—and a second round of results for its 5th generation variant, the Cloud TPU v5p. In the 2023 edition, the search giant entered a different variant of the 5th generation TPU, v5e, designed more for efficiency than performance. Versus the latter, Trillium delivers as much as a 3.8-fold performance boost on the GPT-3 training task.

    But versus everyone’s arch-rival Nvidia, things weren’t as rosy. A system made up of 6,144 TPU v5ps reached the GPT-3 training checkpoint in 11.77 minutes, placing a distant second to an 11,616-Nvidia H100 system, which accomplished the task in about 3.44 minutes. That top TPU system was only about 25 seconds faster than an H100 computer half its size.
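
One rough way to compare systems of different sizes is to multiply chip count by wall-clock time. The sketch below does that for the two GPT-3 results quoted above. The `chip_minutes` helper is a hypothetical name, and the metric is a simplification: it treats one TPU and one GPU as equivalent units and ignores chip cost, power, and software differences.

```python
def chip_minutes(num_chips, minutes):
    """Total accelerator time consumed: chip count multiplied by wall-clock minutes."""
    return num_chips * minutes


# Figures as reported in the article (the H100 time is given as "about 3.44 minutes").
TPU_V5P_CHIPS, TPU_V5P_MINUTES = 6_144, 11.77
H100_CHIPS, H100_MINUTES = 11_616, 3.44

tpu_total = chip_minutes(TPU_V5P_CHIPS, TPU_V5P_MINUTES)
h100_total = chip_minutes(H100_CHIPS, H100_MINUTES)

# How much faster the H100 system finished, and how many more chips it used.
speedup = TPU_V5P_MINUTES / H100_MINUTES
chip_ratio = H100_CHIPS / TPU_V5P_CHIPS

if __name__ == "__main__":
    print(f"TPU v5p system: {tpu_total:,.0f} chip-minutes")
    print(f"H100 system:    {h100_total:,.0f} chip-minutes")
    print(f"H100 system was {speedup:.2f}x faster using {chip_ratio:.2f}x the chips")
```

On these numbers the H100 system finished about 3.4 times faster while using about 1.9 times as many chips, so it also consumed fewer total chip-minutes.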


    In the closest head-to-head comparison between v5p and Trillium, with each system made up of 2,048 TPUs, the upcoming Trillium shaved a solid 2 minutes off of the GPT-3 training time, nearly an 8 percent improvement on v5p’s 29.6 minutes. Another difference between the Trillium and v5p entries is that Trillium is paired with AMD Epyc CPUs instead of the v5p’s Intel Xeons.
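
The two v5p results quoted in this section (2,048 chips at 29.6 minutes and 6,144 chips at 11.77 minutes) give a rough check on the near-linear scaling described earlier in the article. The sketch below computes the time perfect linear scaling would predict and the fraction of that ideal actually achieved. The function names are hypothetical, and the calculation assumes the two submissions are otherwise comparable, which the article does not confirm.

```python
def ideal_minutes(base_minutes, base_chips, new_chips):
    """Training time predicted by perfect linear scaling from a baseline system."""
    return base_minutes * base_chips / new_chips


def scaling_efficiency(base_minutes, base_chips, new_minutes, new_chips):
    """Fraction of ideal linear scaling achieved (1.0 means perfectly linear)."""
    return ideal_minutes(base_minutes, base_chips, new_chips) / new_minutes


# TPU v5p GPT-3 figures as reported in the article.
ideal = ideal_minutes(29.6, 2_048, 6_144)
efficiency = scaling_efficiency(29.6, 2_048, 11.77, 6_144)

if __name__ == "__main__":
    print(f"Ideal time on 6,144 chips: {ideal:.2f} minutes")
    print(f"Reported time: 11.77 minutes, efficiency: {efficiency:.0%}")
```

Tripling the chip count would ideally cut 29.6 minutes to about 9.9 minutes; the reported 11.77 minutes works out to roughly 84 percent of that ideal.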

    Google also trained the image generator, Stable Diffusion, with the Cloud TPU v5p. At 2.6 billion parameters, Stable Diffusion is a light enough lift that MLPerf contestants are asked to train it to convergence instead of just to a checkpoint, as with GPT-3. A 1,024-TPU system ranked second, finishing the job in 2 minutes 26 seconds, about a minute behind the same size system made up of Nvidia H100s.

    Training power is still opaque

    The steep energy cost of training neural networks has long been a source of concern. MLPerf is only beginning to measure this. Dell Technologies was the sole entrant in the energy category, with an eight-server system containing 64 Nvidia H100 GPUs and 16 Intel Xeon Platinum CPUs. The only measurement made was in the LLM fine-tuning task (Llama 2 70B). The system consumed 16.4 megajoules during its 5-minute run, for an average power of roughly 55 kilowatts (16.4 megajoules divided by 300 seconds). That means about 75 cents of electricity at the average cost in the United States.
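
The energy arithmetic can be checked in a few lines. The sketch below converts the reported 16.4 megajoules and 5-minute run into average power, kilowatt-hours, and cost. The electricity price is not stated in the article; the 0.165 USD per kWh used here is an assumed rate, chosen only because it reproduces the 75-cent figure, and `energy_summary` is a hypothetical helper name.

```python
JOULES_PER_KWH = 3.6e6  # 1 kWh = 1,000 W * 3,600 s


def energy_summary(energy_joules, duration_seconds, usd_per_kwh):
    """Return (average power in kW, energy in kWh, cost in USD)."""
    avg_power_kw = energy_joules / duration_seconds / 1_000
    energy_kwh = energy_joules / JOULES_PER_KWH
    cost_usd = energy_kwh * usd_per_kwh
    return avg_power_kw, energy_kwh, cost_usd


# Assumed price, not from the article: picked to match the stated 75 cents.
ASSUMED_USD_PER_KWH = 0.165

power_kw, kwh, cost = energy_summary(16.4e6, 5 * 60, ASSUMED_USD_PER_KWH)

if __name__ == "__main__":
    print(f"Average power: {power_kw:.1f} kW")
    print(f"Energy used:   {kwh:.2f} kWh")
    print(f"Cost:          ${cost:.2f}")
```

Dividing 16.4 megajoules by 300 seconds gives about 54.7 kilowatts, and 16.4 megajoules is about 4.56 kilowatt-hours, which costs about 75 cents at the assumed rate.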

    While it doesn’t say much on its own, the result does potentially provide a ballpark for the power consumption of similar systems. Oracle, for example, reported a close performance result—4 minutes 45 seconds—using the same number and types of CPUs and GPUs.
