National News Brief
    Analog AI Startup Aims to Lower the Power of Gen AI

By Team_NationalNewsBrief | November 23, 2024 | Technology | 5 min read


    Machine learning chips that use analog circuits instead of digital ones have long promised huge energy savings. But in practice they’ve mostly delivered modest savings, and only for modest-sized neural networks. Silicon Valley startup Sageance says it has the technology to bring the promised power savings to tasks suited for massive generative AI models. The startup claims that its systems will be able to run the large language model Llama 2-70B at one-tenth the power of an Nvidia H100 GPU-based system, at one-twentieth the cost and in one-twentieth the space.

    “My vision was to create a technology that was very differentiated from what was being done for AI,” says Sageance CEO and founder Vishal Sarin. Even back when the company was founded in 2018, he “realized power consumption would be a key impediment to the mass adoption of AI…. The problem has become many, many orders of magnitude worse as generative AI has caused the models to balloon in size.”

Analog AI's power-saving prowess comes from two fundamental advantages: it doesn't have to move data around, and it uses some basic physics to do machine learning's most important math.

That math operation is multiplying a vector of inputs by a vector of weights and then adding up the results, called multiply and accumulate. Early on, engineers realized that two foundational rules of electrical engineering do the same thing, more or less instantly. Ohm's Law—voltage multiplied by conductance equals current—does the multiplication if you use the neural network's "weight" parameters as the conductances. Kirchhoff's Current Law—the sum of the currents entering and exiting a point is zero—means you can easily add up all those multiplications just by connecting them to the same wire. And finally, in analog AI, the neural network parameters don't need to be moved from memory to the computing circuits—usually a bigger energy cost than computing itself—because they are already embedded within the computing circuits.
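The physics above maps cleanly onto code. The following is an illustrative sketch of that correspondence, not Sageance's implementation: each flash cell's current follows Ohm's law, and the shared output wire sums the currents per Kirchhoff's current law.

```python
def analog_mac(voltages, conductances):
    """Dot product as a bank of flash cells feeding one output wire.

    voltages     -- input activations encoded as voltages (volts)
    conductances -- neural-network weights programmed as conductances (siemens)
    """
    # Ohm's law in each cell: current = voltage * conductance
    currents = [v * g for v, g in zip(voltages, conductances)]
    # Kirchhoff's current law on the shared wire: currents simply add
    return sum(currents)

# A toy 3-input "neuron": the multiply and accumulate happens all at once
print(round(analog_mac([0.1, 0.2, 0.3], [2.0, 1.0, 0.5]), 4))  # 0.55
```

In real hardware all the "multiplications" settle simultaneously as currents, which is why the operation is more or less instant.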

Sageance uses flash memory cells as the conductance values. The kind of flash cell typically used in data storage is a single transistor that can hold 3 or 4 bits, but Sageance has developed algorithms that let cells embedded in its chips hold 8 bits, which is the key level of precision for LLMs and other so-called transformer models. Storing an 8-bit number in a single transistor instead of the 48 transistors it would take in a typical digital memory cell is an important cost, area, and energy savings, says Sarin, who has been working on storing multiple bits in flash for 30 years.
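To make the 8-bit figure concrete, here is a generic sketch of 8-bit weight quantization — the precision level the article says Sageance's cells hold. This is textbook uniform quantization for illustration, not Sageance's proprietary algorithm: a floating-point weight is mapped to one of 256 levels (which the analog chip would program as a conductance) and mapped back.

```python
def quantize_8bit(w, w_min, w_max):
    """Map a weight w in [w_min, w_max] to an integer level 0..255."""
    step = (w_max - w_min) / 255
    return round((w - w_min) / step)

def dequantize_8bit(level, w_min, w_max):
    """Recover the approximate weight a stored level represents."""
    step = (w_max - w_min) / 255
    return w_min + level * step

level = quantize_8bit(0.30, -1.0, 1.0)
recovered = dequantize_8bit(level, -1.0, 1.0)
print(level, round(recovered, 3))  # one of 256 levels, weight recovered to ~0.3
```

The point of the comparison in the text: this one level lives in a single flash transistor, where a digital SRAM cell array would spend roughly six transistors per bit, or about 48 for the byte.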

Digital data is converted to analog voltages [left]. These are effectively multiplied by flash memory cells [blue], summed, and converted back to digital data [bottom]. Credit: Analog Inference

    Adding to the power savings is that the flash cells are operated in a state called “deep subthreshold.” That is, they are working in a state where they are barely on at all, producing very little current. That wouldn’t do in a digital circuit, because it would slow computation to a crawl. But because the analog computation is done all at once, it doesn’t hinder the speed.

    Analog AI Issues

    If all this sounds vaguely familiar, it should. Back in 2018 a trio of startups went after a version of flash-based analog AI. Syntiant eventually abandoned the analog approach for a digital scheme that’s put six chips in mass production so far. Mythic struggled but stuck with it, as has Anaflash. Others, particularly IBM Research, have developed chips that rely on nonvolatile memories other than flash, such as phase-change memory or resistive RAM.

Generally, analog AI has struggled to meet its potential, particularly when scaled up to a size that might be useful in datacenters. Chief among its difficulties is the natural variation in conductance from cell to cell: the same number stored in two different cells may yield two different conductances. Worse still, these conductances can drift over time and shift with temperature. This noise drowns out the signal representing the result, and it can be compounded stage after stage through the many layers of a deep neural network.

    Sageance’s solution, Sarin explains, is a set of reference cells on the chip and a proprietary algorithm that uses them to calibrate the other cells and track temperature-related changes.
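The article gives no detail on the proprietary algorithm, so the following is only a hypothetical sketch of the general idea it describes: cells programmed to known reference conductances let the chip estimate the current temperature-dependent drift as a common gain factor and divide it out of every other reading.

```python
def estimate_gain(ref_true, ref_measured):
    """Least-squares estimate of a common gain from the reference cells."""
    num = sum(t * m for t, m in zip(ref_true, ref_measured))
    den = sum(t * t for t in ref_true)
    return num / den

# Reference cells were programmed to these conductances (hypothetical values)...
ref_true = [1.0, 2.0, 4.0]
# ...but at the current temperature every cell reads 3% high:
ref_measured = [g * 1.03 for g in ref_true]

gain = estimate_gain(ref_true, ref_measured)
corrected = [m / gain for m in ref_measured]  # same correction applies chip-wide
print(round(gain, 6))  # 1.03
```

Because temperature shifts affect neighboring cells similarly, a handful of references can calibrate a whole array — the premise behind tracking "temperature-related changes" on-chip.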

Another source of frustration for those developing analog AI has been the need to digitize the result of the multiply-and-accumulate process in order to deliver it to the next layer of the neural network, where it must then be turned back into an analog voltage signal. Each of those steps requires analog-to-digital and digital-to-analog converters, which take up area on the chip and soak up power.

    According to Sarin, Sageance has developed low-power versions of both circuits. The power demands of the digital-to-analog converter are helped by the fact that the circuit needs to deliver a very narrow range of voltages in order to operate the flash memory in deep subthreshold mode.
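The round trip at each layer boundary can be sketched as follows. The numbers are invented for illustration: a narrow full-scale voltage stands in for the small range needed to keep the flash cells in deep subthreshold, which is exactly what makes the digital-to-analog converter's job easier.

```python
FULL_SCALE_V = 0.1   # narrow voltage range suited to deep-subthreshold operation
BITS = 8
LEVELS = 2 ** BITS - 1

def dac(code):
    """Digital-to-analog: 8-bit code -> voltage in [0, FULL_SCALE_V]."""
    return code / LEVELS * FULL_SCALE_V

def adc(voltage):
    """Analog-to-digital: voltage -> nearest 8-bit code."""
    return round(voltage / FULL_SCALE_V * LEVELS)

# One layer boundary: activation leaves as analog, arrives digital, leaves analog
code = 200
print(adc(dac(code)))  # round trip preserves the 8-bit value: 200
```

Every layer of a deep network pays for this pair of conversions, which is why low-power converter design matters as much as the analog array itself.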

    Systems and What’s Next

    Sageance’s first product, to launch in 2025, will be geared toward vision systems, which are a considerably lighter lift than server-based LLMs. “That is a leapfrog product for us, to be followed very quickly [by] generative AI,” says Sarin.

Future systems from Sageance will be made up of 3D-stacked analog chips linked to a processor and memory through an interposer that follows the Universal Chiplet Interconnect Express (UCIe) standard. Credit: Analog Inference

    The generative AI product would be scaled up from the vision chip mainly by vertically stacking analog AI chiplets atop a communications die. These stacks would be linked to a CPU die and to high-bandwidth memory DRAM in a single package called Delphi.

In simulations, a system made up of Delphis would run Llama 2-70B at 666,000 tokens per second while consuming 59 kilowatts, versus 624 kilowatts for an Nvidia H100-based system, Sageance claims.
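The claimed figures can be turned into energy per token, assuming (as the comparison implies) the H100-based system delivers the same throughput:

```python
# Figures straight from the article; the shared-throughput assumption is ours.
tokens_per_s = 666_000
delphi_kw = 59
h100_kw = 624

delphi_j_per_token = delphi_kw * 1000 / tokens_per_s
h100_j_per_token = h100_kw * 1000 / tokens_per_s

print(round(delphi_j_per_token, 3))                     # ~0.089 J per token
print(round(h100_j_per_token / delphi_j_per_token, 1))  # ~10.6x less energy
```

The roughly 10.6x ratio is consistent with the "one-tenth the power" claim made at the top of the article.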
