
    Small Language Models: Edge AI Innovation From AI21

By Team_NationalNewsBrief | October 14, 2025 | Technology

    While most of the AI world is racing to build ever-bigger language models like OpenAI’s GPT-5 and Anthropic’s Claude Sonnet 4.5, the Israeli AI startup AI21 is taking a different path.

    AI21 has just unveiled Jamba Reasoning 3B, a 3-billion-parameter model. This compact, open-source model can handle massive context windows of 250,000 tokens (meaning that it can “remember” and reason over much more text than typical language models) and can run at high speed, even on consumer devices. The launch highlights a growing shift: smaller, more efficient models could shape the future of AI just as much as raw scale.

    “We believe in a more decentralized future for AI—one where not everything runs in massive data centers,” says Ori Goshen, Co-CEO of AI21, in an interview with IEEE Spectrum. “Large models will still play a role, but small, powerful models running on devices will have a significant impact” on both the future and the economics of AI, he says. Jamba is built for developers who want to create edge-AI applications and specialized systems that run efficiently on-device.

    AI21’s Jamba Reasoning 3B is designed to handle long sequences of text and challenging tasks like math, coding, and logical reasoning—all while running with impressive speed on everyday devices like laptops and mobile phones. Jamba Reasoning 3B can also work in a hybrid setup: Simple jobs are handled locally by the device, while heavier problems get sent to powerful cloud servers. According to AI21, this smarter routing could dramatically cut AI infrastructure costs for certain workloads—potentially by an order of magnitude.
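    The article does not describe how that routing decision is made, so the following is only a minimal illustrative sketch of the idea: cheap, short requests stay on the device, while long or reasoning-heavy ones are forwarded to a hosted model. The token estimate, the "heavy reasoning" keyword heuristic, and the placeholder model functions are all assumptions for illustration, not AI21's implementation.

    ```python
    from typing import Callable

    def route_request(
        prompt: str,
        run_local: Callable[[str], str],
        run_cloud: Callable[[str], str],
        max_local_tokens: int = 8_000,
    ) -> str:
        """Route short prompts to an on-device model, long/complex ones to the cloud."""
        # Rough token estimate: roughly 4 characters per token for English text.
        approx_tokens = len(prompt) / 4
        # Crude "heavy reasoning" heuristic; a real router would be far smarter.
        heavy = any(k in prompt.lower() for k in ("prove", "derive", "refactor"))
        if approx_tokens > max_local_tokens or heavy:
            return run_cloud(prompt)   # large hosted model in a data center
        return run_local(prompt)       # small on-device model (e.g., a 3B model)

    # Example usage with placeholder model functions:
    if __name__ == "__main__":
        answer = route_request(
            "Summarize this paragraph in one sentence: ...",
            run_local=lambda p: f"[local model reply, {len(p)} chars in]",
            run_cloud=lambda p: f"[cloud model reply, {len(p)} chars in]",
        )
        print(answer)
    ```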

    A Small but Mighty LLM

    With 3 billion parameters, Jamba Reasoning 3B is tiny by today’s AI standards. Models like GPT-5 or Claude run well past 100 billion parameters, and even smaller models, such as Llama 3 (8B) or Mistral (7B), are more than twice the size of AI21’s model, Goshen notes.

    That compact size makes it all the more remarkable that AI21’s model can handle a context window of 250,000 tokens on consumer devices. Some proprietary models, like GPT-5, offer even longer context windows, but Jamba sets a new high-water mark among open-source models. The previous open-model record of 128,000 tokens was held by models including Meta’s Llama 3.2 (3B), Microsoft’s Phi-4 Mini, and DeepSeek R1, some of which are much larger models. Jamba Reasoning 3B can process more than 17 tokens per second even when working at full capacity—that is, with extremely long inputs that use its full 250,000-token context window. Many other models slow down or struggle once their input length exceeds 100,000 tokens.

    Goshen explains that the model is built on an architecture called Jamba, which combines two types of neural network designs: transformer layers, familiar from other large language models, and Mamba layers, which are designed to be more memory-efficient. This hybrid design enables the model to handle long documents, large codebases, and other extensive inputs directly on a laptop or phone—using about one-tenth the memory of traditional transformers. Goshen says the model runs much faster than traditional transformers because it relies less on a memory component called the KV cache, which can slow down processing as inputs get longer.
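    A quick back-of-envelope calculation shows why the KV cache matters at long context: for transformer layers it grows linearly with the number of input tokens, while Mamba layers keep a fixed-size state. The layer counts and dimensions below are illustrative assumptions, not Jamba Reasoning 3B's published configuration; they are only meant to show the order of magnitude involved.

    ```python
    # Back-of-envelope sketch: KV-cache memory for a 250,000-token input.
    # Configurations below are hypothetical, chosen to illustrate the scaling.

    def kv_cache_bytes(context_tokens: int, attention_layers: int,
                       kv_heads: int, head_dim: int, bytes_per_value: int = 2) -> int:
        """KV-cache size: keys and values (factor of 2) per attention layer, per token."""
        return 2 * attention_layers * context_tokens * kv_heads * head_dim * bytes_per_value

    context = 250_000  # tokens

    # Pure-transformer 3B-class model (hypothetical): ~28 attention layers.
    full_transformer = kv_cache_bytes(context, attention_layers=28, kv_heads=8, head_dim=128)

    # Hybrid Mamba/transformer model (hypothetical): only a handful of attention
    # layers keep a KV cache; the Mamba layers carry a fixed-size state instead.
    hybrid = kv_cache_bytes(context, attention_layers=4, kv_heads=8, head_dim=128)

    print(f"full transformer KV cache: {full_transformer / 1e9:.1f} GB")  # ~28.7 GB
    print(f"hybrid KV cache:           {hybrid / 1e9:.1f} GB")            # ~4.1 GB
    ```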

    Why Small LLMs Are Needed

    The model’s hybrid architecture gives it an advantage in both speed and memory efficiency, even with very long inputs, confirms a software engineer who works in the LLM industry. The engineer requested anonymity because they’re not authorized to comment on other companies’ models. As more users run generative AI locally on laptops, models need to handle long context lengths quickly without consuming too much memory. At 3 billion parameters, Jamba meets these requirements, says the engineer, making it a model that’s optimized for on-device use.

    Jamba Reasoning 3B is open source under the permissive Apache 2.0 license and available on popular platforms such as Hugging Face and LM Studio. The release also comes with instructions for fine-tuning the model through an open-source reinforcement-learning platform (called VERL), making it easier and more affordable for developers to adapt the model for their own tasks.
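    Because the weights are published under Apache 2.0 on Hugging Face, the model can be pulled down and run locally with the standard transformers workflow. The sketch below assumes the repository id shown; check AI21 Labs' Hugging Face organization page for the exact model card and any model-specific loading options.

    ```python
    # Minimal sketch of running the model locally with Hugging Face transformers.
    # The repository id below is an assumption; verify it on AI21's Hugging Face page.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "ai21labs/AI21-Jamba-Reasoning-3B"  # assumed repo id

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

    prompt = "Explain the difference between a transformer layer and a Mamba layer."
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    outputs = model.generate(**inputs, max_new_tokens=200)
    print(tokenizer.decode(outputs[0], skip_special_tokens=True))
    ```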

    “Jamba Reasoning 3B marks the beginning of a family of small, efficient reasoning models,” Goshen said. “Scaling down enables decentralization, personalization, and cost efficiency. Instead of relying on expensive GPUs in data centers, individuals and enterprises can run their own models on devices. That unlocks new economics and broader accessibility.”
