Nvidia has struck what it describes as its largest transaction ever, agreeing to a $20 billion licensing deal with artificial intelligence chip startup Groq that brings founder Jonathan Ross, president Sunny Madra, and Groq’s core IP into Nvidia, while GroqCloud continues to operate independently under new CEO Simon Edwards. Groq has publicly framed the move as a “non-exclusive licensing agreement,” but an internal email from Nvidia CEO Jensen Huang describes plans to integrate Groq’s low-latency processors into Nvidia’s artificial intelligence factory architecture to support a broader range of inference and real-time workloads. The deal is nearly three times the size of Nvidia’s $7 billion Mellanox acquisition in 2019 and is framed as a response to inference workloads splitting into distinct prefill and decode phases, each of which benefits from different hardware characteristics.
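The prefill/decode split the article refers to is visible even in a toy generation loop. Here is a minimal sketch using Hugging Face transformers; the gpt2 model and greedy decoding are illustrative choices, not details from the article:

```python
# Minimal sketch of the two inference phases; model choice is illustrative.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2").eval()

prompt_ids = tok("Inference has two phases:", return_tensors="pt").input_ids

with torch.no_grad():
    # Prefill: one parallel pass over the entire prompt. Arithmetic-heavy,
    # which favors high-FLOP hardware such as GPUs.
    out = model(prompt_ids, use_cache=True)
    past = out.past_key_values
    next_id = out.logits[:, -1].argmax(dim=-1, keepdim=True)

    # Decode: one token per step, with every step re-reading the model
    # weights. Throughput here is bound by memory bandwidth, which is
    # where SRAM-resident designs like Groq's LPU claim their advantage.
    generated = [next_id]
    for _ in range(16):
        out = model(next_id, past_key_values=past, use_cache=True)
        past = out.past_key_values
        next_id = out.logits[:, -1].argmax(dim=-1, keepdim=True)
        generated.append(next_id)

print(tok.decode(torch.cat(generated, dim=1)[0]))
```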
The article explains that Groq’s LPUs keep model weights in fast on-chip SRAM rather than fetching them from off-chip HBM, which sidesteps the memory-bandwidth bottleneck and accelerates the decode phase of inference, albeit with cost and capacity tradeoffs. It notes that Cerebras hits 2,600 tokens per second on Llama 4 Scout, while the fastest GPU-based solutions reach 137 tokens per second, and cites Bank of America analyst Vivek Arya, who argues the deal implies Nvidia recognizes that while GPUs dominated artificial intelligence training, the rapid shift to inference may require more specialized chips, potentially combining GPUs and LPUs within the same rack via NVLink. The roadmap described in the piece imagines three Rubin variants optimized for different workloads, including a Groq-derived “Rubin SRAM” for ultra-low-latency agentic reasoning. It suggests this trajectory will pressure most ASIC efforts except Google’s TPUs, Apple’s artificial intelligence chips, and AWS Trainium, with Intel’s reported $1.6 billion SambaNova acquisition and Meta’s purchase of Rivos reinforcing a consolidation trend that leaves Cerebras as the last independent SRAM-focused inference vendor.
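To make the bandwidth argument concrete, a rough back-of-envelope sketch follows; it assumes every weight is read once per generated token, and all hardware figures are illustrative assumptions rather than numbers from the article:

```python
# Upper bound on single-stream decode throughput, assuming each generated
# token requires reading all model weights once. Figures are illustrative.
def max_tokens_per_sec(params_billion: float, bytes_per_param: float,
                       bandwidth_gb_s: float) -> float:
    """Ceiling on decode tokens/sec for one stream at a given bandwidth."""
    bytes_per_token = params_billion * 1e9 * bytes_per_param
    return bandwidth_gb_s * 1e9 / bytes_per_token

# HBM-class GPU: ~3,400 GB/s, roughly an H100's HBM3 bandwidth.
print(max_tokens_per_sec(70, 2, 3_400))   # ~24 tok/s for a 70B fp16 model

# On-chip SRAM: aggregate bandwidth in the tens of TB/s, which is why
# keeping weights in SRAM lifts the decode ceiling by an order of magnitude.
print(max_tokens_per_sec(70, 2, 80_000))  # ~571 tok/s
```

This is why the same GPU that excels at compute-bound prefill and training can look slow on decode: the ceiling scales with memory bandwidth, not FLOPs.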
On the software and services side, the author argues that Google’s distribution advantages are intensifying competition with OpenAI, pointing to Apple’s list of top free iPhone apps for 2025, where Google claimed five of the top 10 positions: Google Search (#3), YouTube (#7), Google Maps (#8), Gmail (#9), and Gemini (#10). OpenAI’s ChatGPT appears only once, while Meta holds three slots. The piece contends that despite a head start, OpenAI is now under pressure on all fronts and is distracted by enterprise pursuits where Anthropic is described as already winning, while Google benefits from defaults such as Gemini shipping on every Android device, native integration with iOS, and direct embedding in Chrome, with its 3.5 billion users, in contrast to ChatGPT’s reliance on users seeking out an app or website. New survey data from Menlo Ventures is cited to show open source artificial intelligence models losing share faster than expected: production usage fell from 19% to 13% over the past six months, 51% of enterprises and 77% of startups do not use open source models at all, and only 11% of enterprises and 4% of startups run open source for more than half their artificial intelligence workloads. LMArena Elo scores are described as showing closed-source systems consistently outperforming open source ones since March 2023.
The article extends this stagnation narrative to the tooling ecosystem, recalling that OpenAI’s $3 billion acquisition offer for the coding assistant Windsurf highlighted the value of artificial intelligence developer tools, and citing examples such as GPT Engineer evolving into the paid product Lovable and Continue’s “continuedev” project gaining traction before larger vendors moved in. The pattern described is one in which open source projects validate a market, only for commercial platforms to out-invest and out-distribute them, leading to acquisitions or the erosion of the original communities. The newly formed Agentic AI Foundation under the Linux Foundation, whose members include Anthropic, OpenAI, Google, Microsoft, and AWS, is presented as a potential accelerant of this process: it aims to define open standards for artificial intelligence agents, but it also ensures the largest cloud and model providers have significant influence over how “open” is defined for the next wave of agentic systems.
