ITEL’s VibeStudio achieves a big Artificial Intelligence breakthrough: world-best LLM performance on just one GPU

VibeStudio, incubated by Immersive Technology and Entrepreneurship Labs (ITEL), reports a 55% pruning of the MiniMax M2 model using THRIFT, delivering near state-of-the-art LLM reasoning and coding on far smaller hardware. The team says the pruned model is available as open source on HuggingFace, and the company maintains two private models for enterprise use.

Chennai, 26th November 2025. VibeStudio, an agentic coding suite incubated by Immersive Technology and Entrepreneurship Labs (ITEL), announced a targeted compression of the open-source MiniMax M2 model that the company says achieves world-best LLM performance on a single GPU. The work was led by a small Indian team under Arjun Reddy and supported by ITEL chair Prof. Ashok Jhunjhunwala. The project focused on reducing the GPU, memory, and energy costs of deploying Large Language Models for real coding and full-repo reasoning in colleges and enterprises; the team cites the cost and scale of hardware such as H200 GPUs as a barrier.

VibeStudio describes the new method as THRIFT: Targeted Hierarchical Reduction for Inference and Fine-Tuning. According to the announcement, THRIFT audits the model layer by layer to identify redundant experts, silent activation routes, and dead parameters, and then applies calibrated, staged pruning with teacher-guided fine-tuning after each stage. The reported outcome is a 55% size reduction of MiniMax M2 that retains 80% of the original model’s reasoning strength and coding precision and, in many cases, responds faster. VibeStudio says it has released the pruned M2 on HuggingFace as open source and that the release has passed 150,000 downloads to date.
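The staged pruning loop described above can be sketched roughly as follows. This is a minimal illustration, not VibeStudio's implementation: the announcement does not publish THRIFT's scoring rule, stage count, or fine-tuning procedure. The function name `plan_staged_pruning`, the use of router selection counts as the redundancy signal, and the even per-stage schedule are all assumptions made for the example, and the teacher-guided fine-tuning step appears only as a placeholder comment.

```python
from typing import Dict, List, Tuple

ExpertId = Tuple[int, int]  # (layer index, expert index)


def plan_staged_pruning(
    usage: Dict[ExpertId, int],
    target_fraction: float,
    num_stages: int,
) -> List[List[ExpertId]]:
    """Split the least-used experts into pruning stages.

    `usage` maps each expert to how often the router selected it on a
    calibration set. This is a hypothetical redundancy signal, not the
    one THRIFT is reported to use.
    """
    if not 0.0 <= target_fraction < 1.0:
        raise ValueError("target_fraction must be in [0, 1)")
    if num_stages < 1:
        raise ValueError("num_stages must be at least 1")

    # Number of experts to remove overall, rounded down.
    total_to_prune = int(len(usage) * target_fraction)

    # Least-used experts first; ties are broken by id for determinism.
    ranked = sorted(usage, key=lambda e: (usage[e], e))
    to_prune = ranked[:total_to_prune]

    stages: List[List[ExpertId]] = []
    start = 0
    for stage in range(num_stages):
        # Spread the removals as evenly as possible across stages.
        end = total_to_prune * (stage + 1) // num_stages
        stages.append(to_prune[start:end])
        start = end
        # Assumed: a teacher-guided fine-tuning pass would run here,
        # before the next stage removes more experts.
    return stages


# Toy example: 2 layers with 4 experts each, pruned by half in 2 stages.
usage = {
    (0, 0): 900, (0, 1): 0, (0, 2): 15, (0, 3): 400,
    (1, 0): 3, (1, 1): 700, (1, 2): 0, (1, 3): 120,
}
for i, stage in enumerate(plan_staged_pruning(usage, 0.5, 2), start=1):
    print(f"stage {i}: prune {stage}")
```

Note that this toy version counts experts rather than parameters, so a given fraction of experts removed is not the same as that fraction of model size, and nothing stops it from emptying a single layer. A real pipeline would need per-layer limits and an accuracy check after each stage.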

Beyond the open release, VibeStudio retains two private foundational models for enterprise deployments: an 8B Dense Model optimised for quantised local use on mainstream hardware and a 32B A3B MoE Model built for secure, high-speed, on-premises reasoning. Those models remain closed and are exclusive to enterprise partners. VibeStudio positions its Agentic IDE and THRIFT-compressed models as a path to deliver powerful, affordable Artificial Intelligence-enabled coding tools to a wide range of users, from large companies to first-year engineering students on budget laptops, emphasizing engineering efficiency over continually scaling model size.
