PrismML launches 1-bit large language model family

PrismML has emerged from stealth with a $16.25 million seed round and an open source release of its 1-bit Bonsai large language models. The startup says the models sharply cut memory use and energy consumption while aiming to preserve performance on standard benchmarks.

Founded by Caltech researchers, PrismML is targeting one of the central pressures in Artificial Intelligence infrastructure: rising memory constraints and energy costs. Its pitch is to compress the model itself rather than only optimizing the inference stack around it.

The flagship of the Bonsai family is Bonsai 8B, an 8-billion-parameter model trained on Google TPU v4 hardware. According to PrismML, the model achieves competitive performance on benchmark suites including MMLU Redux, MuSR, GSM8K, HumanEval+, IFEval, and BFClv3, but with a memory footprint of roughly 1GB, compared to about 16GB for a typical 16-bit equivalent. PrismML is also releasing 1-bit Bonsai 4B and 1.7B models, with memory footprints of 0.5GB and 0.24GB, respectively. The company says the models are fully binarized end to end, with all weights constrained to a single bit across embeddings, attention layers, and MLP blocks, without higher-precision exceptions.
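The footprint figures follow directly from the parameter counts: storing each weight in 1 bit instead of 16 bits shrinks weight storage by a factor of 16. A minimal sketch of that arithmetic (the helper function is illustrative, not part of PrismML's release):

```python
def footprint_gb(num_params: float, bits_per_weight: int) -> float:
    """Approximate weight-storage footprint in gigabytes (decimal GB)."""
    return num_params * bits_per_weight / 8 / 1e9

# Bonsai 8B: 8 billion parameters at 1 bit vs. a 16-bit equivalent
print(footprint_gb(8e9, 1))   # 1.0 GB
print(footprint_gb(8e9, 16))  # 16.0 GB
```

This covers raw weight storage only; activations, the KV cache, and any runtime overhead would add to the figure in practice.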

PrismML attributes the results to a new mathematical framework developed at Caltech, although it has not disclosed the training methods or stabilization techniques behind the approach. CEO Babak Hassibi described the work as a new paradigm for Artificial Intelligence designed to adapt across diverse hardware environments. The company claims its 1-bit models can deliver up to eight times faster processing and reduce energy consumption by as much as 75 to 80% on existing hardware. PrismML also argues that future hardware optimized for 1-bit operations could improve efficiency further by replacing more complex multiplications with simpler arithmetic.
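The hardware argument rests on a general property of binarized networks, not on any disclosed PrismML technique: when every weight is constrained to +1 or −1, each multiply in a matrix product collapses into an addition or subtraction of the input. A generic NumPy sketch of that idea (the function name and shapes are illustrative assumptions):

```python
import numpy as np

def binarized_matvec(W_sign: np.ndarray, x: np.ndarray) -> np.ndarray:
    """Matrix-vector product where weights are constrained to {-1, +1}.

    Because each weight is just a sign, the per-element multiply reduces
    to keeping or negating the input, followed by a plain sum.
    """
    # Keep x where the weight is +1, negate it where the weight is -1,
    # then accumulate along each row -- no multiplications by weights.
    return np.where(W_sign > 0, x, -x).sum(axis=1)

rng = np.random.default_rng(0)
W = np.sign(rng.standard_normal((4, 8)))  # hypothetical 1-bit weight matrix
x = rng.standard_normal(8)

# Matches the ordinary dense product for sign-valued weights.
assert np.allclose(binarized_matvec(W, x), W @ x)
```

Hardware built around this pattern could replace multiply-accumulate units with cheaper add/subtract logic, which is the efficiency gain PrismML is pointing to.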

The company and its investors frame the technology as a way to move advanced Artificial Intelligence beyond centralized data centers and onto consumer and edge devices. PrismML says the models are designed to run on smartphones, wearables, and robotics, potentially enabling more capable local deployments without depending on cloud infrastructure. In a blog post, the company also introduced “intelligence density,” a metric intended to measure how much capability a model delivers per unit of size.
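PrismML has not published a formula for "intelligence density," so any concrete version is an assumption; the simplest reading is a ratio of benchmark capability to model size, sketched here with hypothetical names and numbers:

```python
def intelligence_density(benchmark_score: float, size_gb: float) -> float:
    """Hypothetical capability-per-size ratio; PrismML has not disclosed
    its actual definition, so this formula is an assumption."""
    return benchmark_score / size_gb

# Illustrative only: a model scoring 60 on some benchmark at 1GB would
# have four times the "density" of one scoring 60 at 4GB.
print(intelligence_density(60.0, 1.0))  # 60.0
print(intelligence_density(60.0, 4.0))  # 15.0
```

Under any such definition, shrinking the footprint 16x while holding benchmark scores roughly flat would raise the metric by roughly the same factor, which is presumably the comparison the company wants to highlight.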

Key questions remain unresolved. PrismML’s claim that a fully 1-bit model can match higher-precision systems has not been validated beyond the company’s own benchmark results, and extreme quantization has historically struggled with complex reasoning tasks. Independent third-party testing and real-world deployments will determine whether PrismML’s approach is a genuine breakthrough or a narrower efficiency optimization. Even so, the launch underscores the industry’s broader shift toward efficiency-focused Artificial Intelligence design as model scaling becomes more expensive.

Impact Score: 58

Google launches Gemma 4 open model family

Google has introduced Gemma 4, a new family of open-weight Artificial Intelligence models focused on advanced reasoning and multimodal capabilities. The release expands the Gemma line with broader deployment options, stronger performance claims and a more permissive open source license.

FDA shifts its breakthrough standard for clinical Artificial Intelligence

The Food and Drug Administration appears to be raising the bar for what qualifies as a breakthrough clinical Artificial Intelligence device. Priority is increasingly going to systems that address broad, complex medical problems rather than tools that simply improve physicians’ existing capabilities.

Mercor links cyberattack to LiteLLM compromise

Mercor said a cyberattack was tied to the compromise of LiteLLM, prompting wider discussion about supply chain risk and the limits of compliance programs. The incident also led LiteLLM to change its compliance processes and move from Delve to Vanta for compliance certifications.

Rowhammer attack targets NVIDIA GPUs with GDDR6

New research shows Rowhammer exploits can target NVIDIA GPUs using GDDR6 memory and extend beyond the graphics subsystem into host CPU memory. The attacks can corrupt GPU page tables and lead to full system compromise.
