Hacker News discussion on IBM Granite 4.0 hybrid models, tooling support, and early benchmarks

Developers on Hacker News dissect IBM’s Granite 4.0 large language models, focusing on the new hybrid Mamba and Transformer architecture, local-run options, and mixed early performance signals. The thread highlights rapid community tooling support alongside questions about real-world benchmarks and governance.

A Hacker News thread centers on IBM's Granite 4.0 large language models and their hybrid Mamba/Transformer design, with commenters trading hands-on notes, links to official resources, and early impressions. Several participants pointed readers to IBM's own announcement and a company explainer on the Mamba architecture, noting that community write-ups can be more informative than high-level coverage. One commenter also highlighted IBM's reference to ISO/IEC 42001 certification for artificial intelligence management systems and asked what concrete practices that certification implies for product design and deployment.

Tooling and local inference support featured prominently. Contributors reported that support for Granite 4.0's hybrid architecture landed in llama.cpp earlier this year, and that Ollama's engine uses GGML directly, falling back to llama.cpp for models it does not yet support. Unsloth released dynamic GGUF conversions of a 32-billion-parameter mixture-of-experts variant and shared a support-agent fine-tuning notebook. Users tested local setups across LM Studio, the Vulkan and ROCm backends, and different quantizations, with one noting that switching to ROCm resolved a GPU loading issue. Another user tried an Ollama package that ran quickly and weighed in at roughly 1.9 GB, though reportedly without the Mamba components and with default context limits lower than the claimed maximum.

Early performance anecdotes were mixed. One practitioner ran the small 32-billion-parameter mixture-of-experts variant as a quantized build of around 19 GB, growing to roughly 20 GB at a 100,000-token context, and observed about 26 GB of VRAM use in one runtime versus 22 GB in another, at around 30 tokens per second. The same commenter judged coding ability underwhelming in initial tests, later citing third-party dashboards showing approximately 25.1 percent on LiveCodeBench, 2 percent on a terminal-use benchmark, and 16 percent on a coding index for one Granite 4.0 variant. Elsewhere in the thread, developers asked for head-to-head comparisons with leading closed models and noted that third-party benchmarks to date look less favorable than vendor materials.
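The reported file sizes are consistent with simple quantization arithmetic. A back-of-the-envelope sketch, assuming roughly 4.8 effective bits per weight for a Q4_K-style GGUF quantization (the exact bit rate varies by quant type and layer, and real files add metadata and mixed-precision tensors, so this is an estimate, not a formula from the thread):

```python
# Rough on-disk size estimate for a quantized model checkpoint.
# Assumption (not from the thread): ~4.8 effective bits per weight,
# typical of Q4_K-style GGUF quants averaged across layers.

def quantized_size_gb(n_params: float, bits_per_weight: float = 4.8) -> float:
    """Approximate checkpoint size in decimal gigabytes."""
    return n_params * bits_per_weight / 8 / 1e9

# 32-billion-parameter mixture-of-experts variant from the thread:
est = quantized_size_gb(32e9)
print(f"~{est:.1f} GB")  # close to the ~19 GB build commenters reported
```

The remaining gap to the observed 26 GB or 22 GB of VRAM comes from the context-dependent state (KV cache and Mamba recurrent state) plus runtime overhead, which differs between inference engines.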

The conversation also placed Granite 4.0 in the broader context of long-context and hybrid architectures. Commenters referenced other systems, including a model with a 256,000-token context and a newly released model that slows markedly beyond 40,000 tokens. Some expressed caution rooted in past experiences with IBM's artificial intelligence offerings and skepticism about marketing claims, while others praised the pace of open tooling and the ability to run models locally. Overall, the thread captures an active, early-stage evaluation: strong momentum in ecosystem support, clear enterprise and governance positioning, and a wait-and-see posture on independent benchmarks and real-world task performance.
