March 2026 brings a wave of Artificial Intelligence model launches

A concentrated burst of model releases in March 2026 highlighted rapid gains in language, video, coding, and on-device systems. Open-weight and open-source entrants narrowed the gap with proprietary platforms across several important benchmarks.

The first weeks of March 2026 brought a dense cluster of major Artificial Intelligence releases across the US, China, and Europe, spanning language models, video generation, spatial reasoning, GPU tooling, and diffusion acceleration. The lineup included GPT-5.4, LTX 2.3, FireRed Edit 1.1, Kiwi Edit, HY WU, Qwen 3.5 Small Series, CUDA Agent, CubeComposer, Helios, Spatial T2I, Spectrum, Utonia, and more, with NVIDIA adding Nemotron 3 Super on March 11. The central shift was not just volume. Open-weight and open-source systems moved much closer to proprietary frontier models on several headline benchmarks, while also improving cost efficiency and deployability.

OpenAI introduced GPT-5.4 on March 5, 2026 in Standard, Thinking, and Pro variants, with context windows up to 1.05 million tokens. On factual accuracy, GPT-5.4 reduces individual claim errors by 33% and full-response errors by 18% compared to GPT-5.2. It scored 83% on OpenAI’s GDPval benchmark for knowledge work. For coding specifically, it hits 57.7% on SWE-Bench Pro, just above GPT-5.3-Codex’s 56.8%, with lower latency. Standard-context pricing is $2.50 per 1M input tokens and $15.00 per 1M output tokens, with a 2x surcharge beyond 272K tokens. A notable addition is Tool Search, which dynamically retrieves tool definitions instead of loading them all into the prompt, reducing cost and latency for complex agentic systems.
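The tiered pricing is easy to model. A minimal sketch, assuming the 2x surcharge applies only to tokens beyond the 272K threshold and to input and output alike (the announcement does not spell out the exact billing rule):

```python
# Rough cost estimator for the GPT-5.4 pricing quoted above.
# Assumption (not specified in the announcement): the 2x surcharge
# applies only to tokens beyond the 272K threshold, for both
# input and output tokens.

INPUT_RATE = 2.50 / 1_000_000    # $ per input token, standard context
OUTPUT_RATE = 15.00 / 1_000_000  # $ per output token, standard context
THRESHOLD = 272_000              # tokens billed at the standard rate
SURCHARGE = 2.0                  # multiplier beyond the threshold

def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate request cost in dollars under the assumed tiering."""
    def tiered(tokens: int, rate: float) -> float:
        base = min(tokens, THRESHOLD) * rate
        extra = max(tokens - THRESHOLD, 0) * rate * SURCHARGE
        return base + extra
    return tiered(input_tokens, INPUT_RATE) + tiered(output_tokens, OUTPUT_RATE)

# A 500K-token prompt with a 10K-token response:
# 272K input tokens at $2.50/1M, 228K at the doubled $5.00/1M,
# plus 10K output tokens at $15.00/1M.
print(f"${estimate_cost(500_000, 10_000):.2f}")
```

Under these assumptions, long-context requests are dominated by the surcharged slice: in the example above, the 228K tokens past the threshold cost more than the first 272K.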

Alibaba’s Qwen 3.5 Small family arrived on March 1, 2026 with 0.8B, 2B, 4B, and 9B parameter models, all natively multimodal and licensed under Apache 2.0. The 9B model led the release narrative. On GPQA Diamond, it scores 81.7 versus GPT-OSS-120B’s 71.5. On HMMT Feb 2025, it hits 83.2 versus GPT-OSS-120B’s 76.7. On MMLU-Pro, it reaches 82.5 versus 80.8. On Video-MME with subtitles, the 9B scores 84.5, significantly ahead of Gemini 2.5 Flash-Lite at 74.6. The architecture combines Gated Delta Networks with sparse Mixture-of-Experts, enabling a 262K native context window, extensible to 1M via YaRN. The 2B model runs on an iPhone in airplane mode with 4 GB of RAM. Qwen 3.5 via API costs approximately $0.10 per 1M input tokens, versus Claude Opus 4.6 at roughly 13x that price.
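That pricing gap compounds quickly at volume. A quick sketch using the article’s figures ($0.10 per 1M input tokens for Qwen 3.5, and roughly 13x that for Claude Opus 4.6; the 13x multiplier is the article’s rough comparison, not an official price):

```python
# Monthly input-token cost comparison at the rates cited above.
# The 13x multiplier for Claude Opus 4.6 is the article's rough figure.

QWEN_RATE = 0.10             # $ per 1M input tokens
OPUS_RATE = 13 * QWEN_RATE   # ~$1.30 per 1M input tokens

def monthly_cost(tokens_per_month: float, rate_per_million: float) -> float:
    """Input-token spend in dollars for a given monthly volume."""
    return tokens_per_month / 1_000_000 * rate_per_million

# A workload of 1 billion input tokens per month:
volume = 1_000_000_000
print(monthly_cost(volume, QWEN_RATE))  # about $100
print(monthly_cost(volume, OPUS_RATE))  # about $1,300
```

At a billion input tokens a month, the gap is roughly $100 versus $1,300, which is the kind of delta that makes small open-weight models attractive for high-volume pipelines.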

Open-source video also advanced sharply. Lightricks released LTX 2.3, a 22-billion-parameter Diffusion Transformer that generates synchronized video and audio in a single forward pass, supports up to 4K at 50 FPS, and produces clips up to 20 seconds long. The distilled variant runs in just 8 denoising steps. Helios, developed by Peking University, ByteDance, and Canva and released under Apache 2.0, is a 14-billion-parameter autoregressive diffusion model that generates videos up to 1,440 frames (approximately 60 seconds at 24 FPS) at 19.5 FPS on a single NVIDIA H100 GPU. NVIDIA’s Nemotron 3 Super added a strong enterprise coding option: a 120-billion-total-parameter model with 12 billion active parameters per forward pass. It scores 60.47% on SWE-Bench Verified versus GPT-OSS’s 41.90%, and 91.75% on RULER at 1M tokens versus GPT-OSS’s 22.30%. It delivers 2.2x higher throughput than GPT-OSS-120B and 7.5x higher throughput than Qwen3.5-122B. The broader takeaway is that efficient, open, and edge-deployable systems are becoming viable foundations for products in coding, video, and local inference.
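Two of the headline numbers above are easy to sanity-check: Helios generating 1,440 frames at 19.5 FPS implies the wall-clock time to produce a 60-second clip, and Nemotron 3 Super’s 12B-active-of-120B-total parameters give its sparsity ratio. A quick check of the arithmetic (the 2N-FLOPs-per-token rule of thumb below is a common estimation heuristic, not a figure from the release):

```python
# Sanity-check the Helios and Nemotron 3 Super figures cited above.

# Helios: 1,440 frames generated at 19.5 FPS, played back at 24 FPS.
frames = 1_440
gen_fps = 19.5
playback_fps = 24
gen_seconds = frames / gen_fps        # wall-clock generation time
clip_seconds = frames / playback_fps  # duration of the resulting clip
print(round(gen_seconds, 1), clip_seconds)  # ~73.8 s to make a 60.0 s clip

# Nemotron 3 Super: 120B total parameters, 12B active per forward pass.
total_params = 120e9
active_params = 12e9
print(active_params / total_params)  # only ~10% of weights active per token
# Rough forward-pass compute per token via the common ~2N FLOPs heuristic
# (an assumption, not a figure from NVIDIA's release), in GFLOPs:
print(2 * active_params / 1e9)
```

The check confirms the cited figures are internally consistent: generation runs at about 1.23x the clip duration, and the 10% active fraction is what lets the model pair 120B-scale quality with the per-token compute of a much smaller dense model.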

Impact Score: 70

OpenAI launches Artificial Intelligence deployment consulting unit

OpenAI has created a new consulting and deployment business aimed at helping enterprises build and roll out Artificial Intelligence systems. The move mirrors a similar push by Anthropic and signals a broader effort by model providers to capture more of the enterprise services market.

SK Group warns DRAM shortages could curb memory use

SK Group chairman Chey Tae-won warned that customers may reduce memory consumption through infrastructure and software optimization if DRAM suppliers fail to raise output. Demand from Artificial Intelligence data centers is keeping the market tight as memory makers weigh expansion against the long timelines for new fabs.

BitUnlocker bypasses TPM-only Windows 11 BitLocker

Intrinsec disclosed BitUnlocker, a downgrade attack that can bypass TPM-only Windows 11 BitLocker protections with physical access to a machine. The technique abuses a flaw in Windows recovery and deployment components and relies on older trusted boot code.

Micron samples 256 GB DDR5 9200 MT/s RDIMM server modules

Micron has begun sampling 256 GB DDR5 RDIMM server modules built on its 1-gamma technology to key ecosystem partners. The company positions the new modules as a higher-speed, more power-efficient option for scaling next-generation Artificial Intelligence and HPC infrastructure.
