OpenRouter highlights expanding roster of free artificial intelligence models

OpenRouter is expanding free access to high-end artificial intelligence models, aggregating open-weight and frontier systems from multiple providers under a single routing layer. The lineup targets agentic, long-context, multimodal, and code-centric workloads while keeping usage at $0/M input tokens and $0/M output tokens for listed models.

OpenRouter is positioning free models as a core part of its strategy to democratize access to artificial intelligence, emphasizing that these systems enable hundreds of thousands of users to experiment, learn, and build applications without upfront cost. The platform is actively onboarding new providers and directly covering costs to expand free capacity, while maintaining a router endpoint, openrouter/free, that automatically selects among available free models based on request requirements. All highlighted models in the collection advertise $0/M input tokens and $0/M output tokens, with context windows that range from tens of thousands of tokens to 1.05M tokens for the largest entries.
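The openrouter/free router described above is served through OpenRouter's OpenAI-compatible chat completions endpoint. The sketch below builds (but does not send) such a request; the endpoint URL and the openrouter/free model ID come from the article, while the prompt, header values, and helper name are illustrative assumptions.

```python
import json

# OpenRouter's OpenAI-compatible chat completions endpoint.
API_URL = "https://openrouter.ai/api/v1/chat/completions"

def build_free_request(prompt: str, api_key: str) -> tuple[dict, dict]:
    """Return (headers, payload) for a request routed among free models."""
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    payload = {
        # The router picks an available $0/M model per request,
        # based on the request's requirements.
        "model": "openrouter/free",
        "messages": [{"role": "user", "content": prompt}],
    }
    return headers, payload

headers, payload = build_free_request(
    "Summarize sparse Mixture of Experts in one sentence.", "sk-..."
)
print(json.dumps(payload, indent=2))
```

Actually dispatching the call is then a single `requests.post(API_URL, headers=headers, json=payload)` away, with the usual caveat that free-tier availability varies by provider.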

The current roster spans a wide range of architectures and use cases, with a strong focus on sparse Mixture of Experts designs and agentic workloads. StepFun’s Step 3.5 Flash (free) is described as the company’s most capable open-source foundation model, using a sparse Mixture of Experts architecture that activates only 11B of its 196B parameters per token for fast, efficient reasoning across a 256K context. Arcee AI’s Trinity-Large-Preview is a 400B-parameter sparse Mixture of Experts model with 13B active parameters per token and a native context window of up to 512K tokens, currently served at 128K context, aimed at creative chat, storytelling, and agent harnesses such as OpenCode and Cline. Arcee also offers Trinity Mini, a 26B-parameter model with 3B active parameters that uses 128 experts and targets long-context reasoning at 131K context with strong function calling and multi-step workflows.
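The sparse Mixture of Experts models above activate only a small slice of their total parameters on each token, which is what makes them cheap to serve. A quick check of the active-parameter ratios quoted in the text (all figures from the article):

```python
# (active parameters, total parameters) per model, as quoted above.
models = {
    "Step 3.5 Flash":        (11e9, 196e9),
    "Trinity-Large-Preview": (13e9, 400e9),
    "Trinity Mini":          (3e9, 26e9),
}

for name, (active, total) in models.items():
    # Fraction of the network that fires per token.
    pct = 100 * active / total
    print(f"{name}: {pct:.1f}% of parameters active per token")
```

The ratios land in the low single digits for the two large models, which is why a 400B-parameter system can be offered at free-tier economics at all.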

Several models are tailored for long-horizon, tool-using agents and specialized reasoning. Hunter Alpha is described as a 1-trillion-parameter frontier intelligence model with a 1M-token context, built for agentic use and long-horizon planning, while NVIDIA’s Nemotron 3 Super is a 120B-parameter open hybrid Mixture of Experts model that activates 12B parameters and features a 1M-token context window, multi-token prediction, and multi-environment reinforcement learning. NVIDIA also provides Nemotron 3 Nano 30B A3B, a small Mixture of Experts model focused on private, on-premise deployment; Nemotron Nano 9B V2 for unified reasoning and non-reasoning tasks with controllable reasoning traces; and Nemotron Nano 2 VL, a 12B-parameter multimodal reasoning model that scores ≈74 on average across benchmarks such as MMMU, MathVista, AI2D, OCRBench, OCR-Reasoning, ChartQA, DocVQA, and Video-MME. OpenRouter’s own Healer Alpha is presented as a frontier omni-modal model that combines vision, hearing, reasoning, and action for real-world agentic intelligence, with a note that prompts and completions may be logged by the provider.

The catalog includes code and developer-focused systems alongside general-purpose assistants. Z.ai’s GLM-4.5-Air offers hybrid inference modes with a “thinking mode” for advanced reasoning and a “non-thinking mode” for real-time interaction, with behavior controlled by a boolean reasoning-enabled parameter. Qwen’s Qwen3-Coder-480B-A35B-Instruct is a Mixture of Experts code generation model with 480 billion total parameters and 35 billion active per forward pass, optimized for agentic coding, repository-scale context, and tool use, while Qwen3-Next-80B-A3B-Instruct targets fast, stable chat responses without visible thinking traces for production retrieval-augmented generation and multi-turn workflows. Meta contributes Llama 3.3 70B Instruct, a multilingual 70B-parameter instruction-tuned model optimized for dialogue across English, German, French, Italian, Portuguese, Hindi, Spanish, and Thai, and OpenAI offers gpt-oss-120b, a 117B-parameter Mixture of Experts model with 5.1B active parameters per forward pass, configurable reasoning depth, and native tool use. The lineup is rounded out by LiquidAI’s LFM2.5-1.2B-Thinking, a 1.2B-parameter long-context reasoning model tuned for edge deployment with up to 32K tokens, and Mistral Small 3.1 24B Instruct, a 24B-parameter multimodal model with a 128K-token context window for image analysis, programming, mathematical reasoning, and multilingual tasks.
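GLM-4.5-Air’s thinking/non-thinking toggle can be sketched as a request-payload switch. The model ID spelling, the ":free" suffix, and the exact shape of the reasoning field follow OpenRouter conventions but should be treated as assumptions; the helper below only builds the payload.

```python
import json

def glm_air_request(prompt: str, thinking: bool) -> dict:
    """Build a chat payload toggling GLM-4.5-Air's hybrid inference mode.

    The model ID and the reasoning-toggle field names are assumptions
    based on OpenRouter's request conventions, not confirmed wire format.
    """
    return {
        "model": "z-ai/glm-4.5-air:free",  # ":free" suffix assumed for the free tier
        "messages": [{"role": "user", "content": prompt}],
        # True  -> "thinking mode": step-by-step reasoning before answering.
        # False -> "non-thinking mode": low-latency, real-time replies.
        "reasoning": {"enabled": thinking},
    }

print(json.dumps(glm_air_request("Refactor this loop into a comprehension.", thinking=True), indent=2))
```

Flipping the boolean per request lets one client use the same model for both deliberate reasoning and fast interactive chat, which is the point of the hybrid design.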

Impact Score: 62

Europe’s Artificial Intelligence challenge is structural dependence

Europe has talent, research strength, and rising investment in Artificial Intelligence, but startups remain reliant on American infrastructure, platforms, and late-stage capital. The argument centers on digital sovereignty, interoperability, and ownership as the conditions for building durable European champions.

Community backlash slows Artificial Intelligence data center expansion

Political resistance, regulatory scrutiny, and rising energy and water concerns are complicating the build-out of large Artificial Intelligence data centers across the United States. The pressure is increasing costs, delaying projects, and adding fresh risks to the economics behind Generative Artificial Intelligence infrastructure.

House panel advances export controls after China report

The House Foreign Affairs Committee moved export control legislation after a House Select Committee report detailed China’s use of illegal means to build its Artificial Intelligence and semiconductor sectors. The measure is aimed at chip smuggling and Artificial Intelligence model theft.

Intel repurposes scrap dies to expand CPU supply

Intel is repurposing wafer-edge and lower-yield silicon that would normally be discarded into sellable CPUs as industry demand outpaces supply. The strategy reflects a market where customers are willing to buy lower-tier parts to secure any available capacity.
