OpenRouter highlights expanding roster of free artificial intelligence models

OpenRouter is expanding free access to high-end artificial intelligence models, aggregating open-weight and frontier systems from multiple providers under a single routing layer. The lineup targets agentic, long-context, multimodal, and code-centric workloads while keeping usage at $0/M input tokens and $0/M output tokens for listed models.

OpenRouter is positioning free models as a core part of its strategy to democratize access to artificial intelligence, emphasizing that these systems enable hundreds of thousands of users to experiment, learn, and build applications without upfront cost. The platform is actively onboarding new providers and directly covering costs to expand free capacity, while maintaining a router endpoint, openrouter/free, that automatically selects among available free models based on request requirements. All highlighted models in the collection advertise $0/M input tokens and $0/M output tokens, with context windows that range from tens of thousands of tokens to 1.05M tokens for the largest entries.
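The openrouter/free router described above is served through OpenRouter's OpenAI-compatible chat completions endpoint. The sketch below builds (but does not send) such a request; the endpoint URL and the openrouter/free model ID come from the article, while the prompt, header values, and helper name are illustrative assumptions.

```python
import json

# OpenRouter's OpenAI-compatible chat completions endpoint.
API_URL = "https://openrouter.ai/api/v1/chat/completions"

def build_free_request(prompt: str, api_key: str) -> tuple[dict, dict]:
    """Return (headers, payload) for a request routed among free models."""
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    payload = {
        # The router picks an available $0/M model per request,
        # based on the request's requirements.
        "model": "openrouter/free",
        "messages": [{"role": "user", "content": prompt}],
    }
    return headers, payload

headers, payload = build_free_request(
    "Summarize sparse Mixture of Experts in one sentence.", "sk-..."
)
print(json.dumps(payload, indent=2))
```

Actually dispatching the call is then a single `requests.post(API_URL, headers=headers, json=payload)` away, with the usual caveat that free-tier availability varies by provider.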

The current roster spans a wide range of architectures and use cases, with a strong focus on sparse Mixture of Experts designs and agentic workloads. StepFun’s Step 3.5 Flash (free) is described as the company’s most capable open-source foundation model, using a sparse Mixture of Experts architecture that activates only 11B of its 196B parameters per token for fast, efficient reasoning across a 256K context. Arcee AI’s Trinity-Large-Preview is a 400B-parameter sparse Mixture of Experts model with 13B active parameters per token and a native context window of up to 512K tokens, currently served at 128K context, aimed at creative chat, storytelling, and agent harnesses such as OpenCode and Cline. Arcee also offers Trinity Mini, a 26B-parameter model with 3B active parameters that uses 128 experts and targets long-context reasoning at 131K context with strong function calling and multi-step workflows.
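The sparse Mixture of Experts models above activate only a small slice of their total parameters on each token, which is what makes them cheap to serve. A quick check of the active-parameter ratios quoted in the text (all figures from the article):

```python
# (active parameters, total parameters) per model, as quoted above.
models = {
    "Step 3.5 Flash":        (11e9, 196e9),
    "Trinity-Large-Preview": (13e9, 400e9),
    "Trinity Mini":          (3e9, 26e9),
}

for name, (active, total) in models.items():
    # Fraction of the network that fires per token.
    pct = 100 * active / total
    print(f"{name}: {pct:.1f}% of parameters active per token")
```

The ratios land in the low single digits for the two large models, which is why a 400B-parameter system can be offered at free-tier economics at all.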

Several models are tailored for long-horizon, tool-using agents and specialized reasoning. Hunter Alpha is described as a 1-trillion-parameter frontier intelligence model with a 1M-token context, built for agentic use and long-horizon planning, while NVIDIA’s Nemotron 3 Super is a 120B-parameter open hybrid Mixture of Experts model that activates 12B parameters and features a 1M-token context window, multi-token prediction, and multi-environment reinforcement learning. NVIDIA also provides Nemotron 3 Nano 30B A3B, a small Mixture of Experts model focused on private, on-premise deployment; Nemotron Nano 9B V2 for unified reasoning and non-reasoning tasks with controllable reasoning traces; and Nemotron Nano 2 VL, a 12B-parameter multimodal reasoning model that scores ≈74 on average across benchmarks such as MMMU, MathVista, AI2D, OCRBench, OCR-Reasoning, ChartQA, DocVQA, and Video-MME. OpenRouter’s own Healer Alpha is presented as a frontier omni-modal model that combines vision, hearing, reasoning, and action for real-world agentic intelligence, with a note that prompts and completions may be logged by the provider.

The catalog includes code and developer-focused systems alongside general-purpose assistants. Z.ai’s GLM-4.5-Air offers hybrid inference modes with a “thinking mode” for advanced reasoning and a “non-thinking mode” for real-time interaction, with behavior controlled by a boolean reasoning-enabled parameter. Qwen’s Qwen3-Coder-480B-A35B-Instruct is a Mixture of Experts code generation model with 480 billion total parameters and 35 billion active per forward pass, optimized for agentic coding, repository-scale context, and tool use, while Qwen3-Next-80B-A3B-Instruct targets fast, stable chat responses without visible thinking traces for production retrieval-augmented generation and multi-turn workflows. Meta contributes Llama 3.3 70B Instruct, a multilingual 70B-parameter instruction-tuned model optimized for dialogue across English, German, French, Italian, Portuguese, Hindi, Spanish, and Thai, and OpenAI offers gpt-oss-120b, a 117B-parameter Mixture of Experts model with 5.1B active parameters per forward pass, configurable reasoning depth, and native tool use. The lineup is rounded out by LiquidAI’s LFM2.5-1.2B-Thinking, a 1.2B-parameter long-context reasoning model tuned for edge deployment with up to 32K tokens, and Mistral Small 3.1 24B Instruct, a 24B-parameter multimodal model with a 128K-token context window for image analysis, programming, mathematical reasoning, and multilingual tasks.
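GLM-4.5-Air’s thinking/non-thinking toggle can be sketched as a request-payload switch. The model ID spelling, the ":free" suffix, and the exact shape of the reasoning field follow OpenRouter conventions but should be treated as assumptions; the helper below only builds the payload.

```python
import json

def glm_air_request(prompt: str, thinking: bool) -> dict:
    """Build a chat payload toggling GLM-4.5-Air's hybrid inference mode.

    The model ID and the reasoning-toggle field names are assumptions
    based on OpenRouter's request conventions, not confirmed wire format.
    """
    return {
        "model": "z-ai/glm-4.5-air:free",  # ":free" suffix assumed for the free tier
        "messages": [{"role": "user", "content": prompt}],
        # True  -> "thinking mode": step-by-step reasoning before answering.
        # False -> "non-thinking mode": low-latency, real-time replies.
        "reasoning": {"enabled": thinking},
    }

print(json.dumps(glm_air_request("Refactor this loop into a comprehension.", thinking=True), indent=2))
```

Flipping the boolean per request lets one client use the same model for both deliberate reasoning and fast interactive chat, which is the point of the hybrid design.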

Impact Score: 62

Europe’s Artificial Intelligence challenge is structural dependence

Europe has talent, research strength, and rising investment in Artificial Intelligence, but startups remain reliant on American infrastructure, platforms, and late-stage capital. The argument centers on digital sovereignty, interoperability, and ownership as the conditions for building durable European champions.

Community backlash slows Artificial Intelligence data center expansion

Political resistance, regulatory scrutiny, and rising energy and water concerns are complicating the build-out of large Artificial Intelligence data centers across the United States. The pressure is increasing costs, delaying projects, and adding fresh risks to the economics behind Generative Artificial Intelligence infrastructure.

House panel advances export controls after China report

The House Foreign Affairs Committee moved export control legislation after a House Select Committee report detailed China’s use of illegal means to build its Artificial Intelligence and semiconductor sectors. The measure is aimed at chip smuggling and Artificial Intelligence model theft.

Intel repurposes scrap dies to expand CPU supply

Intel is repurposing wafer-edge and lower-yield silicon that would normally be discarded into sellable CPUs as industry demand outpaces supply. The strategy reflects a market where customers are willing to buy lower-tier parts to secure any available capacity.
