OpenRouter highlights expanding roster of free artificial intelligence models

OpenRouter is expanding free access to high-end artificial intelligence models, aggregating open-weight and frontier systems from multiple providers under a single routing layer. The lineup targets agentic, long-context, multimodal, and code-centric workloads while keeping usage at $0/M input tokens and $0/M output tokens for listed models.

OpenRouter is positioning free models as a core part of its strategy to democratize access to artificial intelligence, emphasizing that these systems let hundreds of thousands of users experiment, learn, and build applications without upfront cost. The platform is actively onboarding new providers and directly covering costs to expand free capacity, while maintaining a router endpoint, openrouter/free, that automatically selects among available free models based on request requirements. All highlighted models in the collection advertise $0/M input tokens and $0/M output tokens, with context windows ranging from tens of thousands of tokens up to 1.05M for the largest entries.
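Since OpenRouter exposes an OpenAI-compatible chat completions endpoint, the openrouter/free router described above can be targeted like any other model slug. The sketch below builds such a request using only the standard library; the API key is a placeholder, and the exact payload shape beyond model and messages is kept minimal by assumption.

```python
import json
import urllib.request

# OpenRouter's OpenAI-compatible chat completions endpoint.
API_URL = "https://openrouter.ai/api/v1/chat/completions"

def build_free_request(prompt: str, api_key: str) -> urllib.request.Request:
    """Build a request that lets the openrouter/free router pick
    an available $0/M model based on the request's requirements."""
    payload = {
        "model": "openrouter/free",  # router slug from the article
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",  # placeholder key, assumed
            "Content-Type": "application/json",
        },
        method="POST",
    )

req = build_free_request("Explain mixture-of-experts routing briefly.", "YOUR_API_KEY")
# urllib.request.urlopen(req) would send it; left unexecuted here.
```

Because the router selects the model server-side, callers do not need to track which free models are currently available.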

The current roster spans a wide range of architectures and use cases, with a strong focus on sparse Mixture of Experts designs and agentic workloads. StepFun’s Step 3.5 Flash (free) is described as the company’s most capable open-source foundation model, using a sparse Mixture of Experts architecture that activates only 11B of its 196B parameters per token for speed-efficient reasoning across a 256K context. Arcee AI’s Trinity-Large-Preview is a 400B-parameter sparse Mixture of Experts model with 13B active parameters per token and a native context window of up to 512K tokens, currently served at 128K context, aimed at creative chat, storytelling, and agent harnesses such as OpenCode and Cline. Arcee also offers Trinity Mini, a 26B-parameter model with 3B active parameters that uses 128 experts and targets long-context reasoning at 131K context with strong function calling and multi-step workflows.

Several models are tailored for long-horizon, tool-using agents and specialized reasoning. Hunter Alpha is described as a frontier intelligence model with 1 trillion parameters and a 1M-token context window, built for agentic use and long-horizon planning, while NVIDIA’s Nemotron 3 Super is a 120B-parameter open hybrid Mixture of Experts model that activates 12B parameters and features a 1M-token context window, multi-token prediction, and multi-environment reinforcement learning. NVIDIA also provides Nemotron 3 Nano 30B A3B, a small Mixture of Experts model focused on private, on-premise deployment; Nemotron Nano 9B V2, for unified reasoning and non-reasoning tasks with controllable reasoning traces; and Nemotron Nano 2 VL, a 12B-parameter multimodal reasoning model that averages roughly 74 across benchmarks such as MMMU, MathVista, AI2D, OCRBench, OCR-Reasoning, ChartQA, DocVQA, and Video-MME. OpenRouter’s own Healer Alpha is presented as a frontier omni-modal model that combines vision, hearing, reasoning, and action for real-world agentic intelligence, with a note that the provider may log prompts and completions.

The catalog includes code and developer-focused systems alongside general-purpose assistants. Z.ai’s GLM-4.5-Air offers hybrid inference modes, with a “thinking mode” for advanced reasoning and a “non-thinking mode” for real-time interaction, toggled by a reasoning-enabled boolean flag. Qwen’s Qwen3-Coder-480B-A35B-Instruct is a Mixture of Experts code generation model with 480 billion total parameters and 35 billion active per forward pass, optimized for agentic coding, repository-scale context, and tool use, while Qwen3-Next-80B-A3B-Instruct targets fast, stable chat responses without visible thinking traces for production retrieval-augmented generation and multi-turn workflows. Meta contributes Llama 3.3 70B Instruct, a multilingual 70B-parameter instruction-tuned model optimized for dialogue across English, German, French, Italian, Portuguese, Hindi, Spanish, and Thai, and OpenAI offers gpt-oss-120b, a 117B-parameter Mixture of Experts model with 5.1B active parameters per forward pass, configurable reasoning depth, and native tool use. The lineup is rounded out by LiquidAI’s LFM2.5-1.2B-Thinking, a 1.2B-parameter long-context reasoning model tuned for edge deployment with up to 32K tokens, and Mistral Small 3.1 24B Instruct, a 24B-parameter multimodal model with a 128K-token context window for image analysis, programming, mathematical reasoning, and multilingual tasks.
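The reasoning-enabled boolean mentioned for GLM-4.5-Air can be illustrated with a payload sketch. Both the free-tier model slug (z-ai/glm-4.5-air:free) and the exact parameter shape ("reasoning": {"enabled": ...}) are assumptions based on the article’s description, not confirmed API details.

```python
def build_glm_payload(prompt: str, thinking: bool) -> dict:
    """Chat completions payload toggling GLM-4.5-Air's hybrid modes.
    The "reasoning" parameter shape is an assumption drawn from the
    article's mention of a reasoning-enabled boolean."""
    return {
        "model": "z-ai/glm-4.5-air:free",  # free-tier slug, assumed naming
        "messages": [{"role": "user", "content": prompt}],
        # True -> "thinking mode" for advanced reasoning;
        # False -> "non-thinking mode" for real-time interaction.
        "reasoning": {"enabled": thinking},
    }

deep = build_glm_payload("Prove the sum of two odd numbers is even.", True)
fast = build_glm_payload("What is the capital of France?", False)
```

Routing simple lookups through the non-thinking mode avoids paying latency for reasoning traces the request does not need.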

Impact Score: 62

Physical artificial intelligence emerges as manufacturing’s next competitive edge

Manufacturers are moving beyond traditional automation toward physical artificial intelligence that can perceive, reason, and act in real factories, with Microsoft and NVIDIA positioning their technologies as the backbone for this shift. Trust, governance, and human oversight are presented as core requirements for scaling these systems safely.

Weird World column explores strange frontiers of science and society

Research in the Weird World: Science & Society section spans ethical risks of Artificial Intelligence therapy, ancient plagues decoded through DNA, climate shocks that reshaped civilizations, and other unconventional investigations at the edge of science and culture.
