OpenAI launches open gpt-oss models optimized for NVIDIA RTX GPUs

OpenAI's new gpt-oss models, optimized for NVIDIA GPUs, accelerate local artificial intelligence applications across RTX-powered devices.

OpenAI, in partnership with NVIDIA, has introduced new open-source gpt-oss language models designed for seamless deployment on NVIDIA's RTX and RTX PRO GPUs. The gpt-oss-20b and gpt-oss-120b models are tailored for versatile reasoning tasks, supporting applications ranging from web search and coding assistance to comprehensive document analysis. These models, engineered for flexible local and cloud inference, support context lengths of up to 131,072 tokens—enabling sophisticated chain-of-thought and instruction-following capabilities. NVIDIA optimizations ensure top performance, reportedly achieving up to 256 tokens per second on the high-end GeForce RTX 5090 GPU.

Developers and artificial intelligence enthusiasts can access and run these models on RTX-powered machines via popular frameworks and tools such as Ollama, llama.cpp, and Microsoft AI Foundry Local. Ollama, in particular, offers a streamlined user experience, facilitating out-of-the-box support for OpenAI's open-weight models with a modern interface and features like multimodal inputs and file integration. The models leverage a mixture-of-experts architecture and take advantage of the MXFP4 precision format, which boosts efficiency and reduces resource demands without sacrificing model quality. The training took place on NVIDIA H100 GPUs, underscoring the scalability and performance of the NVIDIA hardware ecosystem from cloud infrastructures to localized desktop PCs.
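As a rough sketch of the Ollama workflow described above—assuming Ollama is installed locally and that the open-weight models are published under a `gpt-oss` tag in the Ollama model library (check `ollama list` or the library for the exact tag on your system)—running the 20B model on an RTX-powered machine looks like this:

```shell
# Download the 20B gpt-oss model weights (tag assumed; verify in the Ollama library)
ollama pull gpt-oss:20b

# Start an interactive session with the model, accelerated on the local GPU
ollama run gpt-oss:20b "Summarize the trade-offs of running inference locally vs. in the cloud."
```

Ollama also exposes an OpenAI-compatible HTTP API on localhost (port 11434 by default), so existing tooling written against the OpenAI chat-completions endpoint can be pointed at the local model without code changes.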

NVIDIA has actively engaged the open-source community to further refine model performance on their GPUs, contributing improvements like CUDA Graph implementations and CPU overhead reductions to projects such as llama.cpp and the GGML tensor library. Windows developers also benefit from native access through Microsoft AI Foundry Local, which utilizes ONNX Runtime optimized with CUDA and plans forthcoming support for NVIDIA TensorRT. These advancements mark a significant opening for developers looking to embed high-performance artificial intelligence reasoning into Windows applications and signal a broader shift toward on-device artificial intelligence acceleration powered by deep collaborations between industry leaders like OpenAI and NVIDIA.


Nvidia acquisition of SchedMD raises Slurm neutrality concerns

Nvidia’s purchase of SchedMD has given it control of Slurm, an open-source scheduler that sits at the center of many supercomputing and large-model training systems. Researchers and engineers are watching for signs that support could tilt toward Nvidia hardware over AMD and Intel alternatives.

Mustafa Suleyman says Artificial Intelligence compute growth is still accelerating

Mustafa Suleyman argues that Artificial Intelligence development is being propelled by simultaneous advances in chips, memory, networking, and software efficiency rather than nearing a hard limit. He contends that rising compute capacity and falling deployment costs will push systems beyond chatbots toward more capable agents.

China and the US are leading different Artificial Intelligence races

The US leads in large language models and advanced chips, while China has built a major advantage in robotics and humanoid manufacturing. That balance is shifting as Chinese developers narrow the gap in model performance and both countries push to combine software and machines.

Congress weighs Artificial Intelligence transparency rules

Bipartisan lawmakers are pushing a federal transparency standard for the largest Artificial Intelligence models as Congress works on a broader national framework. The proposal aims to increase public trust while avoiding stricter state-by-state requirements and heavier regulation.

Report finds California creative job losses are not driven by Artificial Intelligence

New research from Otis College of Art and Design finds California’s recent creative industry job losses stem from cost pressures and structural shifts, not direct worker displacement by generative Artificial Intelligence. The technology is changing workflows and expectations, but it is largely replacing tasks rather than entire jobs.
