NVIDIA is broadening its push into local personal agents with RTX Spark, a new class of Windows PCs built for on-device agent workloads, and DGX Station for Windows, a deskside system aimed at professional inference workloads. The company is positioning local agents as private, secure tools that can interact with applications, automate tasks, generate content, and manage cross-app workflows while staying under user control. RTX Spark features up to 1 petaflop of AI compute and 128GB of unified memory to meet the processing demands of on-device agents.
The Windows effort is centered on a partnership with Microsoft that combines new Windows security primitives with the NVIDIA OpenShell runtime. The platform adds identity, containment, policy, and end-to-end security capabilities for native agents, while OpenShell lets users define what agents can and cannot do, route prompts to local models based on privacy policies, and mask personal information before queries are sent to cloud models. Hermes Agent and OpenClaw are integrating these capabilities into new Windows applications. NVIDIA is also expanding the NemoClaw blueprint across GeForce RTX, RTX PRO, RTX, DGX Spark, and DGX Station, with streamlined installers and support for Hermes Agent.
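The routing and masking behavior described above can be illustrated with a minimal sketch. This is not the OpenShell API, which the announcement does not detail. The `Policy` class, the keyword list, the regular expressions, and the `route_prompt` function are all hypothetical names chosen for illustration, and a real runtime would use far more robust detection than keyword and pattern matching.

```python
import re
from dataclasses import dataclass

# Hypothetical patterns for two kinds of personal information.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE = re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b")


@dataclass(frozen=True)
class Policy:
    """Hypothetical privacy policy: prompts touching these topics stay local."""
    private_keywords: tuple = ("medical", "salary", "password")


def mask_personal_info(text: str) -> str:
    """Replace e-mail addresses and phone numbers with placeholders."""
    text = EMAIL.sub("[EMAIL]", text)
    return PHONE.sub("[PHONE]", text)


def route_prompt(prompt: str, policy: Policy) -> tuple:
    """Return (destination, prompt_to_send).

    Prompts that match the policy go to a local model unchanged;
    everything else goes to a cloud model with personal info masked.
    """
    lowered = prompt.lower()
    if any(keyword in lowered for keyword in policy.private_keywords):
        return ("local", prompt)
    return ("cloud", mask_personal_info(prompt))


if __name__ == "__main__":
    policy = Policy()
    print(route_prompt("Summarize my medical records", policy))
    print(route_prompt("Email bob@example.com about lunch", policy))
```

In this sketch the policy decides the destination first, so sensitive prompts never leave the device, and masking applies only to text that is about to be sent to the cloud.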
NVIDIA is pairing that platform work with inference and multi-GPU optimizations across open source tools. The company said its collaboration with the llama.cpp community enables multi-token prediction and other changes that deliver 2x performance on Qwen 3.6 and 3.5 27B, and a 1.6x performance boost on Qwen 3.6 and 3.5 35B. For multi-GPU systems, llama.cpp adds tensor parallelism for up to 2x memory and 1.8x compute on two equivalent GPUs. ComfyUI gains a new classifier-free guidance method for up to 2x performance on two equivalent GPUs, plus model-chain splitting across GPUs. NVIDIA also said H Company’s computer-use harness is coming to RTX and DGX PCs, and that its collaboration on Holo Computer Use models delivers a 2x speedup on NVIDIA GPUs and reduces memory consumption by 35%.
On Linux, DGX Spark is being positioned as a local agent system for developers who need a CUDA-compatible environment with large memory and fast compute. NemoClaw is now available for all NVIDIA RTX and DGX PCs on Linux and the Windows Subsystem for Linux. NVIDIA said its work with vLLM and optimized NVFP4 checkpoints for Qwen 3.6 35B delivers 2.6x performance on DGX Spark compared with the previously available NVFP4 checkpoints from Unsloth, along with kernel improvements, mixed precision, and CUDA Graph support for multi-token prediction (MTP).
Creative applications are part of the rollout as well. Adobe is reworking Premiere and Photoshop for RTX Spark, with Firefly-powered Generative Fill and Generative Extend among the accelerated tools, and NVIDIA said the platform delivers up to 2x faster AI, editing, coloring, and effects across creative workflows. Substance 3D Painter and Stager will also run natively on RTX Spark. Across the wider RTX ecosystem, NVIDIA Broadcast 2.2 moves Studio Voice out of beta and adds Stream Deck support, Project G-Assist gains Stream Deck integration, Blender 5.3 will add DLSS 4.5 Ray Reconstruction this fall, and RTX Video Frame Generation will launch with RTX Spark to double or quadruple video frame rate in real time, helping smooth low-frame-rate model outputs, such as video generated at 15-20 frames per second (fps).
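To put the frame generation figures in concrete terms, the short calculation below works out the output frame rates implied by doubling or quadrupling a 15-20 fps source. It is only arithmetic on the numbers quoted above. The function name `generated_fps` is hypothetical, and the sketch assumes the multiplier applies exactly, which the announcement does not guarantee.

```python
def generated_fps(source_fps: int, multiplier: int) -> int:
    """Output frame rate if frame generation multiplies the source rate exactly."""
    if multiplier not in (2, 4):
        raise ValueError("the announcement mentions only 2x and 4x")
    return source_fps * multiplier


if __name__ == "__main__":
    for source in (15, 20):
        doubled = generated_fps(source, 2)
        quadrupled = generated_fps(source, 4)
        print(f"{source} fps source: {doubled} fps at 2x, {quadrupled} fps at 4x")
```

Under that assumption, a 15-20 fps model output would play back at 30-40 fps when doubled and 60-80 fps when quadrupled.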
