NVIDIA OpenShell boosts local AI agents on RTX PCs

What NVIDIA OpenShell Is and Why It Matters

NVIDIA OpenShell is a secure runtime for local AI agents on Windows that combines Windows security primitives with NVIDIA policy controls to run private, on-device AI workflows without relying on the cloud. Built in partnership with Microsoft, it gives developers an easy-to-deploy package so agents can operate directly on RTX PCs while staying under clear user control. OpenShell adds policy tools that let users define what an agent is allowed to do, decide when to route queries to local models, and mask personal information before anything leaves the device. This focus on privacy and control addresses a core barrier for local AI agents: the need to run continuously on a primary PC without exposing sensitive data. By tying identity, containment and policy together, OpenShell lays the groundwork for safe, scalable Windows AI processing.

2x llama.cpp Inference Performance on RTX PCs

NVIDIA’s work around OpenShell goes beyond security and extends to direct performance gains for local AI agents. Working with the llama.cpp community, NVIDIA introduced multi-token prediction and optimizations such as programmatic dependent launch, delivering up to 2x inference throughput on the Qwen 3.6 and 3.5 27B models and up to 1.6x on the Qwen 3.6 and 3.5 35B models. These llama.cpp inference gains help local AI agents respond faster and handle more complex tasks without needing a data center. According to NVIDIA, “Qwen3.6-27B delivers up to 2x throughput and Qwen3.6-35B up to 1.6x on GeForce RTX 5090, accelerating local agentic AI workloads through open source community collaboration.” Enthusiasts with multi-GPU rigs benefit further: llama.cpp now includes tensor parallelism that brings up to 2x effective memory and 1.8x compute on two equivalent GPUs, extending RTX PC performance for heavier models and longer sessions.
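To make the tensor parallelism claim concrete, here is a minimal sketch of the general idea in plain Python. It is not llama.cpp's implementation: the function names (`split_rows`, `tensor_parallel_matvec`) and the tiny weight matrix are assumptions made for illustration. A layer's weight matrix is split by output rows across two devices, each device computes its share of the output, and the partial results are gathered.

```python
def matvec(weights, x):
    """Multiply a row-major matrix (list of rows) by a vector."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]


def split_rows(weights, parts):
    """Split a weight matrix by output rows into equal shards, one per device."""
    if len(weights) % parts != 0:
        raise ValueError("row count must divide evenly across devices")
    size = len(weights) // parts
    return [weights[i * size:(i + 1) * size] for i in range(parts)]


def tensor_parallel_matvec(weights, x, parts=2):
    """Illustrative only: each shard would live on, and run on, its own GPU."""
    shards = split_rows(weights, parts)
    # On real hardware these partial products run concurrently, one per device.
    partials = [matvec(shard, x) for shard in shards]
    # Gather step: concatenate the per-device outputs in order.
    return [y for partial in partials for y in partial]


# Hypothetical 4x3 layer and input vector.
W = [[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]]
x = [1, 0, -1]

full = matvec(W, x)
sharded = tensor_parallel_matvec(W, x, parts=2)
per_device_weights = sum(len(row) for row in split_rows(W, 2)[0])
total_weights = sum(len(row) for row in W)
```

In this sketch each device stores half of the weights, which is the intuition behind the "up to 2x effective memory" figure. The gather step stands in for inter-GPU communication, one plausible reason a measured compute gain would fall short of a full 2x.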

RTX Spark and DGX Station: PCs Built for Local AI Agents

OpenShell arrives alongside new hardware tuned for local AI agents, led by NVIDIA RTX Spark and DGX Station for Windows. RTX Spark is a new class of Windows PC with up to 1 petaflop of AI compute and 128GB of unified memory, designed to shift a system from “tool to teammate” by running personal AI agents continuously and efficiently. It targets creators, gamers and developers who need powerful Windows AI processing in slim laptops with long battery life or in efficient desktops. DGX Station for Windows extends this concept to deskside AI supercomputers, bringing data-center-class GPUs and CPUs into a managed Windows environment. Together with NemoClaw blueprints and streamlined installers, these platforms make it easier to deploy local AI agents across GeForce RTX, RTX PRO, RTX Spark, DGX Spark and DGX Station, with consistent security and performance behavior across devices.

Secure Local AI Agents for Enterprises and Power Users

NVIDIA OpenShell is central to making local AI agents practical for both individual creators and enterprises. It builds on new Windows security primitives for identity, containment, policy and end-to-end protection so agents can run natively while respecting organizational rules and user consent. OpenShell adds fine-grained policy controls, intelligent routing to local models based on privacy settings, and on-the-fly masking of personal information before queries touch any cloud service. This security layer is already being adopted by the leading agent projects Hermes Agent and OpenClaw, which are integrating OpenShell and the new Windows primitives into their Windows apps. These agents can operate Windows applications, manage multi-step workflows, generate media, write code and search local files, all while remaining under user-defined constraints. For enterprises, the ability to keep sensitive workflows local and auditable is a major reason to consider RTX PCs as AI-capable endpoints rather than thin clients to cloud models.
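The three policy behaviors described above (allow-listing actions, routing by privacy setting, and masking before anything reaches the cloud) can be sketched in a few lines of Python. This is a hypothetical illustration, not OpenShell's API: the `Policy` class, the `route` function, the action names and the two regular expressions are all assumptions, and real personal-information detection covers far more than emails and phone numbers.

```python
import re
from dataclasses import dataclass
from typing import Tuple

# Hypothetical patterns for illustration; real PII detection is much broader.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE = re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b")


@dataclass(frozen=True)
class Policy:
    """Hypothetical user policy; not OpenShell's actual schema."""
    allowed_actions: frozenset
    keep_pii_local: bool = True


def contains_pii(text: str) -> bool:
    return bool(EMAIL.search(text) or PHONE.search(text))


def mask_pii(text: str) -> str:
    """Replace detected emails and phone numbers with placeholders."""
    return PHONE.sub("[PHONE]", EMAIL.sub("[EMAIL]", text))


def route(query: str, action: str, policy: Policy) -> Tuple[str, str]:
    """Return (destination, text_to_send) for one agent request."""
    if action not in policy.allowed_actions:
        return ("blocked", "")       # the user never permitted this action
    if policy.keep_pii_local and contains_pii(query):
        return ("local", query)      # sensitive text stays on the device
    return ("cloud", mask_pii(query))  # mask before anything leaves the PC


strict = Policy(allowed_actions=frozenset({"summarize", "search_files"}))
relaxed = Policy(allowed_actions=frozenset({"summarize"}), keep_pii_local=False)
```

Under the strict policy, `route("Email jane@example.com the report", "summarize", strict)` keeps the query on the local model, while the relaxed policy sends a masked version to the cloud. Checking the allow-list first means a disallowed action is rejected before any routing decision is made.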

Adobe, H Company and the Future of Creative RTX Workflows

NVIDIA is lining up software partners around OpenShell and RTX Spark to reshape creative and productivity workflows. Adobe is rearchitecting Photoshop and Premiere for RTX systems to exploit RTX Spark’s AI compute and memory, so features like Firefly-based tools can run more efficiently on-device. At the same time, Blender is adding DLSS 4.5 Ray Reconstruction, and NVIDIA is introducing RTX Video Frame Generation, which will arrive in ComfyUI. Both give creators higher-quality rendering and video processing locally. H Company is extending agent capabilities with a computer-use harness that lets agents “see” the desktop and operate keyboard and mouse, even for applications without APIs. NVIDIA collaborated with H Company to quantize Holo Computer Use models and accelerate the harness, cutting memory use by 35% and doubling speed on NVIDIA GPUs. Together, these moves set up RTX PCs as powerful hubs for local AI agents and creative workflows without cloud dependence.
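A computer-use harness of the kind described above generally follows an observe, decide, act loop. The toy sketch below shows that loop against a simulated desktop. Everything in it is hypothetical: `FakeDesktop`, `decide` and `run_agent` are illustrative names, and a real harness would capture actual screenshots and ask a vision model (such as the Holo models mentioned above) to choose each action.

```python
from dataclasses import dataclass, field


@dataclass
class FakeDesktop:
    """Toy stand-in for a real screen: one text field and a submit button."""
    text_field: str = ""
    submitted: bool = False
    log: list = field(default_factory=list)

    def screenshot(self):
        # A real harness would return pixels; here we return the UI state.
        return {"text_field": self.text_field, "submitted": self.submitted}

    def type_text(self, text):
        self.text_field += text
        self.log.append(("type", text))

    def click(self, target):
        self.log.append(("click", target))
        if target == "submit":
            self.submitted = True


def decide(observation, goal_text):
    """Hypothetical policy; a real harness would query a vision model here."""
    if observation["submitted"]:
        return ("done", None)
    if observation["text_field"] != goal_text:
        remaining = goal_text[len(observation["text_field"]):]
        return ("type", remaining)
    return ("click", "submit")


def run_agent(desktop, goal_text, max_steps=10):
    """Observe, decide, act until the goal is reached or the budget runs out."""
    for step in range(max_steps):
        action, arg = decide(desktop.screenshot(), goal_text)
        if action == "done":
            return step  # number of actions taken
        if action == "type":
            desktop.type_text(arg)
        else:
            desktop.click(arg)
    raise RuntimeError("step budget exhausted before reaching the goal")


desktop = FakeDesktop()
steps_taken = run_agent(desktop, "hello")
```

The step budget is a deliberate design choice: because the agent acts through keyboard and mouse rather than an API, a bounded loop keeps a confused agent from acting indefinitely.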