RTX Spark PCs and Local AI Agents Explained

What RTX Spark PCs Are and Why They Matter

RTX Spark PCs are a new class of Windows computers that pair Nvidia’s RTX GPUs with Grace CPUs to run local AI agents directly on the device. The aim is to turn the traditional PC into an active assistant that can understand context, automate tasks, and process media without constant cloud access. Designed by Nvidia and Microsoft, these systems combine a Blackwell-based RTX GPU, a 20-core Grace CPU co-developed with MediaTek, and up to 128GB of unified memory to deliver as much as 1 petaflop of AI performance. Nvidia calls this a reinvention of the PC, shifting it from a static “box of apps” to a system that can act on a user’s behalf across workflows. Major OEMs including Microsoft, Dell, HP, Lenovo, ASUS, and MSI are expected to ship RTX Spark laptops and desktops this fall, targeting creators, developers, and gamers who want powerful on-device inference without relying on remote servers.

Local AI Agents: From Tool to Teammate on Windows

Local AI agents on RTX Spark PCs are designed to run inside Windows applications, observe user workflows, and execute multi-step tasks while staying under user control. These agents can search local files semantically, generate images and video, write or debug code, and coordinate work across multiple apps. Nvidia’s partnership with Microsoft extends beyond personal systems to the new DGX Station for Windows, which brings data-center-class GPUs into a deskside form factor for professional AI workloads. A key goal is to move personal agents from experimental projects into dependable daily tools that feel like teammates rather than separate programs. Because inference runs on the device, RTX Spark PCs reduce latency for tasks like content generation or video editing and avoid the overhead of sending every interaction to the cloud, so AI assistance feels more responsive in everyday computing.

Nvidia OpenShell and Windows Security: Privacy-First Agent Design

Nvidia OpenShell sits at the heart of RTX Spark’s security story. It brings a dedicated runtime for local AI agents to Windows and aligns with new OS-level security primitives. The Windows platform now provides identity, containment, policy, and end-to-end security for agents, while OpenShell adds user-facing controls that define what agents can and cannot do. It can route queries to local models based on privacy rules and mask personal information in requests that must go to cloud models. According to Nvidia, OpenShell will be integrated into popular agent frameworks such as Hermes Agent and OpenClaw, giving developers an easy way to package secure, on-device agents. The result is a system where agents can, for example, read documents, operate apps, or automate workflows without exposing sensitive data, reinforcing the idea that local AI agents should be both powerful and private by default.
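
The routing-and-masking idea described above can be illustrated with a small Python sketch. This is not OpenShell’s API, which the article does not describe; the names `Route`, `redact`, `route`, and `LOCAL_ONLY_TAGS`, as well as the two regular expressions, are hypothetical and chosen only for illustration. It assumes each request carries a set of privacy tags and a flag saying whether it needs a cloud model.

```python
import re
from dataclasses import dataclass
from typing import Set

# Hypothetical patterns for illustration only; real PII detection
# needs far more than two regular expressions.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
PHONE = re.compile(r"\+?\d[\d\s().-]{7,}\d")

# Hypothetical privacy rule: requests with these tags never leave the device.
LOCAL_ONLY_TAGS = {"medical", "finance"}


@dataclass
class Route:
    target: str  # "local" or "cloud"
    prompt: str  # the text that would actually be sent to that target


def redact(text: str) -> str:
    """Replace email addresses and phone numbers with placeholders."""
    text = EMAIL.sub("[EMAIL]", text)
    return PHONE.sub("[PHONE]", text)


def route(prompt: str, tags: Set[str], needs_cloud: bool) -> Route:
    """Keep tagged or local-capable requests on-device; mask the rest."""
    if tags & LOCAL_ONLY_TAGS or not needs_cloud:
        return Route("local", prompt)
    return Route("cloud", redact(prompt))


if __name__ == "__main__":
    print(route("Email bob@example.com about lunch", set(), needs_cloud=True))
    print(route("Summarize my lab results", {"medical"}, needs_cloud=True))
```

In this sketch the privacy rule takes precedence over capability: a request tagged as local-only stays on the device even when a cloud model would otherwise be preferred, and only requests that actually leave the device are masked.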

Performance Gains: 2x Inference on llama.cpp and Multi-GPU Optimizations

RTX Spark PCs emphasize performance for on-device inference, especially for the open models that power local AI agents. Nvidia collaborated with the llama.cpp community to add multi-token prediction, a speculative decoding technique where a smaller draft model proposes multiple tokens at once and a larger model verifies them in a single pass. Combined with optimizations like programmatic dependent launch, this delivers up to 2x throughput on Qwen 3.6 and 3.5 27B models and about 1.6x on Qwen 3.6 and 3.5 35B. These improvements are available through tools such as the llama.cpp web UI and LM Studio, and are tuned for GeForce RTX GPUs. For enthusiasts with multi-GPU rigs, llama.cpp now supports tensor parallelism, which provides up to 2x the memory and 1.8x the compute across two equivalent GPUs, while ComfyUI can split model chains across GPUs, improving generation times for local agent workloads.
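
The draft-and-verify loop behind speculative decoding can be sketched in a few lines of Python. This is a generic greedy variant with toy stand-in models, not llama.cpp’s implementation; `speculative_step`, `generate`, `target`, `draft`, and the sentence in `SEQ` are all hypothetical names and data made up for illustration. A real engine scores all drafted positions in one batched forward pass of the large model, whereas this sketch calls the toy target once per position for readability.

```python
from typing import Callable, List, Tuple

# A "model" here is just a function from a context to its greedy next token.
Model = Callable[[List[str]], str]

SEQ = "the quick brown fox jumps over the lazy dog".split()


def target(context: List[str]) -> str:
    """Toy large model: always continues the reference sentence."""
    return SEQ[len(context) % len(SEQ)]


def draft(context: List[str]) -> str:
    """Toy small model: agrees with the target except at every 4th position."""
    return "?" if len(context) % 4 == 3 else target(context)


def speculative_step(big: Model, small: Model, context: List[str], k: int) -> List[str]:
    """Run one draft-and-verify round and return the tokens it yields."""
    # 1. The small model proposes k tokens autoregressively.
    proposal: List[str] = []
    for _ in range(k):
        proposal.append(small(context + proposal))
    # 2. The big model checks each proposed position in order.
    accepted: List[str] = []
    for tok in proposal:
        expected = big(context + accepted)
        if tok != expected:
            # First mismatch: keep the big model's token and end the round.
            accepted.append(expected)
            return accepted
        accepted.append(tok)
    # 3. Every proposal matched, so the big model adds one more token.
    accepted.append(big(context + accepted))
    return accepted


def generate(big: Model, small: Model, prompt: List[str],
             max_new: int, k: int = 4) -> Tuple[List[str], int]:
    """Generate max_new tokens; also return how many rounds were needed."""
    out = list(prompt)
    rounds = 0
    while len(out) - len(prompt) < max_new:
        out.extend(speculative_step(big, small, out, k))
        rounds += 1
    return out[:len(prompt) + max_new], rounds


if __name__ == "__main__":
    tokens, rounds = generate(target, draft, [], max_new=9, k=4)
    print(" ".join(tokens), "| rounds:", rounds)
```

The property worth noticing is that the output matches what the large model would produce on its own under greedy decoding; the draft model only changes how many verification rounds are needed. Here nine tokens take three rounds instead of nine single-token steps, which is where throughput gains of the kind quoted above would come from when the draft model agrees with the target often enough.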

What RTX Spark Means for Apps, Creators, and Everyday Users

RTX Spark’s impact goes beyond core hardware to the software ecosystem. Adobe is rearchitecting Photoshop and Premiere to exploit RTX Spark’s unified memory and AI performance, targeting faster effects, smarter content tools, and more efficient memory use on AI-accelerated Windows hardware. Blender is adding DLSS 4.5 Ray Reconstruction, and Nvidia is rolling out RTX Video Frame Generation to ComfyUI, all planned for release alongside RTX Spark systems this fall. H Company is working with Nvidia to ship computer-use tools that let agents see the screen and operate the mouse and keyboard, even in applications without APIs, powered by optimized Holo Computer Use models. For everyday users, this means local AI agents that can drive complex creative workflows, automate repetitive tasks across apps, and respond with low latency, all while keeping sensitive data on-device instead of relying on cloud-only AI services.