NVIDIA-Microsoft partnership for agentic AI deployment

What a Unified Agentic AI Stack Means

The unified NVIDIA and Microsoft agentic AI stack is a shared hardware, software and runtime architecture that lets developers design, test and deploy AI agents consistently across consumer Windows devices, edge systems and enterprise cloud infrastructure. As a result, the same agentic applications can move between PCs, local servers and Azure without code forks or fragmented tooling. Announced at Microsoft Build in a keynote featuring Jensen Huang and Satya Nadella, the NVIDIA-Microsoft partnership puts Windows at the center of agentic AI deployment. RTX Spark PCs, DGX Station for Windows, Azure Local, Microsoft Foundry and Microsoft Fabric now sit inside one enterprise AI stack, aimed at real deployments rather than isolated demos. For developers, this alignment matters because it promises consistent runtimes, policy control and performance characteristics for Windows AI agents whether they run on a battery-powered laptop, a deskside AI supercomputer or a multi-tenant cloud service.

NVIDIA and Microsoft Build a Unified Stack for Agentic AI

From RTX Spark PCs to DGX Station: Windows as Agent Platform

At the client edge, RTX Spark laptops and compact desktops turn Windows into a host for personal agents. NVIDIA describes RTX Spark as delivering 1 petaflop of AI performance and up to 128GB of unified memory, aimed at running long-lived Windows AI agents locally while still offering all-day battery life. Systems are expected from major PC brands, alongside a Surface RTX Spark Dev Box tuned for local model and agent workloads. Higher up the stack, DGX Station for Windows brings the same vision to deskside enterprise AI. Powered by the GB300 Grace Blackwell Ultra Desktop Superchip, it offers up to 748GB of coherent memory and 20 petaflops of FP4 performance, enough to run models of up to 1 trillion parameters on-premises. Both tiers use the same OpenShell runtime, so developers can move agents from RTX Spark prototypes to DGX-backed enterprise deployments without rethinking their core architecture.
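The memory figures quoted above can be sanity-checked with simple arithmetic. The sketch below is a rough estimate, not NVIDIA's sizing method: it assumes FP4 means exactly 4 bits per parameter, counts only the model weights (no KV cache, activations or quantization scaling factors) and uses decimal gigabytes. The function name and constants are illustrative.

```python
def weight_footprint_gb(num_params: float, bits_per_param: int) -> float:
    """Approximate size of the model weights alone, in decimal gigabytes."""
    bytes_total = num_params * bits_per_param / 8
    return bytes_total / 1e9


COHERENT_MEMORY_GB = 748  # DGX Station for Windows figure quoted in the article
PARAMS = 1e12             # 1 trillion parameters

# Compare 4-bit weights against wider formats (assumed, for contrast only).
for bits in (4, 8, 16):
    size_gb = weight_footprint_gb(PARAMS, bits)
    fits = size_gb <= COHERENT_MEMORY_GB
    print(f"{bits}-bit weights: {size_gb:.0f} GB, fits in {COHERENT_MEMORY_GB} GB: {fits}")
```

Under these assumptions, a 1-trillion-parameter model needs about 500 GB for its weights at 4 bits, which leaves headroom inside 748GB, while the same model at 8 or 16 bits would not fit. That is consistent with the 1-trillion-parameter claim being tied to FP4.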

Secure Runtimes and Open Models Across Azure and Local

Security and model access sit at the heart of the unified enterprise AI stack. OpenShell, NVIDIA’s secure-by-design runtime for autonomous agents, is coming to Windows on top of Microsoft Execution Containers, giving teams a policy-driven way to limit what agents can access at runtime. The same runtime underpins agents running on DGX Station for Windows and across Azure-hosted services. On the cloud side, Microsoft Foundry now exposes NVIDIA open models for agentic AI deployment, including the Nemotron 3 Ultra reasoning model, Nemotron 3.5 ASR and Nemotron 3.5 Content Safety. NVIDIA notes that Anthropic’s Claude models will run natively on GB300 Blackwell Ultra systems on Azure, with enterprises able to mix these with local models. NVIDIA Agent Toolkit and NemoClaw blueprints provide patterns for production agents, while CUDA-X libraries such as cuDF, cuOpt, AI-Q and NeMo become callable skills, so Azure-based agents can work directly over enterprise data and optimization workloads.
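To make "a policy-driven way to limit what agents can access at runtime" concrete, here is a minimal deny-by-default sketch. It is purely illustrative: the AgentPolicy class, its fields and the tool names are hypothetical and do not reflect the actual OpenShell or Microsoft Execution Containers policy format, which the announcement does not describe.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase
from typing import FrozenSet, Optional, Tuple


@dataclass(frozen=True)
class AgentPolicy:
    """Hypothetical allowlist policy. This is NOT the OpenShell policy format."""
    allowed_tools: FrozenSet[str] = frozenset()
    allowed_paths: Tuple[str, ...] = ()  # glob patterns the agent may touch

    def permits(self, tool: str, path: Optional[str] = None) -> bool:
        # Deny by default: a tool must be explicitly listed.
        if tool not in self.allowed_tools:
            return False
        # Tools that take no path only need the tool check.
        if path is None:
            return True
        # Simple glob match; a real runtime would also normalize paths
        # (for example to reject ".." traversal) before matching.
        return any(fnmatchcase(path, pattern) for pattern in self.allowed_paths)


policy = AgentPolicy(
    allowed_tools=frozenset({"read_file"}),
    allowed_paths=("/data/reports/*",),
)

print(policy.permits("read_file", "/data/reports/q1.csv"))    # allowed tool, allowed path
print(policy.permits("read_file", "/etc/passwd"))             # allowed tool, path outside policy
print(policy.permits("delete_file", "/data/reports/q1.csv"))  # tool not in the allowlist
```

The point of the design is that the policy object is immutable and evaluated outside the agent's own reasoning, so the same rules could in principle follow an agent from a laptop to a server without being rewritten.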

Data, Edge AI Computing and Hybrid Agent Deployment

Data throughput is a key bottleneck for edge AI computing and large-scale agents. NVIDIA accelerated computing is now integrated into Microsoft Fabric Data Warehouse, where Microsoft’s internal benchmarks show SQL execution up to 6x faster than a CPU-only baseline and up to 7x faster than three other leading cloud data warehouses for high-concurrency workloads. The speedup is meant to keep data warehouses responsive enough for agents that continually query and reason over live data. Beyond the public cloud, Azure Local and Foundry Local bring NVIDIA GPUs and Nemotron models closer to where data is generated. This allows the same agent orchestration patterns used in Azure to run near data sources or in sensitive environments. Developers can design workflows that span Windows PCs, local GPU servers and Azure regions, while keeping a single control plane for identity, governance and runtime policy across all their agent endpoints.
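A quick back-of-the-envelope calculation shows why query speed matters for agents that issue many queries per task. The sketch below applies the quoted "up to 6x" figure as a best case to a hypothetical workload. The query count and baseline time are invented for illustration, and the model assumes the queries run sequentially and speed up uniformly, which real workloads may not do.

```python
def loop_latency_s(queries: int, baseline_s: float, speedup: float) -> float:
    """Total wall-clock time for sequential queries, given a speedup factor."""
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return queries * baseline_s / speedup


QUERIES_PER_TASK = 20   # hypothetical: queries an agent issues for one task
BASELINE_QUERY_S = 3.0  # hypothetical: CPU-only time per query, in seconds

for label, speedup in (("CPU-only baseline", 1.0), ("up to 6x (quoted best case)", 6.0)):
    total = loop_latency_s(QUERIES_PER_TASK, BASELINE_QUERY_S, speedup)
    print(f"{label}: {total:.0f} s per task")
```

With these made-up inputs, a task that spends a minute waiting on the warehouse would spend about ten seconds at the best-case speedup. The separate 7x figure compares against other cloud data warehouses under high concurrency and is not modeled here.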

Democratizing Agentic AI Deployment for Developers and Enterprises

Together, these pieces form a path for agentic AI deployment that starts on a Windows laptop and extends through edge nodes to large AI factories in the cloud. Developers can experiment with Windows AI agents on RTX Spark hardware, move the same code to DGX Station for Windows for heavier training or inference, and then scale to Microsoft Foundry and Fabric without changing runtime semantics. For hardware builders and enterprise engineering teams, Windows shifts from being only an operating system to acting as a managed endpoint in a wider hybrid AI infrastructure. Policy controls from OpenShell and Microsoft Execution Containers stay consistent, while CUDA, TensorRT and Azure services provide a common performance layer. The result is a unified enterprise AI stack that is intended to reduce fragmentation, answer developer demand for seamless agent workflows across heterogeneous environments and make agentic AI a practical part of everyday applications rather than a separate experiment.