NVIDIA Microsoft partnership for edge to cloud AI

What the Unified NVIDIA-Microsoft Agentic AI Stack Actually Is

The unified NVIDIA-Microsoft agentic AI stack is a combined set of hardware, models, runtimes and data services for designing, running and scaling autonomous AI agents. It is meant to work consistently across Windows devices, local infrastructure and Azure cloud environments, using the same core technologies and tools. Announced at Microsoft Build, where NVIDIA CEO Jensen Huang joined Satya Nadella’s keynote via livestream from Taipei, the partnership ties together RTX Spark PCs, DGX Station for Windows, Azure Local integration, Microsoft Foundry and Microsoft Fabric into one end-to-end platform for agentic AI deployment. The aim is to treat Windows not only as an operating system but as a managed endpoint in a broader edge-to-cloud AI fabric, so that personal agents, enterprise agents and physical AI systems can share models, data access patterns and security controls without requiring a different stack for each environment.

NVIDIA and Microsoft Unite the Windows AI Stack From Edge to Cloud

From RTX Spark PCs to DGX Station: Reinventing Windows for Agentic AI

On the client side, RTX Spark PCs are the new anchor for personal agents in the Windows AI stack. These laptops and small desktops deliver 1 petaflop of AI performance and up to 128 GB of unified memory, and are expected this fall from Microsoft Surface, ASUS, Dell, HP, Lenovo and MSI. Developers also get a Surface RTX Spark Dev Box configuration with 128 GB of unified memory and a 100 W thermal envelope for local model and agent workloads. At the deskside, DGX Station for Windows pushes agentic AI into enterprise workflows. Built around the NVIDIA GB300 Grace Blackwell Ultra Desktop Superchip, it offers up to 748 GB of coherent memory and 20 petaflops of FP4 performance for frontier models up to 1 trillion parameters, while still supporting Windows management and Linux toolchains via Windows Subsystem for Linux.
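The memory and parameter figures above can be sanity-checked with simple arithmetic. The sketch below is a rough estimate, not a sizing guide: it assumes 4 bits per weight for FP4, treats the quoted capacities as decimal gigabytes, and counts weights only, ignoring the KV cache, activations and quantization scale metadata that a real deployment also needs. The function names are illustrative.

```python
# Back-of-the-envelope weight footprint. Assumptions: weights only,
# 4 bits per parameter, decimal gigabytes (1 GB = 10**9 bytes).

GB = 10**9


def weight_footprint_gb(num_params: int, bits_per_param: int) -> float:
    """Return the memory needed for model weights alone, in GB."""
    return num_params * bits_per_param / 8 / GB


def max_params_for_memory(memory_gb: int, bits_per_param: int) -> int:
    """Return how many parameters fit if weights used all of the memory."""
    return memory_gb * GB * 8 // bits_per_param


dgx_weights_gb = weight_footprint_gb(10**12, 4)    # 1 trillion parameters at FP4
dgx_headroom_gb = 748 - dgx_weights_gb             # left over for cache and activations
spark_max_params = max_params_for_memory(128, 4)   # weight-only ceiling for 128 GB

print(f"1T params at 4 bits: {dgx_weights_gb:.0f} GB of weights")
print(f"Headroom within 748 GB: {dgx_headroom_gb:.0f} GB")
print(f"Weight-only ceiling for 128 GB: {spark_max_params / 1e9:.0f}B params")
```

Under these assumptions, a 1-trillion-parameter model needs about 500 GB for weights, which is consistent with the 748 GB coherent memory figure. The practical limit on a 128 GB machine will be lower than the weight-only ceiling, because working memory has to fit as well.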

OpenShell, Secured Runtimes and Azure Local Integration

A key layer in this unified Windows AI stack is NVIDIA OpenShell, a secure runtime designed for autonomous agents. Microsoft is bringing OpenShell to Windows on top of Microsoft Execution Containers, a policy-driven execution layer that controls what an agent can access at runtime. This is vital as agentic AI deployment moves from experiments to production, where agents may handle sensitive data and long-running tasks. On the infrastructure side, Azure Local integration and Foundry Local extend the same stack into customer data centers and edge locations, with NVIDIA RTX-powered servers supporting local inference and hybrid topologies. This means enterprises can run agents close to their data on-premises while still tying into Azure identity, governance and model catalogs, reducing latency and improving control without giving up the benefits of cloud-scale orchestration.
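To make the idea of a policy-driven execution layer concrete, here is a minimal deny-by-default sketch. It is purely illustrative: `AgentPolicy`, its fields and its methods are hypothetical names invented for this example and do not reflect the actual OpenShell or Microsoft Execution Containers interfaces, which the source does not describe.

```python
from dataclasses import dataclass
from pathlib import PurePosixPath


@dataclass(frozen=True)
class AgentPolicy:
    """Hypothetical allow-list policy; not an OpenShell or Windows schema."""
    readable_roots: tuple[str, ...] = ()
    allowed_hosts: frozenset[str] = frozenset()

    def can_read(self, path: str) -> bool:
        target = PurePosixPath(path)
        # Reject parent-directory hops so "root/../other" cannot escape a root.
        if ".." in target.parts:
            return False
        # Allow a read only when the path sits under an approved root.
        return any(target.is_relative_to(root) for root in self.readable_roots)

    def can_connect(self, host: str) -> bool:
        # Deny by default: only explicitly listed hosts are reachable.
        return host in self.allowed_hosts


policy = AgentPolicy(
    readable_roots=("/data/reports",),
    allowed_hosts=frozenset({"example.internal"}),
)
print(policy.can_read("/data/reports/q3.csv"))
print(policy.can_read("/data/reports/../hr/salaries.csv"))
print(policy.can_connect("unknown.example.com"))
```

The design choice worth noting is that anything not listed is refused. That is the usual posture for agents that run long tasks against sensitive data. A production runtime would enforce such rules at the operating system or container boundary rather than in application code.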

Foundry, Nemotron and Fabric: A Unified Edge to Cloud AI Data Plane

In the cloud, Microsoft Foundry and Microsoft Fabric complete the stack by providing model access and a GPU-accelerated data layer on Azure. NVIDIA open models, including Nemotron 3 Ultra for long-running reasoning, Nemotron 3.5 ASR for speech recognition and Nemotron 3.5 Content Safety, are available on Foundry managed compute and can be combined with Anthropic and OpenAI models. NVIDIA’s Agent Toolkit and NemoClaw blueprints give an open-source path to production agents that can call CUDA-X libraries such as cuDF, cuOpt, AI-Q and NeMo as domain-specific skills. According to NVIDIA, Microsoft Fabric Data Warehouse now uses NVIDIA accelerated computing to deliver SQL execution up to 6x faster than a CPU baseline for high-concurrency workloads. The goal is to let agents query and reason over enterprise data fast enough to support interactive, continuous decision-making.
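How much an "up to 6x" SQL speedup helps an agent depends on how much of each agent turn is actually spent in SQL. The sketch below applies Amdahl's law to show this. The SQL time shares are illustrative assumptions, not measurements, and 6x is treated as the best case quoted above.

```python
def overall_speedup(sql_fraction: float, sql_speedup: float) -> float:
    """Amdahl's law: speedup of a whole task when only one part gets faster."""
    if not 0.0 <= sql_fraction <= 1.0:
        raise ValueError("sql_fraction must be between 0 and 1")
    if sql_speedup <= 0:
        raise ValueError("sql_speedup must be positive")
    return 1.0 / ((1.0 - sql_fraction) + sql_fraction / sql_speedup)


# Illustrative shares of an agent turn spent waiting on SQL.
for fraction in (0.25, 0.5, 0.9):
    gain = overall_speedup(fraction, 6.0)
    print(f"SQL share {fraction:.0%}: {gain:.2f}x faster end to end")
```

The takeaway is that the end-to-end gain approaches 6x only when queries dominate the turn. If model inference or tool calls take most of the time, the visible improvement is smaller.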

Why a Single Windows AI Stack Matters for Developers

For developers, the significance of the NVIDIA-Microsoft partnership is the promise of one coherent Windows AI stack from edge to cloud. The same agent architectures, models and security concepts can span RTX Spark client devices, DGX Station for Windows, Azure Local deployments and Azure cloud services such as Foundry and Fabric. This reduces fragmentation in agentic AI deployment and cuts the overhead of maintaining separate code paths and tooling for local and cloud environments. Windows becomes a managed endpoint for both consumer and enterprise agents, while Azure provides shared identity, governance and data services. NVIDIA’s physical AI tools and the Cosmos 3 omnimodel are also integrated into Azure’s Physical AI Toolchain, so the same stack can extend to autonomous systems in the real world. In practice, teams can prototype on a laptop and graduate to local servers or AI factories without redesigning their foundations.
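One way to picture "a single code path" is an agent whose request logic stays the same while only the deployment target changes. The sketch below is a hypothetical illustration of that pattern: the `Target` type, the target names, the URLs and the model identifiers are all placeholders invented for this example, not real NVIDIA or Microsoft endpoints or APIs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Target:
    """Hypothetical description of where an agent's model calls are sent."""
    name: str
    endpoint: str  # placeholder URL, not a real service address
    model: str     # placeholder model identifier


# Illustrative values only; none of these come from vendor documentation.
TARGETS = {
    "laptop": Target("laptop", "http://localhost:8000", "example-small"),
    "on_prem": Target("on_prem", "https://inference.example.internal", "example-medium"),
    "cloud": Target("cloud", "https://inference.example.com", "example-large"),
}


def build_request(target_name: str, prompt: str) -> dict:
    """Build the same request shape for every environment."""
    target = TARGETS[target_name]
    return {"url": target.endpoint, "model": target.model, "prompt": prompt}


for name in TARGETS:
    print(build_request(name, "Summarize yesterday's support tickets."))
```

Moving from a laptop prototype to a local server or the cloud then becomes a configuration change rather than a rewrite. That is the kind of reduced overhead the paragraph above describes.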