NVIDIA–Microsoft partnership: a unified stack for agentic AI

What the Unified NVIDIA–Microsoft Agentic AI Stack Is

The unified NVIDIA–Microsoft agentic AI stack is an integrated hardware, software and cloud platform that lets developers design, run and scale AI agents consistently across Windows PCs, on-premises infrastructure and Azure services, using shared runtimes, models and management tools. Announced at Microsoft Build, it brings together RTX Spark PCs, DGX Station for Windows, Azure Local, Microsoft Foundry, Microsoft Fabric and GitHub Copilot into one Windows-centred stack for deploying agents. This goes beyond adding GPUs to laptops: Windows becomes a managed endpoint for local agents, large‑model inference and hybrid infrastructure that spans edge to cloud. Jensen Huang joined Satya Nadella’s keynote to underline how seriously both companies treat the partnership, which now links personal agents, enterprise deskside systems and Azure-hosted models into a single, coordinated environment for long‑running, autonomous workflows.

NVIDIA and Microsoft Unite Behind a Single Stack for Agentic AI, From Windows to Cloud

From RTX Spark PCs to DGX Station: Reinventing Windows Devices for Agents

At the client tier, RTX Spark PCs are the anchor for personal agents on Windows. NVIDIA describes RTX Spark as providing 1 petaflop of AI performance and up to 128GB of unified memory in laptops and small desktops from Microsoft Surface, ASUS, Dell, HP, Lenovo and MSI, with systems expected this autumn. There is also a Surface RTX Spark Dev Box with the same 128GB of unified memory and a 100W thermal envelope, aimed at developers running local model and agent workloads. On the enterprise desk, DGX Station for Windows extends this idea with the GB300 Grace Blackwell Ultra Desktop Superchip, up to 748GB of coherent memory and 20 petaflops of FP4 performance for models up to 1 trillion parameters. Both classes of system run Windows, but are designed from the ground up for agentic AI deployment rather than traditional office tasks.
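A quick back-of-envelope check shows why the quoted memory sizes and model sizes line up. The sketch below assumes weights are stored at 4 bits per parameter (FP4) and counts the weights only; KV cache, activations and runtime overhead are ignored, so real headroom is smaller. The helper names are illustrative and do not come from NVIDIA or Microsoft.

```python
# Back-of-envelope check: do raw model weights fit in the quoted memory sizes?
# Assumption: weights only, 4 bits (FP4) per parameter. KV cache, activations
# and runtime overhead are ignored, so usable headroom is smaller in practice.

GB = 10**9  # decimal gigabytes, as used on spec sheets


def weight_footprint_gb(num_params: float, bits_per_param: int = 4) -> float:
    """Return the size of the raw weights in decimal GB."""
    return num_params * bits_per_param / 8 / GB


def max_params_for_memory(memory_gb: float, bits_per_param: int = 4) -> float:
    """Return the largest parameter count whose raw weights fit in memory_gb."""
    return memory_gb * GB * 8 / bits_per_param


if __name__ == "__main__":
    # 1-trillion-parameter model at FP4
    print(weight_footprint_gb(1e12))
    # RTX Spark class device with 128GB of unified memory
    print(max_params_for_memory(128))
```

Under these assumptions, a 1-trillion-parameter model needs about 500GB for its weights, which fits within 748GB, while 128GB holds the weights of a model of roughly 256 billion parameters at most.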

Secure Runtimes and Local Control: OpenShell, Execution Containers and Azure Local

Hardware is only one piece; the stack also standardises how agents run securely. NVIDIA OpenShell, described as a secure-by-design runtime for autonomous agents, is being brought to Windows on top of Microsoft Execution Containers, a policy-driven layer that governs what an agent can access at runtime. That combination targets developers who need long‑running, semi-autonomous workflows without giving agents unchecked access to local data or networks. On the infrastructure side, Azure Local and Foundry Local bring the patterns used in the Azure cloud to RTX and Blackwell-based servers, so teams can keep sensitive workloads on-premises while using the same agent orchestration and monitoring. For many organisations, this means they can prototype agents on a Windows PC, validate them on DGX Station for Windows or local RTX servers, and then promote them to Azure or Foundry with minimal rework.
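To make the idea of a policy-driven layer concrete, here is a minimal sketch of a deny-by-default access check in Python. It is purely illustrative: the class, its fields and its methods are hypothetical and are not the OpenShell or Microsoft Execution Containers API, whose actual interfaces are not described in the announcement.

```python
# Hypothetical sketch of a deny-by-default runtime policy for an agent.
# None of these names come from OpenShell or Microsoft Execution Containers.
from dataclasses import dataclass, field
from pathlib import PurePosixPath


@dataclass(frozen=True)
class AgentPolicy:
    """Allowlist of directories and network hosts an agent may touch."""
    allowed_dirs: tuple = ()
    allowed_hosts: frozenset = field(default_factory=frozenset)

    def can_read(self, path: str) -> bool:
        """Allow a read only if the path sits under an allowed directory."""
        target = PurePosixPath(path)
        if ".." in target.parts:
            # Reject traversal attempts instead of trying to resolve them.
            return False
        for allowed in self.allowed_dirs:
            root = PurePosixPath(allowed)
            # Compare whole path components so "/work/projectx" does not
            # match an allowlist entry of "/work/project".
            if target.parts[:len(root.parts)] == root.parts:
                return True
        return False

    def can_connect(self, host: str) -> bool:
        """Allow a connection only to an explicitly listed host."""
        return host.lower() in self.allowed_hosts


if __name__ == "__main__":
    policy = AgentPolicy(
        allowed_dirs=("/work/project",),
        allowed_hosts=frozenset({"example.internal"}),
    )
    print(policy.can_read("/work/project/data.csv"))
    print(policy.can_read("/work/project/../secrets.txt"))
    print(policy.can_connect("evil.example"))
```

The design choice worth noting is the default: anything not explicitly listed is refused, which is the usual posture for long-running agents that should not gain access simply because a rule is missing.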

Foundry, Fabric and Open Models: A Unified Data and Model Plane

On the cloud side, Microsoft Foundry and Fabric supply the model and data layers behind agentic AI deployment. NVIDIA, Anthropic and OpenAI models, plus Hermes special agents, are now available through Foundry Agent Service so enterprises can build hosted agents with built-in identity and governance. NVIDIA Nemotron 3 Ultra, a new open frontier reasoning model aimed at long‑running agents for coding, research and enterprise tasks, is available on Foundry managed compute alongside Nemotron 3.5 ASR and Nemotron 3.5 Content Safety. According to NVIDIA, Microsoft Fabric Data Warehouse now runs on NVIDIA accelerated computing, delivering SQL execution up to 6x faster than an internal CPU baseline. This matters because agents often issue continuous, high-frequency queries; bringing GPUs into Fabric helps the data layer keep pace with the reasoning layer, making Fabric a practical backbone for edge-to-cloud AI workflows.

What This Means for Developer Workflows Across Edge, Local and Cloud

For developers, the unified NVIDIA–Microsoft stack promises a smoother path from prototype to production. With RTX Spark PCs and DGX Station for Windows, developers can build and test agents directly on Windows using CUDA, TensorRT and familiar enterprise applications, then move those agents into Azure, Foundry or Azure Local without redesigning runtimes. NVIDIA Agent Toolkit and NemoClaw blueprints give an open-source starting point for production agents, while CUDA‑X libraries such as cuDF, cuOpt, AI‑Q and NeMo are exposed as domain-specific skills that agents can call. Physical AI tools powered by Cosmos 3 bring simulation and robotics into the same agent ecosystem. The shift is that agentic AI is no longer confined to data-centre specialists: it becomes a first-class workload for consumer and prosumer Windows devices, yet stays connected to the same management, data and governance fabric that runs in the cloud.