NVIDIA Microsoft partnership and the agentic AI stack

What Agentic AI Deployment Means in the New NVIDIA–Microsoft Stack

Agentic AI deployment is the process of building, running and scaling AI agents that can plan, reason and execute complex workflows autonomously across personal devices, local infrastructure and cloud platforms with consistent security, governance and performance. In the new NVIDIA–Microsoft partnership, that idea turns into a concrete enterprise AI stack. Windows is recast as a primary endpoint for agents, from RTX Spark laptops to DGX Station for Windows deskside systems. These endpoints connect into Azure Local, Microsoft Foundry and Microsoft Fabric, forming a single, accelerated path from prototype to production. According to NVIDIA, the goal is to let teams “build, run and scale agents across Windows devices, Azure cloud services and local deployments” without rewriting code or rebuilding pipelines. For developers and IT leaders, the shift is less about a new PC category and more about treating every Windows machine as part of a unified, managed AI fabric.

NVIDIA and Microsoft Tie Windows to a Unified Agentic AI Stack

From RTX Spark PCs to DGX Station: Windows as Agentic AI Workbench

On the client side, RTX Spark PCs are positioned as Windows machines purpose-built for personal agents and local AI workloads. They deliver up to 1 petaflop of AI performance and as much as 128 GB of unified memory, with systems expected from Microsoft Surface, ASUS, Dell, HP, Lenovo and MSI. These laptops and small desktops run NVIDIA OpenShell so developers can build and test agent workflows with full GPU acceleration, CUDA, RTX, DLSS and TensorRT on the same device they use daily. At the higher end, DGX Station for Windows brings the same agentic AI deployment model to deskside “AI supercomputers”. Based on the GB300 Grace Blackwell Ultra Desktop Superchip, DGX Station for Windows offers up to 748 GB of coherent memory and 20 petaflops of FP4 performance for always-on enterprise agents and trillion-parameter models, while still integrating with Windows management and Linux tools via WSL.

Secure Runtimes and Foundry: Turning Agents into Enterprise Systems

Hardware is only one layer of the enterprise AI stack. The partnership also focuses on secure execution and model orchestration so agents behave like governed enterprise systems, not ad hoc scripts. NVIDIA OpenShell, described as a secure-by-design runtime for autonomous agents, is coming to Windows on top of Microsoft Execution Containers, which control what an agent can access at runtime. In the cloud, Microsoft Foundry and its Foundry Agent Service host NVIDIA, Anthropic and OpenAI models alongside Hermes special agents, with identity and governance built in. Nemotron 3 Ultra, a new open reasoning model tuned for long-running agents, arrives on Foundry managed compute together with Nemotron 3.5 ASR and Content Safety. Developers can compose these models with local ones, balancing cost and quality. This combination turns agentic AI deployment into a multi-model, policy-aware platform for coding, research and line-of-business workflows.
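One way to picture "composing" cloud and local models while balancing cost and quality is a small selection policy. The sketch below is illustrative only: the `ModelOption` type, the `pick_model` helper, the option names and the cost and quality numbers are all hypothetical, and none of it reflects an actual Foundry or OpenShell API.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class ModelOption:
    name: str             # hypothetical label, not a real model identifier
    runs_locally: bool    # True if the model runs on the device itself
    cost_per_call: float  # illustrative relative cost units
    quality: float        # illustrative quality score in [0, 1]


def pick_model(
    options: List[ModelOption],
    min_quality: float,
    require_local: bool = False,
) -> Optional[ModelOption]:
    """Return the cheapest option that meets the quality floor, or None."""
    eligible = [
        o for o in options
        if o.quality >= min_quality and (o.runs_locally or not require_local)
    ]
    # min() with default=None avoids an exception when nothing qualifies.
    return min(eligible, key=lambda o: o.cost_per_call, default=None)


# Made-up catalogue: one cheap local model, one stronger cloud-hosted model.
OPTIONS = [
    ModelOption("local-small", runs_locally=True, cost_per_call=0.0, quality=0.60),
    ModelOption("cloud-reasoning", runs_locally=False, cost_per_call=1.0, quality=0.90),
]

if __name__ == "__main__":
    print(pick_model(OPTIONS, min_quality=0.5))
    print(pick_model(OPTIONS, min_quality=0.8))
```

Under these assumed numbers, routine steps would stay on the local model and only steps with a higher quality floor would go to the cloud-hosted one. A real policy would also have to account for the identity and governance rules the platform enforces.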

Fabric, Azure Local and the Enterprise AI Stack Beyond the Cloud

Data and locality are critical as agents move from demos to production workflows. NVIDIA GPU acceleration is now built into Microsoft Fabric Data Warehouse, where Microsoft’s internal benchmarks show SQL running up to 6x faster than a CPU-only baseline and up to 7x faster than three other leading cloud warehouses under high concurrency. That helps AI agents continuously query and reason over enterprise data without hitting latency bottlenecks. At the edge and on-premises, Azure Local and Foundry Local gain support for NVIDIA RTX-class server GPUs and Nemotron models, extending the same agentic AI deployment stack into data centers and regulated environments. For physical AI, Microsoft is integrating NVIDIA’s open physical AI skills and Cosmos 3 models into its Physical AI Toolchain, giving developers a unified path to simulate and deploy robots, industrial systems and other autonomous machines that must perceive, plan and act in the real world.

Implications for Developers and Enterprise Workflows

For developers, the unified NVIDIA–Microsoft agentic AI stack means Windows becomes both the starting point and a stable target for deployment. A workflow can begin on a Surface RTX Spark Dev Box, scale up on DGX Station for Windows, then move into Foundry or Azure Local with the same NVIDIA CUDA-X libraries and OpenShell runtime semantics. Enterprise teams gain a clearer architecture: Windows devices become managed agent endpoints, Fabric holds the accelerated data layer, Foundry orchestrates multi-model agents, and Azure Local or on-prem hardware handles low-latency or compliance-bound workloads. The practical change is that agents can be treated as long-running services wired into identity, security policies and data governance, rather than as isolated bots. As agentic AI deployment matures, this stack positions Windows as a central platform where personal, enterprise and physical agents share a common foundation instead of living in separate silos.
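The division of labor described above can be expressed as a simple placement rule. This is a minimal sketch under stated assumptions: the `Tier` names, the `place_workload` function and the 50 ms threshold are invented for illustration and are not part of any NVIDIA or Microsoft product.

```python
from enum import Enum


class Tier(Enum):
    AZURE_LOCAL = "Azure Local / on-prem"
    FOUNDRY = "Foundry (cloud)"


# Hypothetical cutoff below which a workload counts as "low-latency".
LOW_LATENCY_MS = 50


def place_workload(needs_compliance: bool, max_latency_ms: int) -> Tier:
    """Map a workload to a tier using the article's rough split:
    compliance-bound or low-latency work stays local, the rest goes to the cloud."""
    if needs_compliance or max_latency_ms <= LOW_LATENCY_MS:
        return Tier.AZURE_LOCAL
    return Tier.FOUNDRY


if __name__ == "__main__":
    print(place_workload(needs_compliance=True, max_latency_ms=500).value)
    print(place_workload(needs_compliance=False, max_latency_ms=500).value)
```

In practice the decision would depend on many more factors, such as data residency rules, model size and available hardware. The point is only that placement becomes an explicit, reviewable policy rather than an ad hoc choice per agent.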