What a Unified NVIDIA–Microsoft Agentic AI Stack Means

A unified NVIDIA–Microsoft agentic AI stack is an integrated hardware, software and data platform for building, testing and deploying autonomous AI agents consistently across Windows PCs, local infrastructure and cloud services. It uses the same runtimes, models and security controls from edge devices to large-scale enterprise environments.

Announced at Microsoft Build, where Jensen Huang joined Satya Nadella’s keynote, the expanded NVIDIA–Microsoft partnership targets end-to-end agentic AI deployment rather than isolated model demos. The stack spans RTX Spark Windows PCs, DGX Station for Windows, Azure Local, Microsoft Foundry and GPU-accelerated Microsoft Fabric. For developers, the promise is fewer environment-specific workarounds and clearer portability paths for Windows AI agents: instead of maintaining separate code paths for laptops, on-prem servers and Azure, teams can define an agent once and decide later whether it runs in the cloud, on local infrastructure or directly on users’ devices.

NVIDIA and Microsoft’s Unified AI Stack Puts Agentic AI Everywhere

Windows AI Agents from RTX Spark to DGX Station

On the client side, RTX Spark systems are positioned as the new baseline for Windows AI agents. NVIDIA describes RTX Spark as providing 1 petaflop of AI performance and up to 128 GB of unified memory, with systems arriving from Microsoft Surface, ASUS, Dell, HP, Lenovo and MSI. For development teams, that gives a realistic target spec for personal agents that run offline, stay responsive and support long-running sessions.

For enterprise deskside workflows, DGX Station for Windows scales the same idea up to a supercomputer-class machine. Powered by the GB300 Grace Blackwell Ultra Desktop Superchip, it offers up to 748 GB of coherent memory and 20 petaflops of FP4 performance, and can run models of up to 1 trillion parameters locally. Both platforms keep Windows in the loop for security and management, while Windows Subsystem for Linux preserves existing Linux-centric AI toolchains.
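The memory and parameter figures above can be sanity-checked with back-of-the-envelope arithmetic. The sketch below is a rough estimate, not a vendor sizing tool: it assumes FP4 means 4 bits per parameter, counts model weights only (no KV cache, activations or runtime overhead) and uses decimal gigabytes. The function names are illustrative.

```python
def weight_memory_gb(num_params: float, bits_per_param: int) -> float:
    """Approximate memory needed for model weights alone, in decimal GB."""
    bytes_total = num_params * bits_per_param / 8
    return bytes_total / 1e9


def weights_fit(num_params: float, bits_per_param: int, memory_gb: float) -> bool:
    """True if the weights alone fit in the given memory budget."""
    return weight_memory_gb(num_params, bits_per_param) <= memory_gb


if __name__ == "__main__":
    one_trillion = 1e12

    # 1 trillion parameters at 4 bits each is about 500 GB of weights,
    # which is below the 748 GB coherent memory quoted for DGX Station.
    print(weight_memory_gb(one_trillion, 4), weights_fit(one_trillion, 4, 748))

    # The same model at 16 bits per parameter needs about 2000 GB of weights
    # and would not fit under this simplified estimate.
    print(weight_memory_gb(one_trillion, 16), weights_fit(one_trillion, 16, 748))

    # Upper bound on 4-bit parameters that fit in 128 GB (weights only).
    print(128e9 * 8 / 4)
```

Under these assumptions, a 1-trillion-parameter model is plausible on DGX Station only at low precision, and the headroom above 500 GB is what remains for context and runtime state.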

OpenShell, GitHub Copilot and Secure Agent Runtimes

Hardware alone does not define an AI development stack; agent behavior and security matter just as much. NVIDIA OpenShell is the secure runtime at the heart of this partnership, designed specifically for autonomous agents. On Windows, OpenShell sits on top of Microsoft Execution Containers, a policy-driven execution layer that controls what an agent can access at runtime. According to NVIDIA’s blog, OpenShell is also present in GitHub Copilot, aligning developer tooling with production runtimes. That gives teams a path to prototype agents in familiar tools, then deploy them with the same security model on Spark PCs, DGX Station or Azure. For regulated environments, the key gain is fine-grained control over which data, tools and external systems an agent can touch, without rewriting core agent logic for each environment or sacrificing long-running reasoning.
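To make the idea of a policy-driven execution layer concrete, here is a minimal sketch of an allowlist check placed in front of an agent's tool calls. It is not the OpenShell or Microsoft Execution Containers API, whose interfaces are not described here; `AgentPolicy`, `guarded_call` and the host and tool names are all hypothetical.

```python
from dataclasses import dataclass
from urllib.parse import urlparse


@dataclass(frozen=True)
class AgentPolicy:
    """Hypothetical allowlist policy: which tools and hosts an agent may use."""
    allowed_tools: frozenset = frozenset()
    allowed_hosts: frozenset = frozenset()

    def permits_tool(self, name: str) -> bool:
        return name in self.allowed_tools

    def permits_url(self, url: str) -> bool:
        # Compare the exact hostname; anything unparseable is denied.
        host = urlparse(url).hostname
        return host is not None and host in self.allowed_hosts


def guarded_call(policy: AgentPolicy, tool_name: str, tool_fn, *args, **kwargs):
    """Run a tool only if the policy allows it; otherwise refuse loudly."""
    if not policy.permits_tool(tool_name):
        raise PermissionError(f"tool {tool_name!r} is not permitted by policy")
    return tool_fn(*args, **kwargs)


if __name__ == "__main__":
    policy = AgentPolicy(
        allowed_tools=frozenset({"summarize"}),
        allowed_hosts=frozenset({"example.internal"}),
    )
    print(guarded_call(policy, "summarize", lambda text: text[:10], "quarterly report"))
    print(policy.permits_url("https://example.internal/report"))
    print(policy.permits_url("https://other.example.com/"))
```

The design point is that the agent logic never changes: only the policy object differs between a developer laptop and a regulated production environment.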

Azure Local Integration and Foundry for Agentic AI Deployment

The cloud and near-cloud pieces of the NVIDIA–Microsoft partnership are Azure Local and Microsoft Foundry, which together extend the same agentic AI deployment model beyond Windows endpoints. Foundry provides hosted agents backed by NVIDIA, Anthropic and OpenAI models, with Hermes special agents and NVIDIA Nemotron 3 Ultra for long-running reasoning. Developers can mix Nemotron models with other frontier or local models to balance cost and quality. Azure Local and Foundry Local bring these capabilities into data centers and edge sites, so enterprises can run the same Windows AI agents near their data while keeping sensitive workloads off the public cloud. On NVIDIA RTX PRO 6000 Blackwell Server Edition hardware, CUDA-X libraries such as cuDF and cuOpt expose domain-specific skills as callable tools inside agents. The result is a consistent AI development stack from developer laptops to on-prem clusters and Azure.
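What "callable tools inside agents" looks like in practice can be sketched with a small tool registry. This is an illustrative pattern, not the Foundry or CUDA-X interface: the `tool` decorator, `dispatch` function and JSON call shape are assumptions, and `group_sum` is a pure-Python stand-in for the kind of aggregation a library such as cuDF would accelerate on a GPU.

```python
import json
from typing import Callable, Dict

# Hypothetical registry mapping tool names to plain Python callables.
TOOLS: Dict[str, Callable] = {}


def tool(name: str):
    """Decorator that registers a function under a tool name."""
    def register(fn: Callable) -> Callable:
        TOOLS[name] = fn
        return fn
    return register


@tool("group_sum")
def group_sum(rows, key, value):
    """Stand-in for a dataframe group-by: sum `value` per distinct `key`."""
    totals = {}
    for row in rows:
        totals[row[key]] = totals.get(row[key], 0) + row[value]
    return totals


def dispatch(call_json: str):
    """Execute a tool call of the assumed form {"tool": ..., "arguments": {...}}."""
    call = json.loads(call_json)
    fn = TOOLS.get(call["tool"])
    if fn is None:
        raise KeyError(f"unknown tool: {call['tool']}")
    return fn(**call["arguments"])


if __name__ == "__main__":
    request = json.dumps({
        "tool": "group_sum",
        "arguments": {
            "rows": [
                {"region": "eu", "sales": 2},
                {"region": "eu", "sales": 3},
                {"region": "us", "sales": 5},
            ],
            "key": "region",
            "value": "sales",
        },
    })
    print(dispatch(request))
```

Because the agent only sees a tool name and arguments, the implementation behind `group_sum` could be swapped for a GPU-accelerated one without changing the agent's side of the contract.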

Accelerated Data, Physical AI, and Practical Next Steps for Developers

Agentic AI systems depend on responsive data layers and, increasingly, physical-world awareness. Microsoft Fabric Data Warehouse now runs on NVIDIA accelerated computing, with Microsoft reporting SQL execution up to 6x faster than a CPU baseline and up to 7x faster than three other leading cloud data warehouse providers for high-concurrency workloads. That matters for agents that continuously query, summarize and act on enterprise data. On the physical AI side, Microsoft is integrating NVIDIA’s open physical AI skills and the Cosmos 3 omnimodel into its Physical AI Toolchain, giving developers a path from simulated agents to deployed robots and industrial systems.

For developers, the practical path forward is clear: standardize on OpenShell as the agent runtime, target RTX Spark or the Surface RTX Spark Dev Box for local testing, use Foundry and Azure Local for scaling, and treat Windows devices, local servers and Azure as interchangeable execution targets for one agentic codebase.
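One way to read "interchangeable execution targets for one agentic codebase" is that the agent definition stays fixed while the target is chosen by configuration. The sketch below illustrates that separation under stated assumptions: `AgentSpec`, `Target`, `select_target` and the `AGENT_TARGET` variable are hypothetical names, not part of any NVIDIA or Microsoft SDK.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Mapping


class Target(Enum):
    """Where the agent runs; values mirror the three tiers in the article."""
    DEVICE = "device"   # e.g. a Windows PC
    LOCAL = "local"     # e.g. on-prem infrastructure
    CLOUD = "cloud"     # e.g. a hosted service


@dataclass(frozen=True)
class AgentSpec:
    """Target-independent agent definition (hypothetical fields)."""
    name: str
    model: str
    tools: tuple


def select_target(env: Mapping[str, str], default: Target = Target.DEVICE) -> Target:
    """Pick the execution target from configuration, not from agent code."""
    raw = env.get("AGENT_TARGET")
    if raw is None:
        return default
    # Target(...) raises ValueError for values outside the three known tiers.
    return Target(raw.strip().lower())


def describe(spec: AgentSpec, target: Target) -> str:
    return f"{spec.name} ({spec.model}) -> {target.value}"


if __name__ == "__main__":
    spec = AgentSpec(name="report-agent", model="placeholder-model", tools=("summarize",))
    print(describe(spec, select_target({})))
    print(describe(spec, select_target({"AGENT_TARGET": "cloud"})))
```

In a real deployment the selected target would pick a concrete runtime endpoint; here it only demonstrates that the spec itself carries no environment-specific branches.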