Agentic AI deployment with NVIDIA and Microsoft

What a Unified Agentic AI Stack Means

A unified stack for agentic AI deployment is an end-to-end platform in which models, runtimes, data services and hardware across PCs, edge servers and cloud environments share consistent tools, security controls and management, so that AI agents can move between them without code rewrites or workflow changes. NVIDIA and Microsoft are building such a stack for Windows and Azure, aiming to remove the fragmentation that usually separates local development from cloud-scale deployment. It links RTX Spark PCs, DGX Station for Windows, Azure Local, Microsoft Foundry and GPU-accelerated Microsoft Fabric into one environment that developers can target for agents running on PCs, at the edge and in the cloud. Instead of choosing separate ecosystems for personal agents, on-premises inference and hosted services, teams can treat Windows as a managed endpoint that plugs into the same identity, governance and data layers used in Azure.

NVIDIA and Microsoft Unite a Single Stack for AI Agents from PC to Cloud

RTX Spark PCs and DGX Station Turn Windows into an AI Agent Platform

At the device level, the NVIDIA–Microsoft partnership centers on reimagining Windows machines as first-class hosts for agents. NVIDIA describes RTX Spark PCs as the first Windows systems purpose-built for personal agents, offering 1 petaflop of AI performance and up to 128 GB of unified memory for local inference, coding copilots and multimodal assistants. According to NVIDIA, RTX Spark systems from Microsoft Surface, ASUS, Dell, HP, Lenovo and MSI will ship with full AI and graphics performance even when unplugged. For enterprise desks, DGX Station for Windows extends the same agentic AI deployment model, using the GB300 Grace Blackwell Ultra Desktop Superchip with up to 748 GB of coherent memory and 20 petaflops of FP4 performance. That capacity lets teams run models approaching 1 trillion parameters on-premises while still benefiting from Windows security, device management and Linux AI toolchains via Windows Subsystem for Linux.
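A rough back-of-the-envelope check shows why the memory figure and the parameter count go together. The sketch below counts model weights only and assumes 4-bit (FP4) storage; it ignores activations, KV cache and runtime overhead, so real requirements will be higher. The function name and the comparison against 8-bit and 16-bit weights are illustrative assumptions, not figures from NVIDIA.

```python
def weight_memory_gb(num_params: float, bits_per_param: int) -> float:
    """Approximate memory for model weights alone, in decimal gigabytes."""
    bytes_total = num_params * bits_per_param / 8
    return bytes_total / 1e9


ONE_TRILLION = 1e12
COHERENT_MEMORY_GB = 748  # DGX Station figure quoted in the article

# Compare a few weight precisions against the quoted memory capacity.
for bits in (4, 8, 16):
    needed = weight_memory_gb(ONE_TRILLION, bits)
    fits = needed <= COHERENT_MEMORY_GB
    print(f"{bits}-bit weights: {needed:.0f} GB needed, fits: {fits}")
```

Under these assumptions, only the 4-bit case stays below the quoted capacity, which is consistent with the article pairing the 1-trillion-parameter claim with FP4 performance.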

OpenShell, Execution Containers and Secure Agent Runtimes

Hardware alone cannot make agentic AI safe or scalable, so NVIDIA and Microsoft are adding a shared control layer for how agents run. OpenShell, NVIDIA’s secure-by-design runtime for autonomous agents, is being brought to Windows on top of Microsoft Execution Containers, giving enterprises policy-driven control over what an agent can access at runtime. This means developers can define which data sources, tools or APIs each agent is allowed to touch, and enforce those rules consistently on RTX Spark PCs, DGX Station for Windows and Azure-hosted workloads. The same OpenShell runtime also underpins agents exposed through GitHub Copilot and Foundry services, helping teams reuse governance policies as agents move from experiments on laptops to always-on services in the cloud. In practice, this reduces the need for separate security reviews or containerization approaches when shifting deployments between local and remote environments.
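To make the idea of policy-driven control concrete, here is a minimal sketch of a deny-by-default allowlist check. It is not OpenShell's API or configuration format, which the announcement does not detail; `AgentPolicy`, `authorize` and the resource names are all hypothetical and exist only to illustrate the pattern of declaring what an agent may touch and checking it at runtime.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentPolicy:
    """Hypothetical per-agent allowlist; anything not listed is denied."""
    agent_id: str
    allowed_tools: frozenset = frozenset()
    allowed_data_sources: frozenset = frozenset()

    def permits(self, kind: str, name: str) -> bool:
        if kind == "tool":
            return name in self.allowed_tools
        if kind == "data_source":
            return name in self.allowed_data_sources
        return False  # unknown resource kinds are denied by default


def authorize(policy: AgentPolicy, kind: str, name: str) -> None:
    """Raise PermissionError before the agent touches a disallowed resource."""
    if not policy.permits(kind, name):
        raise PermissionError(f"{policy.agent_id} may not use {kind} '{name}'")


# Illustrative policy for a coding assistant (names are made up).
coding_agent = AgentPolicy(
    agent_id="coding-copilot",
    allowed_tools=frozenset({"git", "unit_test_runner"}),
    allowed_data_sources=frozenset({"source_repo"}),
)

authorize(coding_agent, "tool", "git")  # allowed, returns silently
try:
    authorize(coding_agent, "data_source", "hr_database")
except PermissionError as err:
    print(err)  # the denied request is reported instead of executed
```

Because the policy is plain data separate from the agent's logic, the same definition could in principle travel with the agent from a laptop to a hosted service, which is the reuse the article describes.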

From Local Machines to Azure Local, Foundry and Fabric

Beyond Windows devices, the unified stack stretches into Azure Local and Microsoft Foundry so developers can run the same agents across edge and cloud infrastructure. Enterprises can start by prototyping an agent on RTX Spark or DGX Station for Windows, then promote it to Azure Local when they need on-premises clusters managed through Azure, or to Foundry Agent Service for hosted, identity-aware deployments. NVIDIA’s open models, including Nemotron 3 Ultra for long-running reasoning and Nemotron 3.5 ASR and Content Safety, are available on Foundry managed compute and can be combined with Anthropic and OpenAI models. At the data layer, Microsoft Fabric Data Warehouse now includes NVIDIA accelerated computing, which Microsoft reports delivers up to 6x faster SQL execution than CPU baselines for high-concurrency workloads. Faster queries of this kind help agents continuously query and reason over large enterprise datasets with fewer data-layer bottlenecks.

Why This Matters for Enterprise AI Agent Adoption

The integrated NVIDIA–Microsoft stack is less about any single product and more about removing friction in agentic AI deployment. Developers no longer need to maintain separate toolchains for local RTX Spark PCs, on-prem DGX Station systems and Azure cloud services; Windows becomes a consistent endpoint that connects to the same agent runtimes, identity controls and data services as Azure Local and Foundry. This flexibility allows teams to run cost-sensitive or latency-critical workloads locally while scaling spiky or compute-heavy tasks to cloud infrastructure. It also brings physical AI and autonomous systems into the same environment, as NVIDIA’s Cosmos 3 and physical AI skills integrate with Azure’s Physical AI Toolchain. For enterprises, the result is a clearer path from prototype agents on a developer laptop to production-grade, governed agents operating across PCs, edge hardware and AI factories, which could in turn speed mainstream adoption.