NVIDIA DSX platform for AI factories explained

What the NVIDIA DSX platform is and why it matters

The NVIDIA DSX platform is a full-stack framework that standardizes how enterprises design, simulate, deploy, and operate AI factory infrastructure. It unifies chips, systems, software, facilities, and partner technologies into one coordinated architecture for scalable token generation. Instead of treating GPUs, networking, storage, power, and control systems as separate projects, DSX organizes them as layers of a single AI factory. NVIDIA describes this as a “complete playbook to build AI factories,” covering everything from energy and cooling to models and applications. At its core, DSX aligns AI workloads with available power so that operators convert each watt into as many tokens as possible while improving reliability and resiliency. The approach aims to cut time to first production and reduce token cost by turning fragmented infrastructure decisions into a repeatable process that is simulated and validated before any rack is installed.

DSX OS: Open, modular software for AI factory operations

DSX OS is the software heart of the NVIDIA DSX platform, providing open, modular components for operating multi-tenant AI factories at scale. It is designed to coordinate the full ecosystem: accelerated computing platforms, data center systems, building controls, cooling and power infrastructure, and the AI services running on top. DSX OS lets infrastructure builders assemble a factory-specific “operating system” from libraries, APIs, and partner technologies that plug into a co-designed architecture rather than a monolithic stack. This modularity helps teams standardize how they manage scheduling, observability, automation, and resiliency while still customizing deployments for different workloads or facilities. With tokens per watt and token cost as its guiding metrics, DSX OS turns operational concerns (such as power caps, multi-tenant isolation, or grid constraints) into software-defined policies that can be tested in simulation, rolled out gradually, and tuned without replacing the underlying hardware.
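To make "software-defined policy" concrete, here is a minimal Python sketch of a power cap expressed as data, checked against simulated power samples, and then applied in stages. Everything in it is an assumption for illustration: PowerCapPolicy, violations, rollout_stages, the tenant name, and the wattages are hypothetical and are not DSX OS types or APIs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PowerCapPolicy:
    """Hypothetical policy record; not a DSX OS type."""
    tenant: str
    cap_watts: float


def violations(policy, trace):
    """Return the (timestamp, watts) samples whose draw exceeds the policy's cap."""
    return [(t, watts) for t, watts in trace if watts > policy.cap_watts]


def rollout_stages(racks, fractions=(0.1, 0.5, 1.0)):
    """Return cumulative rack groups, one per stage of a gradual rollout."""
    return [racks[:max(1, round(len(racks) * f))] for f in fractions]


policy = PowerCapPolicy(tenant="tenant-a", cap_watts=120_000.0)

# Made-up (timestamp_sec, watts) samples; a real check would read simulator output.
simulated_trace = [(0, 95_000.0), (60, 118_000.0), (120, 131_000.0), (180, 110_000.0)]

breaches = violations(policy, simulated_trace)
if breaches:
    print(f"{policy.tenant}: {len(breaches)} sample(s) over cap, hold the rollout")
else:
    for stage in rollout_stages([f"rack-{i:02d}" for i in range(10)]):
        print(f"apply to {len(stage)} rack(s)")
```

Keeping the policy as a plain record means the same object can be evaluated offline and then pushed to a growing share of racks, which is the test-then-roll-out pattern the paragraph describes.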

LLM serving optimization and the role of DynoSim

Optimizing LLM serving is a key target for AI factory infrastructure, and NVIDIA pairs DSX with tools like DynoSim to manage the complexity of that tuning. Each deployment must balance the model backend, tensor-parallel shapes, prefill and decode splits, worker counts, scheduler settings, routing policies, KV cache behavior, autoscaling thresholds, and topology. These choices interact across layers, so a change that speeds up one step can slow another. DynoSim addresses this by simulating the NVIDIA Dynamo serving stack as a discrete-event system that runs on a virtual clock. Instead of burning GPU hours on every candidate configuration, operators can run thousands of scenarios offline, map the Pareto frontier, and then validate only the most promising options on hardware. On a real-world trace, DynoSim has simulated about 60 minutes of serving in a few seconds of wall time, turning LLM tuning into a fast simulate-then-verify loop.
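The simulate-then-verify idea can be illustrated with a toy Python sketch. It is not DynoSim and does not model Dynamo: it assumes a single pool of identical workers, a constant per-request service time, and a synthetic Poisson arrival trace, and every name in it (Config, simulate, pareto_frontier, poisson_arrivals) is hypothetical. What it does show is the pattern: events advance a virtual clock, a sweep evaluates many configurations offline, and only the non-dominated ones are kept for hardware validation.

```python
import heapq
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Config:
    """Hypothetical serving configuration (not a Dynamo or DynoSim type)."""
    workers: int           # number of identical serving workers
    service_time: float    # seconds of work per request, assumed constant


def simulate(config, arrivals):
    """Return mean request latency in virtual seconds.

    `arrivals` is a sorted list of arrival timestamps. The loop jumps from
    one event to the next on a virtual clock, so an hour of traffic costs
    only the wall time needed to process its events.
    """
    free_at = [0.0] * config.workers   # when each worker next becomes idle
    heapq.heapify(free_at)
    total_latency = 0.0
    for arrival in arrivals:
        earliest_free = heapq.heappop(free_at)
        start = max(arrival, earliest_free)      # queue if every worker is busy
        finish = start + config.service_time
        heapq.heappush(free_at, finish)
        total_latency += finish - arrival
    return total_latency / len(arrivals)


def pareto_frontier(points):
    """Keep (cost, latency, label) points that are not dominated; lower is better."""
    frontier = []
    best_latency = float("inf")
    for cost, latency, label in sorted(points, key=lambda p: (p[0], p[1])):
        if latency < best_latency:
            frontier.append((cost, latency, label))
            best_latency = latency
    return frontier


def poisson_arrivals(rate_per_sec, duration_sec, seed=0):
    """Generate a synthetic arrival trace; a real study would replay a recorded one."""
    rng = random.Random(seed)
    now, arrivals = 0.0, []
    while True:
        now += rng.expovariate(rate_per_sec)
        if now >= duration_sec:
            return arrivals
        arrivals.append(now)


# Sweep worker counts over 60 virtual minutes of traffic.
trace = poisson_arrivals(rate_per_sec=5.0, duration_sec=3600.0)
results = []
for workers in range(4, 11):
    cfg = Config(workers=workers, service_time=1.0)
    results.append((workers, simulate(cfg, trace), cfg))

for cost, latency, cfg in pareto_frontier(results):
    print(f"workers={cost:2d}  mean latency={latency:8.2f}s")
```

Here cost is simply the worker count. A realistic sweep would vary many more dimensions, such as the parallelism shapes, routing policies, and cache settings listed above, and would calibrate service times against measurements before trusting the frontier.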

From AI infrastructure to autonomous factory operations

NVIDIA extends the DSX vision beyond data centers with the Factory Operations Blueprint, codenamed FOX, a reference design for autonomous factory operations. Traditional plants juggle PLCs, SCADA, MES, and ERP systems that rarely integrate cleanly, which leaves production intelligence fragmented and limits the impact of AI. FOX introduces a unified decision layer that ingests live machine signals, quality control data, and operational alerts into a central AI model. This creates a feedback loop between digital simulation and physical operations, so planners can test changes virtually before they reach the line. The blueprint shows how DSX’s AI factory principles (simulation-first design, modular software, and coordinated control) translate into real-time optimization on the shop floor. The result is a path from task-level automation toward plant-wide intelligence, where maintenance, quality, and throughput are managed continuously by autonomous systems guided by AI.

Scaling AI factories with DSX MaxLPS and modular architecture

To help AI factories scale under power and space constraints, DSX includes DSX MaxLPS, a suite focused on maximizing token performance per megawatt. It combines 45-degree-Celsius liquid cooling with in-rack optimizations so operators can run up to 40% more GPUs at their most energy-efficient operating point with limited impact on workloads. This power-aware layer ties directly into DSX OS policies and facility controls, so infrastructure builders can plan capacity in terms of tokens per watt instead of raw hardware counts. Because the DSX platform is open and modular, teams can mix NVIDIA software, reference designs, and partner systems to fit existing data centers or greenfield builds. The aim is to treat AI factories as repeatable, composable systems: design them in software, stress them in simulation, then deploy them as modular blocks that can be scaled, upgraded, or reconfigured as demand for token generation accelerates.
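The arithmetic behind planning in tokens per watt can be sketched in a few lines of Python. All figures below are invented for illustration and are not NVIDIA measurements: the 1 MW budget, the per-GPU wattages, and the per-GPU throughputs are assumptions, with the capped wattage chosen so that the GPU count rises by 40%, the upper bound quoted above. OperatingPoint and plan are hypothetical names, and the sketch ignores cooling, networking, and other facility overheads.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OperatingPoint:
    """Hypothetical per-GPU operating point; the values are illustrative only."""
    name: str
    watts_per_gpu: int
    tokens_per_sec_per_gpu: float


def plan(budget_watts, point):
    """Return (gpu_count, tokens_per_sec, tokens_per_sec_per_watt) for a power budget."""
    gpus = budget_watts // point.watts_per_gpu     # whole GPUs that fit the budget
    tokens_per_sec = gpus * point.tokens_per_sec_per_gpu
    return gpus, tokens_per_sec, tokens_per_sec / budget_watts


BUDGET_WATTS = 1_000_000  # 1 MW available to GPUs, an assumed figure

# Invented numbers: the capped point draws less power and gives up some throughput.
max_power = OperatingPoint("max power", watts_per_gpu=1000, tokens_per_sec_per_gpu=100.0)
efficient = OperatingPoint("efficient", watts_per_gpu=714, tokens_per_sec_per_gpu=85.0)

for point in (max_power, efficient):
    gpus, tps, per_watt = plan(BUDGET_WATTS, point)
    print(f"{point.name:10s} gpus={gpus:5d} tokens/s={tps:10.0f} tokens/s/W={per_watt:.3f}")
```

Under these assumed numbers, the same budget hosts 1,400 GPUs instead of 1,000, and aggregate throughput rises by 19% even though each GPU is 15% slower. Whether a real deployment sees a gain depends on how steeply per-GPU throughput falls as power is capped.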