From Static Data Centers to Tunable AI Factory Infrastructure
NVIDIA DSX is a full-stack AI factory infrastructure platform that treats the entire data center (chips, networking, facilities, and software) as a tunable operating system for deploying and optimizing large language model (LLM) serving at scale. Instead of treating servers, power, and cooling as fixed constraints, DSX aligns energy, hardware, infrastructure, models, and applications around shared designs and control software so operators can adjust configurations as they would system settings. This matters because LLM serving is no longer purely a software problem but a multi-layer engineering challenge in which power budgets, cooling limits, and scheduler behavior directly affect token output and cost. By combining configurable reference designs, modular software, and AI infrastructure simulation, DSX aims to shorten time to production, improve tokens per watt, and make AI factories behave more like programmable systems than one-off custom builds.

Why LLM Serving Optimization Needs an Infrastructure ‘OS’
Modern LLM serving optimization is hard because every deployment is a web of interacting variables. Teams must choose a model backend, tensor-parallel shape, prefill and decode split, worker counts, scheduler settings, routing policies, KV cache behavior, autoscaling thresholds, and cluster topology, and a change in one layer often moves the bottleneck somewhere else. DynoSim, a discrete-event simulator for the NVIDIA Dynamo serving stack, shows how complex this becomes: it models router decisions, planner actions, KV transfers, and forward-pass timing on a shared virtual timeline to test serving strategies without burning GPU hours. According to NVIDIA, DynoSim can replay a 60.1-minute serving trace in about 2.41 seconds on a laptop, roughly 1,500x faster than real time. DSX builds on this mindset, turning these scattered knobs into something closer to a coordinated operating system for LLM serving optimization.
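
To make the idea of a shared virtual timeline concrete, here is a minimal sketch of a discrete-event simulation in Python. It is not DynoSim's code or API: the `simulate` function, the fixed `service_time`, and the single FIFO queue are simplifying assumptions chosen for illustration. The point is only that the simulator jumps from event to event instead of waiting in real time, which is why a long trace can be replayed in seconds.

```python
import heapq
from collections import deque


def simulate(arrivals, service_time, num_workers):
    """Toy discrete-event simulation of a worker pool on one virtual timeline.

    arrivals: request arrival times in virtual seconds
    service_time: hypothetical fixed seconds a request occupies a worker
    num_workers: number of identical workers
    Returns {request_id: completion_time}.
    """
    events = []  # min-heap of (time, sequence, kind, request_id)
    for rid, t in enumerate(arrivals):
        heapq.heappush(events, (t, rid, "arrive", rid))
    seq = len(arrivals)  # tie-breaker so heap entries never compare equal
    free = num_workers
    waiting = deque()  # FIFO queue of requests that have no worker yet
    done = {}

    while events:
        # Jump straight to the next event; no wall-clock time is spent waiting.
        now, _, kind, rid = heapq.heappop(events)
        if kind == "arrive":
            waiting.append(rid)
        else:  # "finish": the request completes and frees its worker
            done[rid] = now
            free += 1
        # Dispatch queued requests to any free workers at the current time.
        while free > 0 and waiting:
            nxt = waiting.popleft()
            free -= 1
            heapq.heappush(events, (now + service_time, seq, "finish", nxt))
            seq += 1
    return done


if __name__ == "__main__":
    arrivals = [0.0, 0.0, 1.0]
    print(simulate(arrivals, service_time=2.0, num_workers=1))
    print(simulate(arrivals, service_time=2.0, num_workers=2))
```

Changing `num_workers` from 1 to 2 in this toy cuts the last completion time from 6.0 to 4.0 virtual seconds. A real serving simulator would replace the constant service time with modeled router, planner, KV transfer, and forward-pass costs.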

Inside NVIDIA DSX Platform and DSX OS
The NVIDIA DSX platform combines accelerated computing, open source libraries, APIs, reference designs, and partner technologies into one framework for AI factory infrastructure. It covers compute, networking, storage, facility design, power, cooling, controls, simulation, and operations under a co-designed architecture. DSX OS sits on top as open, modular software built for operating multi-tenant AI factories and integrating with existing tools. It coordinates chips, systems, building management controls, cooling equipment, power distribution, grid connections, and AI services so they behave like components of a single operating system. The goal is to increase tokens per watt, reduce token cost, and raise reliability as factories scale. Jensen Huang summed up the intent: “With the DSX platform, you can simulate the entire factory before you spend a dollar, validate performance before a single rack is installed and operate with the kind of reliability that production AI demands.”

Simulation and Digital Twins: Testing AI Factories Before They Exist
Simulation is where DSX starts to look most like an operating system for physical AI factories. NVIDIA’s DynoSim gives the Dynamo inference stack a digital twin, turning exhaustive configuration sweeps into a simulate-then-verify loop that finds the Pareto frontier for a given workload and hardware set before consuming GPU time. At the physical layer, partners extend this approach through AI infrastructure simulation. Vertiv SmartRun, integrated into NVIDIA Omniverse DSX Blueprint, serves as a configurable digital twin of overhead physical infrastructure. Operators can design, simulate, and validate power, cooling, and controls as a single system before build-out, reducing late-stage design changes and integration risk. This model-based workflow helps align each new generation of accelerated hardware with ready infrastructure, preserving engineering intent from early planning through deployment and later optimization, instead of relying on slow, document-heavy handoffs between teams.
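
The Pareto-frontier step of a simulate-then-verify loop can be sketched in a few lines of Python. This is an illustration under stated assumptions, not DynoSim's implementation: the `pareto_frontier` function, the two metrics (throughput and latency), and the sample configurations with their numbers are all hypothetical. A configuration stays on the frontier only if no other configuration is at least as good on both metrics and strictly better on one.

```python
def pareto_frontier(configs):
    """Return configs not dominated by any other config.

    Each config is a dict with hypothetical simulated metrics:
    "throughput" (higher is better) and "latency" (lower is better).
    """
    frontier = []
    for c in configs:
        dominated = any(
            o["throughput"] >= c["throughput"]
            and o["latency"] <= c["latency"]
            and (o["throughput"] > c["throughput"] or o["latency"] < c["latency"])
            for o in configs
        )
        if not dominated:
            frontier.append(c)
    return frontier


# Hypothetical simulator outputs for four candidate serving configurations.
simulated = [
    {"name": "A", "throughput": 100, "latency": 50},
    {"name": "B", "throughput": 120, "latency": 80},
    {"name": "C", "throughput": 90, "latency": 60},   # worse than A on both
    {"name": "D", "throughput": 120, "latency": 70},  # same throughput as B, lower latency
]

if __name__ == "__main__":
    print([c["name"] for c in pareto_frontier(simulated)])
```

Only the configurations that survive this filter would then be verified on real GPUs, which is where the savings in GPU time come from.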

From Reference Blueprints to Autonomous Factory Operations
NVIDIA positions DSX as more than a set of tools: it is a playbook for AI factory design and autonomous operations. Factory Operations Blueprints within the DSX ecosystem provide reference designs for how compute clusters, facilities, and DSX OS components should be wired together and managed over their lifecycle. At the energy and cooling layer, DSX MaxLPS focuses on maximizing token performance per megawatt within a fixed power envelope by pairing 45-degree-Celsius liquid cooling with in-rack performance-per-watt optimizations, enabling up to 40% more GPUs to run at efficient operating points with limited impact on workload performance. When combined with DSX OS, serving simulators, and digital twins, these blueprints nudge AI factories toward self-optimizing behavior: workloads inform configuration changes, infrastructure models validate them, and the operating software applies them, turning what used to be manual capacity engineering into a tunable, software-driven control loop.
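
The fixed-power-envelope trade-off can be illustrated with some back-of-the-envelope arithmetic. The sketch below is not an NVIDIA formula: the 1 MW envelope, the per-GPU wattages, and the 5% per-GPU throughput loss are hypothetical numbers chosen so that the GPU count rises by the "up to 40%" figure quoted above. It also ignores cooling, networking, and other non-GPU power draw.

```python
def gpus_in_envelope(envelope_watts, watts_per_gpu):
    """Whole GPUs that fit in a fixed power envelope (GPU power only)."""
    return envelope_watts // watts_per_gpu


def relative_tokens_per_megawatt(gpu_ratio, per_gpu_throughput_ratio):
    """Token output per megawatt relative to baseline at the same envelope.

    gpu_ratio: new GPU count divided by baseline GPU count
    per_gpu_throughput_ratio: new per-GPU throughput divided by baseline
    """
    return gpu_ratio * per_gpu_throughput_ratio


ENVELOPE_W = 1_000_000      # hypothetical 1 MW envelope
BASELINE_W_PER_GPU = 1000   # hypothetical baseline operating point
EFFICIENT_W_PER_GPU = 714   # hypothetical lower-power operating point

baseline_gpus = gpus_in_envelope(ENVELOPE_W, BASELINE_W_PER_GPU)
efficient_gpus = gpus_in_envelope(ENVELOPE_W, EFFICIENT_W_PER_GPU)
gpu_ratio = efficient_gpus / baseline_gpus

# Assume (hypothetically) each GPU loses 5% throughput at the lower power point.
gain = relative_tokens_per_megawatt(gpu_ratio, 0.95)

if __name__ == "__main__":
    print(baseline_gpus, efficient_gpus, round(gain, 2))
```

Under these assumptions, 40% more GPUs at 95% of baseline per-GPU throughput yields about 1.33 times the tokens per megawatt. The real gain depends on how much throughput each GPU actually gives up at the lower operating point.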
