What NVIDIA DSX Is and Why AI Factories Need It
NVIDIA DSX is a full-stack AI software and infrastructure platform that unifies chips, systems, facilities, and partner technologies so organizations can design, simulate, build, and operate AI factories that generate intelligence tokens more efficiently at scale. Instead of treating compute, networking, storage, power, and cooling as separate projects, the NVIDIA DSX platform presents a single architecture that links these layers into one AI factory infrastructure. This matters because AI is now treated as core infrastructure, and demand for large-scale models pushes data centers toward gigawatt-scale deployments. DSX aligns the five-layer stack of energy, chips, infrastructure, models, and applications, giving infrastructure builders a consistent way to plan and run their environments. As Jensen Huang explains, “We’re not just shipping chips — we’re giving every infrastructure builder a complete playbook to build AI factories.”

A Full-Stack AI Factory Platform: From Design to Operations
The NVIDIA DSX platform integrates software, reference designs, simulation, and partner systems into one coherent framework for AI factory infrastructure. DSX Reference Design provides generation-specific, validated architectures that cover compute, networking, storage, cluster layout, and facility infrastructure. This lets teams start from tested blueprints rather than guesswork. DSX Sim adds a high-fidelity simulation layer so operators can model power, cooling, and workload behavior before installing hardware. According to NVIDIA, with DSX “you can simulate the entire factory before you spend a dollar” and validate performance before a single rack arrives. These capabilities support faster planning cycles, shorter time to first production, and better predictability at scale. By tying simulation, design, and operations together, DSX moves AI infrastructure scaling from trial-and-error toward repeatable, template-driven deployment.
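The idea of checking power and cooling before hardware arrives can be illustrated with a back-of-envelope calculation. The sketch below is not DSX Sim and does not use any NVIDIA API; it is a minimal model that assumes a constant PUE (power usage effectiveness, the ratio of facility power to IT power). The names `RackPlan`, `estimate_power`, and `fits_budget`, and all numbers, are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class RackPlan:
    """Hypothetical planning inputs for one data hall."""
    racks: int          # number of racks
    kw_per_rack: float  # peak IT load per rack, in kW
    pue: float          # facility power / IT power (assumed constant)


def estimate_power(plan: RackPlan, utilization: float) -> dict:
    """Estimate IT, facility, and overhead power (kW) at a given utilization."""
    if not 0.0 <= utilization <= 1.0:
        raise ValueError("utilization must be between 0 and 1")
    it_kw = plan.racks * plan.kw_per_rack * utilization
    facility_kw = it_kw * plan.pue
    return {
        "it_kw": it_kw,
        "facility_kw": facility_kw,
        # Overhead covers cooling, power conversion losses, and similar loads.
        "overhead_kw": facility_kw - it_kw,
    }


def fits_budget(plan: RackPlan, utilization: float, budget_kw: float) -> bool:
    """Check whether the estimated facility draw stays within the site budget."""
    return estimate_power(plan, utilization)["facility_kw"] <= budget_kw


if __name__ == "__main__":
    plan = RackPlan(racks=100, kw_per_rack=120.0, pue=1.2)
    # Sweep a few utilization levels against a hypothetical 14 MW site budget.
    for u in (0.5, 0.75, 1.0):
        print(u, estimate_power(plan, u), fits_budget(plan, u, 14_000.0))
```

A real simulation would model thermal dynamics, workload traces, and component-level behavior; this sketch only shows the kind of question such a tool answers before deployment.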

DSX OS and MaxLPS: Operating AI Factories for Tokens per Watt
At the operational layer, DSX OS and DSX MaxLPS bring open, modular tools for running AI factories efficiently. DSX OS is purpose-built full-stack AI software for multi-tenant AI factories, covering lifecycle management, intelligent scheduling, runtime consistency, health automation, resiliency, and platform services. It is open source and extensible, so operators can integrate it into existing platforms instead of rewriting their stack. DSX MaxLPS focuses on tokens per watt by combining 45 °C liquid cooling with in-rack optimizations. NVIDIA reports that this approach allows operators to run up to 40% more GPUs at their most energy-efficient point within a fixed power budget, with minimal impact on workload performance. Together, these components improve power efficiency, reduce token cost, and support continuous, reliable operations even as AI infrastructure scaling pushes toward gigawatt-scale deployments.
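The arithmetic behind running more GPUs inside a fixed power budget can be sketched as follows. This is an illustration only, not the DSX MaxLPS method: the per-GPU wattages and throughputs are hypothetical values chosen to show the trade-off, and `OperatingPoint` and `fleet_summary` are names introduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OperatingPoint:
    """Hypothetical per-GPU operating point."""
    gpu_w: int           # power drawn per GPU, in watts
    tokens_per_s: float  # throughput per GPU at that power level


def fleet_summary(budget_w: int, point: OperatingPoint) -> dict:
    """Summarize how many GPUs fit in the budget and what they produce."""
    gpus = budget_w // point.gpu_w  # whole GPUs only
    return {
        "gpus": gpus,
        "fleet_tokens_per_s": gpus * point.tokens_per_s,
        # Tokens per second per watt, i.e. tokens per joule.
        "tokens_per_watt": point.tokens_per_s / point.gpu_w,
    }


if __name__ == "__main__":
    budget_w = 1_000_000  # fixed 1 MW budget for GPUs (assumed)
    full = OperatingPoint(gpu_w=1000, tokens_per_s=10_000.0)      # assumed
    efficient = OperatingPoint(gpu_w=720, tokens_per_s=9_500.0)   # assumed

    for name, point in (("full power", full), ("efficient point", efficient)):
        print(name, fleet_summary(budget_w, point))
```

With these assumed numbers, the lower-power operating point fits more GPUs into the same budget and raises both fleet throughput and tokens per watt, even though each individual GPU is slightly slower. Actual gains depend on the hardware's real power and performance curve.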

Coordinating Power, Facilities, and AI Workloads
NVIDIA DSX treats power and facilities as integral parts of AI factory infrastructure, not side concerns. DSX Flex connects AI factories to power-grid services so workloads can adapt to events such as load shedding, demand response, and price signals. DSX Exchange, built around an MQTT-based communication hub, links building controls, thermal data, power distribution, and IT systems. This gives software components like DSX MaxLPS, DSX OS, and partner tools a shared view of grid events, cooling conditions, and anomalies across the data center. DSX OS components also expose Model Context Protocol (MCP) servers that AI agents can use to discover the whole operational surface as a unified tool catalog. This agentic environment makes it possible, for example, to connect a GPU health alert to a thermal anomaly or a network issue, turning fragmented telemetry into coordinated, automated responses.
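To make the idea of connecting a GPU health alert to a thermal anomaly concrete, here is a minimal correlation sketch. It assumes events have already been received from a telemetry hub and parsed into records; it does not use MQTT, MCP, or any DSX interface. The `Event` fields, the source names, and the 60-second window are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """Hypothetical telemetry record, already parsed from the hub."""
    ts: float     # timestamp in seconds
    rack: str     # rack identifier
    source: str   # e.g. "gpu_health", "thermal", "network"
    detail: str   # free-form description


def correlate(events, window_s=60.0):
    """Group events on the same rack that occur within window_s of the
    first event in the group; keep only groups spanning multiple sources."""
    by_rack = defaultdict(list)
    for ev in sorted(events, key=lambda e: e.ts):
        by_rack[ev.rack].append(ev)

    incidents = []
    for evs in by_rack.values():
        group = [evs[0]]
        for ev in evs[1:]:
            if ev.ts - group[0].ts <= window_s:
                group.append(ev)
            else:
                if len({e.source for e in group}) > 1:
                    incidents.append(group)
                group = [ev]  # start a new candidate group
        if len({e.source for e in group}) > 1:
            incidents.append(group)
    return incidents


if __name__ == "__main__":
    sample = [
        Event(100.0, "rack-a", "thermal", "coolant inlet temperature high"),
        Event(130.0, "rack-a", "gpu_health", "GPU throttling"),
        Event(135.0, "rack-b", "network", "link flap"),
        Event(500.0, "rack-a", "gpu_health", "ECC error"),
    ]
    for incident in correlate(sample):
        print([(e.source, e.detail) for e in incident])
```

Grouping by rack and time window is one simple heuristic; a production system would likely use topology data and richer rules to decide which signals belong to the same incident.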

Modular Architecture for Customized AI Infrastructure Scaling
Although NVIDIA DSX is full-stack, it is built as a modular architecture so organizations can tailor deployments to their needs. Infrastructure builders can begin with DSX Reference Design for a baseline AI factory layout, then layer in DSX Sim for planning, DSX OS for operations, and DSX MaxLPS or DSX Flex where power efficiency and grid integration matter most. Because DSX OS components are open source and extensible, partners can combine them with existing cluster managers, observability tools, or AI platforms instead of replacing everything at once. This modularity is key for AI factory infrastructure that must evolve as models, workloads, and power constraints change. By aligning chips, systems, software, facilities, and partner technologies under one shared architecture, the NVIDIA DSX platform simplifies large-scale intelligence generation while leaving enough flexibility for unique business and technical requirements.
