NVIDIA’s DSX OS Turns AI Factories into Software-Defined Infrastructure

From Data Centers to AI Factories: Defining NVIDIA DSX OS

NVIDIA DSX OS is an AI factory operating system that turns complex, hardware-centric AI infrastructure into an open, modular, software-defined platform for designing, simulating, deploying, and running large-scale token-generating workloads efficiently. Instead of treating chips, racks, power, and cooling as isolated projects, the NVIDIA DSX platform presents them as a single, coordinated AI factory stack that spans energy, compute, infrastructure, models, and applications. DSX brings together open source software libraries, APIs, reference designs, accelerated computing systems, and partner technologies so infrastructure teams can build AI factories with a common architecture. NVIDIA positions DSX as a full-stack AI infrastructure software approach that covers compute, networking, storage, facilities, power, cooling, controls, simulation, and operations. This turns AI capacity planning into a software problem that can be simulated and validated before hardware is ordered, helping enterprises move from bespoke data centers to repeatable AI factory designs.

Inside DSX OS: An Open, Modular AI Factory Operating System

At the core of the NVIDIA DSX platform, DSX OS provides open source, modular software components designed specifically for multi-tenant AI factories. The system is built to coordinate a complex ecosystem: GPUs and servers, facility systems such as power distribution and cooling, grid behavior, and the AI platforms and services that sit on top. According to NVIDIA, these DSX OS components, derived from the company’s own DGX Cloud operations, give partners an off-the-shelf software foundation instead of requiring months of custom development. The aim is to improve tokens per watt, reduce token cost, and make large-scale AI infrastructure more reliable and resilient under continuous workloads. Because the architecture is extensible, enterprises can integrate DSX OS into existing AI infrastructure software and operations tools, gradually shifting to a software-defined AI factory model rather than replacing everything at once.

Maximizing Tokens per Watt: DSX MaxLPS and Power-Aware Design

Power has become the hard limit for AI factories, so NVIDIA pairs DSX OS with DSX MaxLPS to raise output within fixed energy budgets. DSX MaxLPS combines 45-degree-Celsius liquid cooling and in-rack performance-per-watt technologies, allowing operators to run up to 40% more GPUs at their most energy-efficient point with minimal impact on workload performance. In this model, AI infrastructure software is not only scheduling jobs; it is actively managing how available power converts into token generation capacity. DSX ties grid behavior, facility controls, and compute policies into one coordinated control plane instead of leaving power as a separate facilities concern. The result is an AI factory operating system that optimizes tokens per watt across the full stack, from energy and cooling through chips and systems up to the model and application layers, turning energy constraints into a software-optimized variable.
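To make the power-budget reasoning concrete, here is a minimal Python sketch of the arithmetic behind running more GPUs at a lower, more efficient operating point within a fixed budget. It is an illustration only, not DSX MaxLPS or any NVIDIA API: the `OperatingPoint` type, the `plan` function, and every wattage and throughput figure are hypothetical, chosen so that the capped point fits 40% more GPUs into the same budget.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OperatingPoint:
    """A hypothetical per-GPU operating point (all numbers are made up)."""
    name: str
    watts_per_gpu: int
    tokens_per_sec_per_gpu: float


def plan(budget_watts: int, point: OperatingPoint) -> dict:
    """Fit as many GPUs as the fixed power budget allows at this point."""
    gpus = budget_watts // point.watts_per_gpu
    tokens_per_sec = gpus * point.tokens_per_sec_per_gpu
    used_watts = gpus * point.watts_per_gpu
    return {
        "gpus": gpus,
        "tokens_per_sec": tokens_per_sec,
        # Throughput per watt actually drawn ("tokens per watt" in the article).
        "tokens_per_sec_per_watt": tokens_per_sec / used_watts,
    }


# Assumed figures: a 1.4 MW budget, a peak point, and a power-capped point
# that gives up 5% per-GPU throughput for a much lower power draw.
BUDGET_WATTS = 1_400_000
PEAK = OperatingPoint("peak", watts_per_gpu=1400, tokens_per_sec_per_gpu=100.0)
CAPPED = OperatingPoint("capped", watts_per_gpu=1000, tokens_per_sec_per_gpu=95.0)

peak_plan = plan(BUDGET_WATTS, PEAK)
capped_plan = plan(BUDGET_WATTS, CAPPED)

for point, result in ((PEAK, peak_plan), (CAPPED, capped_plan)):
    print(point.name, result)
```

Under these assumed numbers the capped point hosts 1,400 GPUs instead of 1,000 and delivers more total tokens per second from the same 1.4 MW, which is the kind of trade-off the paragraph above describes. Real gains depend on measured power and throughput curves.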

Digital Twins with Vertiv SmartRun: Simulating AI Factories Before Build-Out

NVIDIA’s DSX platform extends beyond software orchestration to simulation, using digital twins to validate AI factories before any physical build-out. Vertiv’s SmartRun overhead converged physical infrastructure system is integrated as a configurable digital twin inside the NVIDIA Omniverse DSX Blueprint. This lets teams design, simulate, and validate power, cooling, controls, and rack infrastructure as a single system rather than through document-based handoffs. By capturing dependencies and configurations in a model, operators can test AI infrastructure behavior, reduce late-stage design changes, and lower integration risk. Vertiv describes this as the first phase of a broader AI factory digital twin roadmap aimed at keeping physical infrastructure in step with accelerated compute innovation. For enterprises, this means the NVIDIA DSX platform evolves into a lifecycle tool: from early configuration and simulation, through deployment and commissioning, to ongoing optimization of AI factory operations.
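The idea of capturing dependencies in a model and checking them before build-out can be illustrated with a toy Python sketch. It does not use Omniverse, the DSX Blueprint, or SmartRun; the `Rack` and `HallDesign` types, the `validate` function, and all capacities are hypothetical, and the check assumes that essentially all rack power ends up as heat the cooling system must reject.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Rack:
    name: str
    power_kw: float  # electrical draw at design load (hypothetical)


@dataclass
class HallDesign:
    power_capacity_kw: float    # usable electrical feed capacity
    cooling_capacity_kw: float  # heat the cooling loop can reject
    racks: list[Rack] = field(default_factory=list)


def validate(design: HallDesign) -> list[str]:
    """Return human-readable issues; an empty list means the checks passed."""
    issues = []
    total_power = sum(rack.power_kw for rack in design.racks)
    # Simplifying assumption: all rack power becomes heat load.
    heat_load = total_power
    if total_power > design.power_capacity_kw:
        issues.append(
            f"power: draw {total_power:.0f} kW exceeds "
            f"capacity {design.power_capacity_kw:.0f} kW"
        )
    if heat_load > design.cooling_capacity_kw:
        issues.append(
            f"cooling: heat load {heat_load:.0f} kW exceeds "
            f"capacity {design.cooling_capacity_kw:.0f} kW"
        )
    return issues


# Hypothetical design: eight 130 kW racks, enough power but not enough cooling.
design = HallDesign(
    power_capacity_kw=1200.0,
    cooling_capacity_kw=1000.0,
    racks=[Rack(f"rack-{i}", 130.0) for i in range(8)],
)
issues = validate(design)
for issue in issues:
    print(issue)
```

Here power and cooling are checked against the same rack list, so a change to the racks immediately surfaces the cooling shortfall. A real digital twin models far more (controls, airflow, transients), but the benefit of a single shared model over separate documents is the same.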

Factory Operations Blueprint: Toward Autonomous Factory Operations

Beyond data center-scale AI factories, NVIDIA is applying the same software-defined idea to physical manufacturing with its Factory Operations Blueprint, codenamed FOX. Rather than a single product, FOX is a reference design for autonomous factory operations that unifies fragmented systems such as PLCs, SCADA, MES, and ERP into a common decision layer. The blueprint specifies how to ingest signals from legacy equipment and modern IoT sensors, feed them into centralized AI models, and close the loop between digital simulation and real-world production. NVIDIA Metropolis provides vision AI for quality inspection, while the overall architecture enables plant-wide intelligence instead of isolated automation cells. By publishing FOX as a reference design, NVIDIA gives integrators and internal teams a starting pattern for autonomous factory systems, reducing deployment complexity and shortening time-to-market for AI-driven production lines built on the NVIDIA DSX platform.
