Stable Audio 3.0 Brings Six-Minute AI Songs and Open-Weight Models to Professional Workflows

Four-Model Lineup: From On-Device Sketches to Studio-Scale AI Music Generation

Stable Audio 3.0 arrives as a four-model family designed to cover everything from sound effects to full-length tracks. The lineup comprises Small SFX, Small, Medium, and Large models, spanning roughly 459 million to 2.7 billion parameters. The three smaller models ship with open weights, meaning developers can download, inspect, and customize them, while the Large model is reserved for API access or paid self-hosting. Under the hood, a new semantic-acoustic autoencoder paired with latent diffusion enables variable-length generation, flexible editing, and more stable musical structure over time. Small SFX targets sound design on consumer hardware, Small focuses on short-form composition on phones and laptops, Medium pushes toward longer and more refined music pieces, and Large is tuned for hosted services that need higher throughput and tighter latency control in production.
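
For developers mapping this lineup into their own tooling, the tiers can be captured in a small lookup structure. The sketch below is illustrative only: the `ModelTier` class, the `LINEUP` tuple, and the `downloadable_models` helper are hypothetical names, not part of any Stability AI SDK, and the fields record only what the lineup description above states.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelTier:
    """One entry in the lineup (hypothetical structure, not an official schema)."""
    name: str
    open_weights: bool  # True if the weights can be downloaded
    focus: str          # intended use, as described in the article


# Only facts stated in the article are recorded; per-model parameter
# counts are not given there, so they are deliberately left out.
LINEUP = (
    ModelTier("Small SFX", True, "sound design on consumer hardware"),
    ModelTier("Small", True, "short-form composition on phones and laptops"),
    ModelTier("Medium", True, "longer, more refined music pieces"),
    ModelTier("Large", False, "hosted services needing throughput and latency control"),
)


def downloadable_models(lineup=LINEUP):
    """Return the names of the tiers that ship with open weights."""
    return [tier.name for tier in lineup if tier.open_weights]


if __name__ == "__main__":
    print(downloadable_models())
```

A registry like this makes it easy to gate features, for example offering local inference only for tiers where `open_weights` is true.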

Six-Minute Songs: Why Longer Generation Matters for Professional Music Workflows

The standout upgrade in Stable Audio 3.0 is duration. The Small model now generates up to two minutes of audio, while the Medium and Large models reach six minutes and 20 seconds. That leap more than doubles the length possible in earlier Stable Audio releases and turns AI music generation from a sketching tool into a full-song composer. For producers, this unlocks single-pass generation of complete arrangements: intros, verses, choruses, breakdowns, and outros can all be carried in one coherent output instead of being stitched together from fragments. Because generation length is adjustable down to the second, creators can target precise runtimes for cues, trailers, or streaming-ready tracks. The models also support audio inpainting and extension, allowing users to rework sections or lengthen a piece beyond its original endpoint without restarting, which aligns better with iterative studio workflows.
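
To make the duration limits concrete, here is a minimal sketch of how a tool might split a target runtime into a single generation pass plus an extension. The limits come from the figures above (two minutes for Small, six minutes and 20 seconds for Medium and Large); the `plan_duration` function and its return format are assumptions for illustration and do not describe how Stable Audio's extension feature actually works internally.

```python
# Maximum single-pass durations in seconds, taken from the article.
# Small SFX is omitted because the article does not state its limit.
MAX_SECONDS = {
    "small": 2 * 60,        # 120 s
    "medium": 6 * 60 + 20,  # 380 s
    "large": 6 * 60 + 20,   # 380 s
}


def plan_duration(model: str, target_seconds: int) -> dict:
    """Split a target runtime into a single-pass part and a remainder.

    The remainder is what would have to come from extending the piece;
    how extension is actually performed is outside this sketch.
    """
    if target_seconds <= 0:
        raise ValueError("target_seconds must be positive")
    limit = MAX_SECONDS[model]
    single_pass = min(target_seconds, limit)
    return {
        "single_pass_seconds": single_pass,
        "extension_seconds": target_seconds - single_pass,
    }


if __name__ == "__main__":
    # A 30-second trailer cue fits in one pass on any listed model.
    print(plan_duration("small", 30))
    # A 150-second track exceeds the Small model's two-minute limit.
    print(plan_duration("small", 150))
```

Because runtimes are specified in whole seconds, the same helper covers both short cues and full-length tracks.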

Open-Weight Models, LoRA Tuning, and Local Deployment for Developers

By releasing three of the four models with open weights, Stability AI shifts AI music generation toward a more flexible, developer-friendly ecosystem. Small SFX, Small, and Medium can be downloaded, run locally, and modified, reducing dependence on any single vendor’s cloud. The Small model is light enough to generate complete songs directly on phones or laptops, enabling offline AI music generation rather than brief online previews. For customization, Stable Audio 3.0 supports LoRA (low-rank adaptation) training on the Small and Medium models, so teams can adapt them to a proprietary sound library or brand identity without retraining from scratch. This combination of open-weight models, fine-tuning hooks, and local deployment makes it easier to embed AI music generation into DAWs, game engines, or creator platforms while retaining control over latency, privacy, and integration details.
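
Why low-rank adaptation avoids retraining from scratch can be shown with a little arithmetic. In the general technique, a frozen weight matrix W of shape d_out × d_in is adjusted by a low-rank update (alpha / r) · B · A, where B is d_out × r and A is r × d_in, so only r · (d_in + d_out) values are trained. The sketch below illustrates that general idea in plain Python; the layer sizes, rank, and function names are illustrative assumptions and say nothing about Stable Audio's actual architecture or training code.

```python
def lora_param_counts(d_in: int, d_out: int, rank: int) -> tuple:
    """Return (full_matrix_params, lora_trainable_params) for one layer."""
    full = d_in * d_out
    lora = rank * (d_in + d_out)
    return full, lora


def lora_delta(B, A, alpha: float, rank: int):
    """Compute the low-rank weight update (alpha / rank) * B @ A.

    B is a d_out x rank matrix and A is a rank x d_in matrix,
    both given as nested lists.
    """
    scale = alpha / rank
    d_in = len(A[0])
    return [
        [scale * sum(row[k] * A[k][j] for k in range(rank)) for j in range(d_in)]
        for row in B
    ]


if __name__ == "__main__":
    # Hypothetical 1024 x 1024 layer adapted at rank 8.
    full, lora = lora_param_counts(1024, 1024, 8)
    print(f"full: {full}, LoRA: {lora}, ratio: {lora / full:.4%}")

    # Tiny rank-1 example: the 2 x 2 update is built from only 4 numbers.
    print(lora_delta(B=[[1], [2]], A=[[3, 4]], alpha=1.0, rank=1))
```

For the hypothetical layer above, the adapter trains 16,384 values instead of 1,048,576, which is the kind of saving that makes adapting a model to a proprietary sound library practical on modest hardware.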

Licensed Training Data and Enterprise Licensing for Commercial Use

Stable Audio 3.0’s training data strategy directly targets the copyright disputes surrounding earlier AI music tools. Stability AI says the models were trained on licensed and Creative Commons material, including AudioSparx and Freesound, with filtering to remove unauthorized copyrighted music. The company also points to label agreements and positions Stable Audio 3.0 as a safer choice than models trained on unlicensed catalogs. Licensing is split between the Stability AI Community License and an Enterprise License. Under the community terms, users retain ownership of their outputs and can commercialize them, while organizations with more than $1 million in annual revenue are required to obtain an enterprise license to use the models commercially. For larger companies, the enterprise route also provides a clearer commercial-rights path and legal indemnification, which is crucial for integrating AI music generation into consumer-facing products at scale.
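
The revenue rule described above reduces to a simple check. The following is a rough sketch of that rule as summarized in this article, not a reading of the license text itself; the `required_license` function and its string labels are hypothetical.

```python
# Threshold as summarized in the article: organizations with more than
# $1 million in annual revenue need an enterprise license for commercial use.
ENTERPRISE_REVENUE_THRESHOLD_USD = 1_000_000


def required_license(annual_revenue_usd: float, commercial_use: bool) -> str:
    """Return which license tier the article's summary implies.

    This simplifies the rule to two inputs; the actual license terms
    may include conditions that are not modeled here.
    """
    if commercial_use and annual_revenue_usd > ENTERPRISE_REVENUE_THRESHOLD_USD:
        return "enterprise"
    return "community"


if __name__ == "__main__":
    print(required_license(250_000, commercial_use=True))
    print(required_license(5_000_000, commercial_use=True))
```

Note that the comparison is strictly greater than, matching the "more than $1 million" wording, so an organization at exactly the threshold stays under the community terms in this sketch.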

Democratizing AI Music Generation and Reducing Vendor Lock-In

In a field increasingly dominated by fully proprietary services, Stable Audio 3.0’s open-weight models are a strategic counterpoint. Developers and smaller creators gain the ability to host models themselves, tune them to niche genres, or build custom interfaces without being tied to a single cloud API or pricing model. For startups and creator platforms, this lowers switching costs: they can prototype locally with Small or Medium, then decide whether to move heavy workloads to Stability’s Large model via API or keep everything in-house. The clearer licensing story around training data helps differentiate Stable Audio from rivals facing lawsuits, and the focus on open weights signals a long-term bet on an ecosystem rather than a closed service. Together, six-minute songs, flexible deployment, and open weights position Stable Audio 3.0 as both a creative tool and an infrastructure layer for the next wave of AI music products.
