Microsoft MAI models expand the AI model foundry

What the Microsoft MAI Models Are and Why They Matter

Microsoft MAI models are a new in-house family of large-scale AI systems for reasoning, coding, image generation, voice, and transcription. Developers can test them in Microsoft’s AI model foundry, where private preview access lets teams tune the models and integrate them into customized, multimodal applications. At Build, Microsoft framed this launch as a move from depending on partner frontier models toward owning more of the underlying AI stack. MAI-Thinking-1, the flagship reasoning model, is positioned as matching the performance of Anthropic’s Claude Opus 4.6 on key software engineering benchmarks. These systems are the same engines behind Copilot, Bing, PowerPoint, GitHub Copilot, and Azure Speech, now exposed through Foundry so teams can evaluate Microsoft-owned options alongside OpenAI, Anthropic, and Google models. For developers, that means a single place to compare multimodal AI tools, apply governance, and shape model behavior more directly than prompt engineering alone allows.

Microsoft’s New MAI Models Bring Private Multimodal AI to Developers

MAI-Thinking-1: Long-Context Reasoning for Complex Workflows

MAI-Thinking-1 sits at the center of the Foundry push as Microsoft’s in-house reasoning engine. It uses a sparse Mixture-of-Experts design, which activates only a subset of expert subnetworks for any given input so that the model can hold a large total capacity while spending far less compute per step. Microsoft lists 35 billion active parameters, roughly 1 trillion total parameters, and a 256K-token context window, plus function calling and compatibility with the Chat Completions API. That long context allows a single prompt to include large codebases, full contracts, or extended instructions without fragmenting the task. According to Microsoft AI, “For the first time developers will be able to tune the weights of the model themselves,” giving teams deeper adaptation than prompt tweaks or retrieval alone. MAI-Thinking-1 is aimed at chained reasoning tasks, long-context analysis, and code-heavy workflows where traceability and control over model behavior matter as much as raw benchmark scores.
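To make the sparse Mixture-of-Experts idea concrete, here is a minimal Python sketch of top-k expert routing. It is a toy illustration, not Microsoft's implementation: the four scalar "experts", the gate scores, and the choice of k=2 are all hypothetical, and the article does not describe how MAI-Thinking-1 actually routes inputs (production MoE layers commonly route each token separately). The last lines work out what share of the stated parameter count is active, using the figures quoted above.

```python
import math


def softmax(scores):
    """Convert raw gate scores into probabilities that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def top_k_route(gate_scores, k):
    """Pick the k highest-scoring experts and renormalize their weights."""
    probs = softmax(gate_scores)
    chosen = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in chosen)
    return [(i, probs[i] / norm) for i in chosen]


def moe_forward(x, experts, gate_scores, k):
    """Run only the selected experts and blend their outputs by gate weight."""
    routes = top_k_route(gate_scores, k)
    output = sum(weight * experts[i](x) for i, weight in routes)
    return output, routes


# Hypothetical toy experts: each one just scales its input.
experts = [lambda x, s=s: s * x for s in (1.0, 2.0, 3.0, 4.0)]
# Hypothetical gate scores; in a real model a learned router produces these.
gate_scores = [0.1, 2.0, 0.3, 1.5]

output, routes = moe_forward(10.0, experts, gate_scores, k=2)
used = [i for i, _ in routes]
print("experts used:", used)
print("blended output:", round(output, 3))

# Share of parameters active per step, from the figures quoted in the article
# (35 billion active, roughly 1 trillion total).
active_fraction = 35e9 / 1e12
print(f"active share of parameters: {active_fraction:.1%}")
```

Only two of the four toy experts run for this input, which mirrors the trade-off described above: total capacity grows with the number of experts, while the compute per step tracks the much smaller active set.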

Code Generation Models Turn ‘Vibe Coding’ into a Platform Feature

Beyond reasoning, the MAI family brings dedicated code generation models into the AI model foundry for the first time. MAI-Code-1-Flash, a 5-billion-parameter coding system, converts written descriptions into working source code for applications and websites, a process Microsoft has described as “vibe coding.” Both MAI-Code-1 and MAI-Code-1-Flash are integrated into GitHub Copilot, Visual Studio Code, and the broader Microsoft developer stack, but Foundry access lets teams treat them as configurable components rather than fixed assistants. Teams evaluating code generation models can now compare the MAI-Code models with OpenAI or Anthropic coding tools under a single governance and deployment layer. With MAI-Thinking-1 handling long-context reasoning over entire repositories and the MAI-Code models handling snippet creation and refactoring, teams can begin to design end-to-end, reasoning-heavy development workflows inside one multimodal AI platform.

Image, Voice, and Transcription Bring Full Multimodal AI Tools

The rest of the seven-model MAI lineup rounds out Foundry with image, voice, and transcription capabilities aimed at real product workflows. MAI-Image-2.5 and its Flash variant add text-to-image generation and editing, with control-with-preservation features for adjusting style or individual elements without losing the core layout. Microsoft says MAI-Image-2.5 now ranks third among image generation model families in one public Arena ranking. MAI-Voice-2 is a multilingual text-to-speech and voice cloning model spanning more than 15 languages, while MAI-Transcribe-1.5 supports 43 languages and is presented as delivering up to five times faster transcription with better handling of domain-specific terminology. Together, these multimodal AI tools let developers build apps that move from text prompts to images, synthetic voices, and transcripts within a single Microsoft-owned model family, rather than stitching together disconnected third-party services.

Private Developer Preview and the Future of the AI Model Foundry

All seven MAI systems enter Foundry in a staged rollout, with MAI-Thinking-1 in private preview for Foundry customers and a public MAI Playground preview planned later. Developers and enterprise teams gain early preview access to reasoning, code, image, voice, and transcription models under the same governance controls already used for OpenAI, Anthropic, and Google options. This turns Foundry into more than a catalog: it becomes a decision hub for comparing in-house and partner models by performance, latency, and integration fit. Microsoft says the models are trained on commercially licensed data without distillation from third-party systems, a stance aimed at customers that weigh IP and provenance concerns. As MAI models expand to platforms like OpenRouter, Fireworks, and Baseten, Foundry’s role may shift from a Microsoft-only destination to one part of a broader, multi-provider AI stack that still offers deeper tuning of Microsoft-owned models.