From OpenAI Power Player to Human-Centered AI Founder
Mira Murati, best known for steering the development of ChatGPT as CTO at OpenAI and briefly serving as interim CEO during the company’s board turmoil, has re-emerged with a bold new vision. In February 2025, she founded Thinking Machines Lab, a startup laser-focused on human-centered AI and more natural AI collaboration tools. Rather than just building another large model, Murati wants to fix how people and machines actually work together. Thinking Machines’ early story has already been marked by intense competition for talent and technology. Meta reportedly attempted to acquire the startup in 2025 and, after being turned down, hired away seven founding members. Murati responded by recruiting PyTorch creator Soumith Chintala as CTO, underscoring her ambition to build a world-class research organization around next-generation interaction models and continuous human-AI collaboration.
Inside Thinking Machines’ 0.4-Second ‘Interaction Models’
Thinking Machines Lab has introduced what it calls “interaction models,” designed to handle conversation, video, and collaboration in real time rather than turn by turn. Its flagship model, TML-Interaction-Small, is engineered to respond in just 0.40 seconds while simultaneously processing audio, video, and text. That latency beats Google’s Gemini-3.1-flash-live at 0.57 seconds and OpenAI’s GPT-realtime-2.0 at 1.18 seconds, positioning Murati’s startup as a serious contender in low-latency, multimodal AI. Technically, these models break interactions into 200-millisecond chunks, allowing the system to listen, watch, think, and talk at once. One subsystem manages conversational flow while another handles heavier reasoning tasks in the background. In demos, the model counted exercise reps from video, translated speech in real time, and even noticed posture changes, all while maintaining a fluid dialogue, mirroring how humans juggle perception and conversation concurrently.
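The chunked, two-path design described above can be sketched as a toy loop. Everything here is an illustrative assumption rather than Thinking Machines’ actual architecture: the class name, the one-second reasoning window, and the string outputs are invented for the example; only the 200-millisecond chunk size comes from the article.

```python
# Hypothetical sketch: a fast conversational path reacts to every 200 ms
# chunk, while a slower path accumulates context for heavier reasoning.
# Names and thresholds are illustrative assumptions, not a real API.
from dataclasses import dataclass, field

CHUNK_MS = 200  # the article describes 200-millisecond interaction chunks


@dataclass
class InteractionLoop:
    """Splits incoming multimodal chunks between a fast conversational
    path and a slower background-reasoning path."""
    fast_replies: list = field(default_factory=list)
    deep_results: list = field(default_factory=list)
    _buffer: list = field(default_factory=list)

    def ingest(self, chunk: str) -> None:
        # Fast path: acknowledge every chunk immediately to keep
        # the dialogue fluid.
        self.fast_replies.append(f"ack:{chunk}")
        # Slow path: accumulate context and run heavier reasoning once
        # enough chunks (here, five chunks = one second) have arrived.
        self._buffer.append(chunk)
        if len(self._buffer) * CHUNK_MS >= 1000:
            self.deep_results.append("summary:" + "|".join(self._buffer))
            self._buffer.clear()


loop = InteractionLoop()
for i in range(6):
    loop.ingest(f"c{i}")
print(len(loop.fast_replies))  # 6 fast acknowledgements, one per chunk
print(len(loop.deep_results))  # 1 background summary (after 5 chunks)
```

The point of the split is that neither path blocks the other: the conversational path never waits for the expensive reasoning step, which is the property the demos (rep counting and posture tracking during fluid dialogue) appear to rely on.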
Solving the ‘Bandwidth Bottleneck’ Between People and AI
Murati’s team argues that the biggest limitation in current AI assistants is not raw intelligence, but interaction design. Today’s systems typically wait for you to finish typing or speaking, process the entire input, and then reply—a stop-start pattern that caps how much of your intent and context reaches the model. Thinking Machines calls this the “bandwidth bottleneck” between humans and AI. Interaction models try to remove that bottleneck by keeping the system always-on and mid-conversation. Instead of treating each prompt as a separate transaction, the AI continuously ingests speech, video, and text, adjusting its responses as new information arrives. This supports more nuanced, back-and-forth collaboration that feels closer to working with another person than querying a tool. The goal is not just faster answers, but richer, higher-throughput exchanges where human intent can be expressed more fully and acted on more quickly.
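The contrast between today’s stop-start pattern and an always-on stream can be illustrated with a toy example. The function names and output strings are hypothetical; the sketch only shows the structural difference, namely one reply per complete utterance versus a running interpretation updated as each fragment arrives.

```python
# Illustrative contrast between a turn-based assistant and a streaming
# one; all names are hypothetical, sketching the "bandwidth bottleneck"
# idea rather than any real API.

def turn_based_reply(full_utterance: str) -> str:
    """Waits for the complete input, then answers once (stop-start)."""
    return f"final answer to: {full_utterance}"


def streaming_replies(fragments):
    """Revises a running interpretation as each fragment arrives, so
    intent reaches the model continuously instead of in one batch."""
    context = []
    for fragment in fragments:
        context.append(fragment)
        yield f"interim view of: {' '.join(context)}"


utterance = ["book", "a", "flight", "to", "Paris"]
print(turn_based_reply(" ".join(utterance)))  # one reply at the end
updates = list(streaming_replies(utterance))
print(len(updates))  # 5 incremental updates, one per fragment
```

In the turn-based version the model sees nothing until the user finishes; in the streaming version it can already act on “book a flight” before “to Paris” arrives, which is the higher-throughput exchange the article describes.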
A New Model for AI Collaboration Tools at Work and Home
If Thinking Machines can deliver on its research preview, the implications for enterprise automation and personal productivity could be significant. In workplaces, a continuously attentive, 0.4-second AI could shadow meetings, track action items, monitor dashboards, and respond to voice instructions without interrupting the flow of discussion. That opens the door to human-centered AI workflows where teams treat the system as a real-time collaborator rather than a delayed-response assistant. For individuals, interaction models hint at productivity tools closer to the AI companion imagined in the film “Her”: an assistant that can watch your screen, listen as you talk, and proactively help as tasks evolve. While the technology is not yet publicly available, Thinking Machines plans limited access for research partners in the coming months and a broader launch later. For a 15-month-old startup already reportedly valued in the tens of billions of dollars, this research preview is a defining first look at its ambitions.
