What the Siri AI upgrade means and why Apple changed course
Apple’s redesigned Siri is a full rebuild of the voice assistant as a generative AI system that combines on‑device processing with cloud‑based large language models. Part of the work moves from Apple’s own infrastructure to Google Gemini running on Nvidia chips, with the aim of improving speed, capability, and context awareness. The rebuild, expected to arrive as a beta with iOS 27 around September 2026, is Apple’s most ambitious overhaul of Siri to date. Reports indicate Apple initially built proprietary AI servers for Siri, tied to its Private Cloud Compute platform and Apple Silicon. But when the company tested Gemini‑scale models, Google’s existing cloud setup reportedly delivered faster results than Apple’s in‑house hardware. That performance gap pushed Apple toward a hybrid model: simple tasks are handled locally, and complex, multi‑step Siri requests are sent to Nvidia chips in Google Cloud for processing.

Inside the new hybrid Siri stack: Apple devices, Google Cloud, Nvidia Blackwell
The Apple Siri redesign centers on a hybrid architecture that splits work between the device and the cloud. Everyday requests such as setting timers, toggling settings, or fetching simple facts will run on Apple devices using on‑device Apple Intelligence models. For heavier jobs—summarizing documents, answering layered questions, or coordinating actions across apps—Siri will send encrypted prompts to Google Cloud, where Google’s Gemini models run on Nvidia’s Blackwell B200 GPUs. According to The Information, Apple is paying around USD 1 billion (approx. RM4.6 billion) per year for access to a customized Gemini model with about 1.2 trillion parameters, far beyond Apple’s estimated 150‑billion‑parameter in‑house cloud models. Nvidia’s Blackwell chips, supplied through Google’s infrastructure, deliver higher memory bandwidth and faster inference than prior generations, giving the upgraded Siri the headroom it needs to compete with today’s chatbot‑style assistants.
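
The device/cloud split described above can be pictured as a simple routing decision. The following is a minimal sketch only, not Apple's implementation: the task labels, the `Request` type, and the `choose_route` function are all hypothetical names invented for illustration, and the rule that anything not known to be simple goes to the cloud is an assumption.

```python
from dataclasses import dataclass
from enum import Enum


class Route(Enum):
    ON_DEVICE = "on_device"
    CLOUD = "cloud"


# Hypothetical labels for the "everyday" requests named in the article
# (timers, settings toggles, simple facts). Not a real Apple API.
SIMPLE_TASKS = {"set_timer", "toggle_setting", "simple_fact"}


@dataclass
class Request:
    task: str    # hypothetical task label
    prompt: str  # the user's request text


def choose_route(request: Request) -> Route:
    """Pick where a request would run under the assumed policy."""
    if request.task in SIMPLE_TASKS:
        return Route.ON_DEVICE
    # Assumption: anything not known to be simple (summaries, layered
    # questions, cross-app actions) is sent to the cloud model.
    return Route.CLOUD


if __name__ == "__main__":
    for task in ("set_timer", "summarize_document"):
        print(task, "->", choose_route(Request(task, "example")).value)
```

In practice the decision would presumably depend on far more than a task label, for example model confidence, device capability, or connectivity; the article does not say how Apple makes the call.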

Why Apple abandoned its own AI servers for Google and Nvidia
Apple has a long history of vertical integration, building its own chips, software, and services. For Siri’s AI comeback, though, that model hit a wall. Apple reportedly tried to run Gemini‑class models on its Private Cloud Compute architecture, powered by Apple Silicon, but real‑world tests showed Google’s infrastructure could return answers faster. With competitors already shipping advanced assistants, Apple faced a choice: spend more time scaling its own AI servers, or rent capacity from a player that had already solved the problem. The company chose the latter, routing some Siri queries to Nvidia chips in Google Cloud so it could match or beat rival performance sooner. This move does not scrap Apple’s in‑house work, since the company continues to run on‑device models and smaller cloud systems, but it signals a tactical shift: Apple is willing to outsource part of the stack when the performance gap is too large to ignore.

Performance gains vs Apple Intelligence privacy promises
The new setup puts performance and privacy into tension. On one hand, Nvidia chips in Google Cloud allow Siri to tap into large, capable models without slowing to a crawl. On the other, Apple had publicly stated that Apple Intelligence requests would be processed only on Apple servers, supported by its Private Cloud Compute system. Using Google Cloud complicates that promise and raises privacy questions for Apple Intelligence. To limit risk, Apple will keep simpler Siri tasks on‑device and use Nvidia’s confidential computing features in the cloud, which keep data encrypted even while it is processed. Apple already routes Apple Intelligence prompts through Private Cloud Compute with strict rules that prevent retention for training, and reports suggest similar safeguards will apply when external infrastructure is involved. For users, the trade‑off is clear: faster, smarter Siri features, delivered through a more complex privacy story than Apple originally described.

Strategic pivot and what to expect from Siri in 2026
Apple’s decision to run a redesigned Siri on Nvidia chips in Google Cloud is a major strategic pivot away from strict end‑to‑end control. It shows that, for large‑scale generative AI, even a company known for vertical integration will lean on external platforms when that yields better performance and a shorter time‑to‑market. The 2026 rollout is expected to start as a beta alongside iOS 27, with limited availability that will likely expand as Apple refines its models and infrastructure. Users can expect better context handling, stronger multi‑step reasoning, and deeper integration with apps like Mail, Messages, Calendar, Photos, and Notes, as Siri becomes more aware of personal context. Apple is also reportedly using model distillation, training its own smaller models on Gemini’s responses. That could let Apple pull more processing back onto its own systems later, even as the first wave relies heavily on Google’s AI cloud.






