MilikMilik

Inside Google I/O: Gemini 3.5 Flash, Omni Video, and a Reinvented AI-First Search

Gemini 3.5 Flash and Omni Video Signal a New Phase of AI

At Google I/O, Gemini moved from headline product to invisible infrastructure. The company officially launched Gemini 3.5 Flash, a lightweight model tuned for speed and efficiency that now powers key AI experiences, including the refreshed AI Mode in Search. While a more advanced Gemini 3.5 Pro is expected later, Flash already positions Google to compete with frontier models while running at far lower latency. On the creative side, Google introduced Gemini Omni, a multimodal system with a strong focus on video. Omni can ingest text, images, audio, and existing clips to generate or dramatically edit video, from changing environments and backgrounds to adding cinematic effects through simple descriptions. It also leans into personalisation, with avatar-like and selfie-based editing features, all marked with SynthID watermarks so AI-generated footage remains identifiable. Together, Gemini 3.5 Flash and Gemini Omni showcase Google’s push to make AI both faster behind the scenes and more expressive on screen.

AI Search Updates: From Intelligent Box to Unified Conversational Experience

Search received one of its biggest overhauls yet, cementing AI as the default interface to the web. Google is rolling out an “intelligent, AI-powered Search box” globally, which goes beyond autocomplete to anticipate intent and help users phrase questions. Instead of relying solely on keywords, people can now drop in images, video files, or even whole Chrome tabs as direct inputs. AI Overviews have been expanded with back-and-forth conversational responses so users can refine questions without starting over, and Search results can include AI-generated visuals or short explanatory videos. AI Mode, now powered by Gemini 3.5 Flash, sits alongside this experience for deeper follow-up queries and corrections. Together, the intelligent search box, AI Overviews, and AI Mode form a more unified, conversational search journey designed to keep users within Google’s results while delivering richer, multi-format explanations in context.

Gemini Everywhere: Gmail, Docs, YouTube and Agentic Productivity with Spark

Google’s strategy is to make Gemini less of a destination and more of a layer across everyday tools. Gmail is gaining a live voice mode, letting users speak naturally about their inbox and have Gemini summarise threads or draft responses. Docs introduces Docs Live, where people can brainstorm out loud and watch Gemini convert rough ideas into structured documents in real time. YouTube adds “Ask YouTube,” a feature that lets users ask conversational questions to jump directly to relevant moments in videos instead of scrubbing manually. Beyond individual apps, the new Gemini Spark agent brings automation to the ecosystem. Running in the cloud, Spark can organise schedules, pull together notes into Docs, monitor credit card statements for hidden subscriptions, track school emails, and interact with third-party services such as OpenTable or Instacart. An Agent Payments Protocol ensures Spark asks for approval before final purchases or sending sensitive communications, reflecting Google’s attempt to balance autonomy with user control.

Android XR Glasses and the Hardware–Software AI Convergence

Google used I/O to preview how Gemini will live beyond phones and laptops, highlighting a new wave of Android XR glasses. In collaboration with partners including Gentle Monster and Warby Parker, Google showcased two models of Android XR smart glasses that tightly integrate Gemini into real-world experiences. Wearers can chat with Gemini hands-free, receive real-time audio translation in the speaker’s own voice, and see translated text overlaid directly onto their field of view. The glasses can also capture photos on the go, hinting at future use cases where Gemini could provide contextual information or memory-like recall based on what the user sees. Combined with earlier teases of holographic communication and live-translating AR concepts, these Android XR glasses underscore Google’s broader push: blending AI, Search, and Gemini into ambient computing devices where assistance is always available but increasingly invisible.

Ecosystem and Infrastructure: AI Studio, Spark for Developers, and TPU-Backed Scaling

Behind these consumer features is an ecosystem play aimed at developers and enterprises. New tools like the AI Studio app and Gemini Spark are positioned not just as user-facing assistants, but as platforms that creators and developers can build on, connecting workflows across Gmail, Docs, Drive, YouTube, and third-party services. To sustain this expansion, Google and Blackstone announced a major AI cloud venture focused on extending access to Google’s custom Tensor Processing Units as compute-as-a-service. Backed by an initial equity investment of USD 5 billion (approx. RM23 billion), the venture plans to bring its first 500 megawatts of data centre capacity online in 2027, with significant scaling to follow. Alongside hardware capacity, Google will supply software and services, extending the same TPU infrastructure already used by partners like Anthropic and Meta. Together, these moves suggest that Gemini’s future depends as much on robust infrastructure and developer tools as on consumer-facing AI features.
