Google - The Search Company Becomes an Agent Company

analyst-post 9 min 2026-05-19

3.2Q+ Monthly tokens processed

7× YoY token growth

8.5M+ Developers on Gemini monthly

$180–190B 2026 capex forecast

900M Gemini app MAUs

Two years ago, Google was processing 9.7 trillion tokens a month. Today that number is 3.2 quadrillion, a 7x year-over-year jump that Sundar Pichai used to open Google I/O 2026. The number is not a vanity metric. It is a workload signal telling you how deeply Gemini has embedded itself into production systems, developer pipelines, and consumer products simultaneously. Pichai framed the moment as a pivot point: people want to see AI working in the products they use every day, not in demonstrations. I/O 2026 was Google's attempt to prove that pivot is complete.
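The two growth figures in that paragraph describe different windows, so it helps to check that they fit together. The sketch below uses only the numbers quoted above and assumes the 7x multiple applies to the most recent twelve months; the implied prior-year figure is derived, not something the keynote numbers here state directly.

```python
import math

# Figures quoted in the paragraph above (tokens per month).
TOKENS_TWO_YEARS_AGO = 9.7e12   # 9.7 trillion
TOKENS_TODAY = 3.2e15           # 3.2 quadrillion
YOY_MULTIPLE = 7.0              # assumption: applies to the latest 12 months

# Total growth across the full two-year window.
two_year_multiple = TOKENS_TODAY / TOKENS_TWO_YEARS_AGO

# Monthly volume one year ago, if today's figure is 7x that level.
implied_prior_year = TOKENS_TODAY / YOY_MULTIPLE

# Growth in the first of the two years, implied by the two numbers above.
first_year_multiple = implied_prior_year / TOKENS_TWO_YEARS_AGO

# Geometric-mean annual multiple across both years.
average_annual_multiple = math.sqrt(two_year_multiple)

print(f"Two-year multiple:        {two_year_multiple:,.0f}x")
print(f"Implied volume a year ago: {implied_prior_year / 1e12:,.0f} trillion")
print(f"First-year multiple:      {first_year_multiple:,.1f}x")
print(f"Average annual multiple:  {average_annual_multiple:,.1f}x")
```

Under those assumptions the two-year multiple is roughly 330x, which implies about 457 trillion tokens a month one year ago and a first-year jump of about 47x. In other words, 7x is a slowdown in the multiple even as the absolute volume added keeps growing.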

For a full read of the infrastructure argument, including the dual-chip TPU 8t and TPU 8i architecture announced at Cloud Next and the agent governance layer from Google Cloud, the prior posts on this site cover that ground in detail. This post focuses on what I/O 2026 adds on top of that foundation, and what it means for enterprise technology leaders evaluating Google's position.

Antigravity Is the Platform, Not a Tool

Most of the consumer announcements at I/O 2026 run on a single internal platform called Antigravity. Google describes it as an agent-first development platform, but the more accurate framing is that Antigravity is the orchestration layer connecting Gemini models to persistent, long-horizon tasks. Gemini Spark, the new personal agent in the Gemini app, runs on it. The custom dashboards and trackers being built inside Search run on it. The new developer release, Antigravity 2.0, ships as a standalone desktop application that lets anyone manage cohorts of autonomous agents.

The business logic is straightforward. Google has spent years building model capability. The constraint was always orchestration: how do you keep an agent working reliably on a complex task across hours or days without human re-prompting? Antigravity is the answer Google built for itself first. Google's internal token processing on its own AI developer tools reached three trillion tokens per day by the time of the keynote, doubling every few weeks from half a trillion in March. That internal scale created the feedback loop that improved Gemini 3.5 Flash before it was released publicly. The sequence matters: Google ate its own cooking at a scale most enterprises won't reach for years, then opened the kitchen.
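The "doubling every few weeks" claim can be sanity-checked against the two endpoints quoted above. This is a rough sketch that assumes constant exponential growth; because the post only says "March", the start date is bracketed with the first and last day of that month, and the keynote date is taken from this post's dateline.

```python
import math
from datetime import date

# Figures quoted in the paragraph above (tokens per day).
START_TOKENS_PER_DAY = 0.5e12   # "half a trillion in March"
END_TOKENS_PER_DAY = 3.0e12     # "three trillion tokens per day" at the keynote

# Assumptions: keynote on the post's dateline; "March" bracketed by its
# first and last day because no exact date is given.
KEYNOTE = date(2026, 5, 19)
EARLIEST_START = date(2026, 3, 1)
LATEST_START = date(2026, 3, 31)


def doubling_time_days(start: float, end: float, elapsed_days: int) -> float:
    """Days per doubling, assuming constant exponential growth."""
    doublings = math.log2(end / start)
    return elapsed_days / doublings


doublings = math.log2(END_TOKENS_PER_DAY / START_TOKENS_PER_DAY)
slowest = doubling_time_days(
    START_TOKENS_PER_DAY, END_TOKENS_PER_DAY, (KEYNOTE - EARLIEST_START).days
)
fastest = doubling_time_days(
    START_TOKENS_PER_DAY, END_TOKENS_PER_DAY, (KEYNOTE - LATEST_START).days
)

print(f"Doublings between endpoints: {doublings:.2f}")
print(f"Doubling time: {fastest:.0f} to {slowest:.0f} days")
```

A sixfold increase is about 2.6 doublings, which works out to a doubling time of roughly 19 to 31 days depending on where in March the starting figure was measured. That is consistent with "every few weeks".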

Gemini 3.5 Flash: The Cost Argument Is the Strategic Argument

Gemini 3.5 Flash is the first model in the 3.5 series and the centerpiece model announcement of I/O 2026. It surpasses the previous 3.1 Pro across coding, agentic, and multimodal benchmarks while running four times faster on output tokens per second than comparable frontier models. Those performance claims matter. The cost claim matters more.

Google's own framing: top companies are now processing roughly one trillion tokens per day. If they shifted 80 percent of their workloads from other frontier models to Gemini 3.5 Flash, the annual savings would exceed one billion dollars. That is not a performance pitch. That is a procurement argument aimed directly at the CFO conversation happening alongside every AI infrastructure renewal. Google is betting that enterprises running out of token budget by mid-year, which Pichai explicitly named, will rationalize their model mix around Flash for the majority of workloads and reserve higher-cost models for tasks that genuinely require them. Flash becomes the default tier. Everything else becomes a justifiable exception.
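It is worth backing out what that savings claim implies per token. The sketch below is a back-of-envelope calculation, not Google's pricing: it assumes the one trillion tokens per day is a per-company figure, treats the one billion dollars as a floor (the claim is "exceed"), and uses a 365-day year. The function name is illustrative.

```python
# Figures quoted in the paragraph above.
TOKENS_PER_DAY = 1e12          # assumption: per company, not an industry total
SHIFTED_SHARE = 0.80           # share of workloads moved to Gemini 3.5 Flash
ANNUAL_SAVINGS_USD = 1e9       # floor; the claim is that savings exceed this


def implied_saving_per_million(
    tokens_per_day: float,
    shifted_share: float,
    annual_savings_usd: float,
    days_per_year: int = 365,
) -> float:
    """Blended USD saved per million shifted tokens implied by the claim."""
    shifted_tokens_per_year = tokens_per_day * shifted_share * days_per_year
    return annual_savings_usd / shifted_tokens_per_year * 1e6


saving = implied_saving_per_million(
    TOKENS_PER_DAY, SHIFTED_SHARE, ANNUAL_SAVINGS_USD
)
print(f"Implied blended saving: ${saving:.2f} per million tokens")
```

Under those assumptions the claim requires a blended price gap of at least about $3.42 per million tokens between Flash and the models it replaces. That is the number to test against your own input and output token mix before accepting the headline figure.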

Gemini 3.5 Pro is in internal testing and arrives next month. The expectation is that it handles the reasoning-intensive tasks Flash hands off. The two-model architecture within the 3.5 family mirrors the TPU split between training and inference: different tools for different workload economics.

Gemini Spark and the 24/7 Agent Bet

Gemini Spark is the consumer-facing agent built on Gemini 3.5 and the Antigravity harness. It runs on dedicated virtual machines on Google Cloud, operates continuously without requiring the user's device to be active, and starts with integration across Gmail, Docs, and other Workspace apps before expanding to third-party tools via Model Context Protocol later this summer. Beta access goes to Google AI Ultra subscribers in the United States next week.

The design decision worth noting: Spark surfaces its progress through a new Android interface called Android Halo, which shows subtle agent status updates at the top of the phone screen without interrupting whatever the user is doing. That is a deliberate choice about how ambient agents communicate. The alternative, an agent that demands attention to report progress, would undermine the core premise of background autonomy. Google is betting that users want agents to be invisible until the output arrives.

For enterprise buyers, the more consequential signal is the MCP integration roadmap. Spark starting with Google's own tools is the controlled launch strategy. The opening to third-party MCP integrations is the signal that Google intends Spark to function as a general-purpose agent layer, not a Workspace companion. That distinction will define whether Spark competes with productivity tools or with the agent orchestration platforms enterprises are currently evaluating separately.

Search Gets an Agent Layer and Keeps the Box

The Search announcements at I/O 2026 run in two directions simultaneously. The first is the redesigned Search box, now described as the biggest upgrade to Search in 25 years, which expands as you type and offers intent-anticipating query suggestions beyond standard autocomplete. The second is information agents: background agents that continuously monitor the web, news, social posts, and real-time data feeds on behalf of the user, triggering updates when something relevant changes. Both directions reflect the same underlying shift: Search is moving from a transaction (you ask, it answers) to an ongoing process.

The information agents announcement is the one CIOs should track. The explicit description is that your agent will look across blogs, news sites, social posts, finance, shopping, and sports data for changes related to your specific question. That is competitive intelligence infrastructure built into a consumer product. When those capabilities reach enterprise Search integrations, the monitoring workflows that currently require dedicated tools and analyst time become table stakes.

Google is also building what it calls mini apps inside Search: custom dashboards and trackers for continuous tasks, generated by Antigravity in response to user queries, persistent and returnable. These arrive in the coming months for AI Pro and Ultra subscribers. The implication is that Search is acquiring a canvas, not just a results page.

Universal Cart and the Commerce Protocol

Universal Cart is Google's agentic shopping hub, spanning the Gemini app, YouTube, and Gmail, with Search to follow this summer. It finds deals, tracks price history, flags stock changes, and cross-references compatibility between items in the same cart. The custom PC build demonstration, where the cart flagged component incompatibilities proactively, is the clearest example of what agentic commerce looks like when it moves beyond price comparison.

The infrastructure underneath it is the Universal Commerce Protocol, an open standard built with Shopify, Walmart, and Amazon. The protocol solves the integration problem that has historically constrained AI-powered shopping: every merchant had a unique API, which meant every AI shopping assistant required hand-coded integrations for each retailer. The Universal Commerce Protocol standardizes that interface. Once a merchant implements it, any compatible agent can interact with their inventory, pricing, and checkout systems without bespoke engineering. Google is building the standard before the agent market matures, which is the same sequence it used with Android and the open web.
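The integration math is easier to see in a toy example. The sketch below is purely illustrative and is not the Universal Commerce Protocol itself: `CommerceEndpoint`, `CatalogMerchant`, the SKU, and the prices are all hypothetical names and data invented for this example. It shows the general pattern of one shared interface replacing per-merchant adapters, and how the integration count changes from agents times merchants to agents plus merchants.

```python
from dataclasses import dataclass
from typing import Optional, Protocol


class CommerceEndpoint(Protocol):
    """Hypothetical shared interface; NOT the real protocol's schema."""

    name: str

    def in_stock(self, sku: str) -> bool: ...
    def price_cents(self, sku: str) -> int: ...


@dataclass
class CatalogMerchant:
    """Toy merchant backed by an in-memory catalog (made-up data)."""

    name: str
    catalog: dict[str, tuple[int, bool]]  # sku -> (price in cents, in stock)

    def in_stock(self, sku: str) -> bool:
        return sku in self.catalog and self.catalog[sku][1]

    def price_cents(self, sku: str) -> int:
        return self.catalog[sku][0]


def cheapest_in_stock(merchants: list[CommerceEndpoint], sku: str) -> Optional[str]:
    """One agent routine works for every merchant that implements the interface."""
    available = [m for m in merchants if m.in_stock(sku)]
    if not available:
        return None
    return min(available, key=lambda m: m.price_cents(sku)).name


def bespoke_integrations(agents: int, merchants: int) -> int:
    """Every agent hand-codes an adapter for every merchant."""
    return agents * merchants


def protocol_integrations(agents: int, merchants: int) -> int:
    """Each agent and each merchant implements the shared interface once."""
    return agents + merchants


merchants = [
    CatalogMerchant("merchant-a", {"gpu-123": (49900, True)}),
    CatalogMerchant("merchant-b", {"gpu-123": (47900, False)}),
    CatalogMerchant("merchant-c", {"gpu-123": (48900, True)}),
]
best = cheapest_in_stock(merchants, "gpu-123")
print(f"Cheapest in-stock seller: {best}")
print(f"Bespoke: {bespoke_integrations(5, 200)}, protocol: {protocol_integrations(5, 200)}")
```

With five agents and two hundred merchants, the bespoke model needs 1,000 adapters and the shared-interface model needs 205 implementations. The gap widens with every new participant, which is why whoever sets the standard early gains outsized influence.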

SynthID Expands the Coalition

Three years after launch, Google DeepMind's SynthID watermarking system has processed over one hundred billion images and videos and sixty thousand years of audio assets. At I/O 2026, Google announced that OpenAI, Kakao, and ElevenLabs are joining NVIDIA in adopting SynthID. Google is also expanding Content Credentials verification, which distinguishes AI-generated content from camera originals, to Search and Chrome.

The coalition expansion is significant precisely because OpenAI is included. Two competing AI platform companies sharing a watermarking standard signals that at least one layer of the AI trust problem is being treated as pre-competitive infrastructure. Whether that holds as the market consolidates is a separate question. For now, the practical effect is that SynthID detection will cover a broader share of AI-generated content as these companies ship into 2026 and beyond.

Hardware: Intelligent Eyewear Arrives This Fall

Google's intelligent eyewear, built on Samsung hardware with Qualcomm silicon and designed externally by Gentle Monster and Warby Parker, will arrive as audio glasses this fall. They pair with both Android phones and the iPhone. Display glasses, which surface contextual information in the field of view, come later.

The hardware strategy is consistent with the broader platform argument. Audio glasses running Gemini Live create a persistent ambient AI interface that does not require a screen interaction. Every question asked through those glasses is a token processed and a data point about real-world context that no desktop or phone interaction generates. The hardware is not incidental to the AI strategy. It is the data acquisition layer for the next generation of personalization.

What This Means for Enterprise Technology Leaders

The Google I/O 2026 announcements land on top of a foundation this site has covered in detail: the dual-chip TPU architecture separating training and inference economics, the $240 billion Google Cloud backlog signaling committed enterprise demand, and the agent governance layer built at Cloud Next. I/O adds the consumer-facing execution layer on top of that infrastructure.

The pattern that emerges is vertical integration moving up the stack. Google controls the chip, the data center, the model, the orchestration platform, the consumer distribution channel, and now the commerce protocol. The enterprise buyer who evaluates Google Cloud for AI workloads is not evaluating a cloud vendor. They are evaluating a company that is simultaneously the infrastructure provider, the model supplier, the agent platform, and the end-user distribution channel for the outputs those agents produce.

That is a structural position no other vendor can replicate in full. It is also, for the same reason, the most consequential dependency decision an enterprise technology leader will make in 2026.

CIO/CTO Viability Question

Google is pricing Gemini 3.5 Flash to win the cost rationalization argument that will define enterprise AI budget allocation in the second half of 2026. If your organization is heading into a token budget conversation, the question is not whether Flash is good enough for most workloads. It almost certainly is. The question is whether you are willing to let your model tier decision deepen your Google Cloud dependency at the exact moment Google is also building the agent platform, the commerce protocol, and the ambient hardware layer that will touch every part of your technology stack. That is not a model evaluation. That is a vendor relationship decision with a ten-year horizon.

Pichai, Sundar. "I/O 2026: Welcome to the Agentic Gemini Era." Google, 19 May 2026, blog.google.

Li, Abner. "Everything Google Announced at I/O 2026: Gemini, Search, Android XR, & More." 9to5Google, 19 May 2026, 9to5google.com.

Disclaimer: This blog reflects my personal views only. Content does not represent the views of my employer, Info-Tech Research Group. AI tools may have been used for brevity, structure, or research support. Please independently verify any information before relying on it.

Shashi.co