Agent Traffic Has a Different Shape. AWS Had to Rebuild Its Search Engine Around It.

97% of new architecture built from ground up (AWS)

20x faster autoscaling vs. prior generation (AWS)

Up to 60% lower cost vs. peak-provisioned clusters (AWS)

Key Takeaway

AI agents produce a work pattern that existing enterprise infrastructure was never designed to handle: intense bursts of activity followed by long stretches of silence. AWS discovered this broke the economics and performance of its search service, and rebuilt it almost entirely from scratch. The same pattern is breaking assumptions across every layer of enterprise infrastructure.

Most enterprise AI projects are built on infrastructure that was designed for a different kind of user. Not a bad design, just the wrong one for what agents actually do. Amazon Web Services announced on May 28, 2026 that it had rebuilt one of its most widely used services almost entirely from the ground up, because the old version could not handle agent workloads without failing on cost and performance. The announcement is worth understanding even if you never touch the service directly, because the problem it describes is everywhere in enterprise infrastructure right now.

Background: What is enterprise search infrastructure?

When an AI agent answers a question, it does not simply generate a response from memory. It searches through large volumes of your organization's data, finds the most relevant pieces, and uses those to construct an accurate answer. That retrieval step requires a dedicated search engine running in the background, one fast enough to return results in milliseconds so the agent's response feels instant.

OpenSearch is one of the most widely used search engines in enterprise technology. It is open-source software, meaning anyone can use and modify it, and it powers search and data retrieval inside thousands of enterprise applications. Amazon OpenSearch Serverless is AWS's managed version of that engine, where AWS handles all the infrastructure so companies do not need to run it themselves.

Think of it like the index in the back of a book, except it covers every document, database, and data source your company holds, and an AI agent can search it thousands of times per minute.

Background: What does "serverless" mean for your budget?

"Serverless" is a billing and infrastructure model. In a traditional setup, you reserve a fixed amount of computing capacity, like leasing office space, and pay for it whether it is fully used or sitting empty. Serverless means you pay only for the compute you actually consume, like paying for a taxi ride instead of owning a car.

The catch is that the system still needs to respond quickly when demand arrives. If it takes five minutes to spin up capacity after a request comes in, it is not useful for real-time applications. The engineering challenge of serverless is making the service both cheap when idle and fast when busy.

Why Agents Break the Economics of Traditional Infrastructure

Human users of software create relatively smooth, predictable traffic. People log in, use an application for a while, and log out. Demand has patterns: peaks during business hours, quieter evenings, consistent baselines. Infrastructure can be sized for those patterns with reasonable efficiency.

AI agents work completely differently. An agent reasoning through a complex task fires a rapid sequence of search queries, sometimes hundreds in a few seconds, retrieves data, processes results, fires more queries, and then goes completely silent until the next task arrives. That silence can last hours.

This burst-then-idle rhythm destroys the economics of infrastructure sized for smooth, human-paced demand. A company that provisions enough capacity to handle the agent's peak activity is paying for that capacity during all the idle hours. A company that provisions for average demand gets slow, unreliable responses the moment an agent actually needs the infrastructure. Neither outcome is acceptable in production.
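To make the trade-off concrete, here is a minimal sketch of the arithmetic. Every number in it is hypothetical: the capacity units, the price, and the busy hours are placeholders I chose for illustration, not AWS pricing, and the model ignores storage charges, minimums, and any per-unit premium a pay-per-use service might carry.

```python
def monthly_cost(peak_units, busy_hours, hours_in_month=730.0, unit_hour_price=1.0):
    """Compare two ways of paying for the same peak capacity over one month.

    peak_units      -- capacity units needed during an agent burst (hypothetical)
    busy_hours      -- hours in the month when agents are actually querying
    unit_hour_price -- price per unit-hour (placeholder, not a real AWS rate)
    """
    # Peak-provisioned: pay for peak capacity every hour, busy or idle.
    peak_provisioned = peak_units * hours_in_month * unit_hour_price
    # Pay-per-use with scale-to-zero: pay for peak capacity only while busy.
    pay_per_use = peak_units * busy_hours * unit_hour_price
    return peak_provisioned, pay_per_use


# Hypothetical workload: agents are busy for 73 of the month's 730 hours.
provisioned, on_demand = monthly_cost(peak_units=8, busy_hours=73)
idle_share = 1 - on_demand / provisioned

print(f"peak-provisioned: {provisioned:.0f} unit-hours billed")
print(f"pay-per-use:      {on_demand:.0f} unit-hours billed")
print(f"share of the provisioned bill spent on idle capacity: {idle_share:.0%}")
```

The point of the sketch is the shape of the result, not the figures: the longer the silences between bursts, the larger the share of a peak-provisioned bill that buys nothing.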

"Predominantly, OpenSearch has been the Swiss Army knife, a hodgepodge of everything. We even tried to do a pivot into SIEM last year. That detour did not stick." — Tia White, General Manager, OpenSearch, AWS

What AWS Actually Built and Why It Required a Near-Total Rewrite

Tia White, who became general manager for OpenSearch at AWS in February 2026, described the scope of the rebuild plainly: approximately 97 percent of the new service was built from the ground up by AWS engineers. The core architectural change was separating the storage of data from the computing power used to search it. Previously, the two were bound together, which made scaling decisions slow and expensive.

With storage and compute separated, the service can now add search capacity in seconds rather than minutes, and shed that capacity just as quickly when the agent finishes its work. AWS says the new version scales 20 times faster than the previous generation and can reduce costs by up to 60 percent compared to running a provisioned cluster sized for peak demand. Collections of data can now scale entirely to zero when idle, meaning companies pay nothing during quiet periods.

Key Takeaway

The rebuilt service is available only through AWS's managed offering. Organizations running their own self-hosted version of OpenSearch do not get the faster scaling or scale-to-zero economics. If your team chose self-hosted OpenSearch for cost reasons, that calculation may have changed.

The Open-Source Gap Your Team Should Know About

OpenSearch started as an open-source project, meaning the code is publicly available and organizations can run it on their own servers. Many enterprises chose that path specifically to avoid vendor lock-in or to control costs. The rebuilt version AWS launched does not flow back to the open-source project. The new proprietary storage layer, the compression capabilities that drive the cost savings, and the fast autoscaling mechanism are exclusive to the managed AWS service.

This matters for any organization currently running OpenSearch on its own infrastructure. Self-hosted deployments will not inherit the performance or cost characteristics AWS is citing. The 20-times autoscaling improvement exists only inside AWS's managed environment. An infrastructure review that assumed rough cost parity between self-hosted and managed OpenSearch should revisit that assumption.

This Is the Same Problem Cisco Documented at the Network Layer

Earlier this month, Cisco published research measuring what agent workloads do to enterprise networks. The central finding: when an AI agent handles a task instead of a human, the task generates 450 percent more network traffic. Without agents embedded in enterprise workflows, corporate network traffic is expected to grow roughly 2.5 times over the next decade. With agents, that growth becomes approximately 9 times.
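Those figures are easier to compare once converted to common terms. The sketch below does two conversions: "450 percent more" into a per-task multiplier, and the ten-year multiples into implied annual growth rates. The second conversion rests on my own assumptions, not Cisco's: that "the next decade" means exactly ten years and that growth compounds evenly each year.

```python
def multiplier_from_percent_more(percent_more):
    """'X percent more' means the new value is (1 + X/100) times the baseline."""
    return 1 + percent_more / 100


def implied_annual_growth(total_multiple, years):
    """Annual growth rate that compounds to total_multiple over the given years.

    Assumes smooth compounding, which is a simplification for illustration.
    """
    return total_multiple ** (1 / years) - 1


per_task = multiplier_from_percent_more(450)
without_agents = implied_annual_growth(2.5, 10)
with_agents = implied_annual_growth(9, 10)

print(f"agent-handled task: {per_task:.1f}x the traffic of a human-handled task")
print(f"without agents: about {without_agents:.1%} growth per year")
print(f"with agents:    about {with_agents:.1%} growth per year")
```

Under those assumptions, an agent-handled task carries 5.5 times the traffic of the human-handled version, and the annual growth rate a network team has to plan for rises from roughly 9.6 percent to roughly 24.6 percent.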

The mechanism is identical to what AWS encountered inside its search infrastructure. An agent does not make one polite request and wait for a response. It fires a sequence of calls, each generating traffic, then goes idle. Cisco documented that pattern at the network layer. AWS hit it at the search and retrieval layer. Both were forced to reckon with infrastructure built for human-paced demand.

The implication extends beyond any single vendor or service. Every component of enterprise infrastructure that was sized for human traffic patterns will eventually face some version of this recalculation. Network capacity, search and retrieval, databases, identity and access management, monitoring systems: all of them were designed around the assumption that people, not machines, are the primary users. That assumption is breaking down faster than most infrastructure roadmaps anticipated.

My earlier analysis of the Cisco research is at shashi.co.

What AWS Is Building Next

The rebuilt service launches with integrations for Vercel and AWS Kiro, Amazon's integrated development environment. A set of OpenSearch Agent Skills connects the service to AI coding tools including Claude Code and Cursor, allowing agents to retrieve information through those tools without custom integration work.

The roadmap includes a major log analytics launch in June, putting the service into a market currently occupied by established monitoring platforms. A long-term memory capability for agents is scheduled for the second half of 2026, designed with evaluation and governance built in from the start. White was explicit that governance cannot be retrofitted after the fact on agentic infrastructure.

AWS also addressed the question that will eventually face every enterprise search investment: can large language models simply replace a dedicated retrieval engine as model capabilities improve? AWS's answer is that OpenSearch Serverless becomes the layer that makes model outputs reliable rather than merely plausible. A reasoning model without an accurate retrieval layer produces fluent, confident, wrong answers. That dependency on retrieval infrastructure is AWS's bet on why the service remains relevant even as model context windows grow.

CIO / CTO Viability Question

Two questions worth putting to your infrastructure team before the next planning cycle. First: are your enterprise search and data retrieval systems provisioned for human-paced demand, and have you measured what happens to cost and latency when agent workloads hit them at scale? Second: if your organization runs self-hosted OpenSearch for cost control, has anyone recalculated that decision against the new managed service economics? The 60 percent cost reduction AWS cites is against provisioned peak clusters. The honest comparison requires knowing what your actual agent traffic looks like, and most organizations have not measured that yet.
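For teams that have not measured their agent traffic yet, a first pass can be simple. The sketch below assumes you can export query timestamps (in seconds) from your search logs. The function name, the 60-second window, and the sample data are all hypothetical choices of mine, and the two metrics are one reasonable way to describe burstiness, not a standard.

```python
from collections import Counter


def burst_profile(timestamps, window_seconds=60):
    """Summarize query timestamps into fixed windows and describe burstiness."""
    if not timestamps:
        raise ValueError("need at least one timestamp")
    start, end = min(timestamps), max(timestamps)
    n_windows = int((end - start) // window_seconds) + 1
    # Count queries per window; windows with no queries never appear in counts.
    counts = Counter(int((t - start) // window_seconds) for t in timestamps)
    peak = max(counts.values())
    mean = len(timestamps) / n_windows
    return {
        "windows": n_windows,
        # How far the busiest window sits above the average window.
        "peak_to_average": peak / mean,
        # Fraction of windows in which no query arrived at all.
        "idle_fraction": 1 - len(counts) / n_windows,
    }


# Hypothetical log: 200 queries in a 20-second burst, then 100 more an hour later.
sample = [i * 0.1 for i in range(200)] + [3600 + i * 0.1 for i in range(100)]
profile = burst_profile(sample)
print(profile)
```

A high peak-to-average ratio combined with a high idle fraction is the burst-then-idle signature described above, and it is the case where the comparison against peak-provisioned clusters matters most.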

Sources

Lardinois, Frederic. "Why AWS Scrapped OpenSearch's Architecture to Chase Agent Workloads." The New Stack, 28 May 2026. thenewstack.io.
Katariwala, Sohaib, Arjun Nambiar, and Raj Ramasubbu. "The Next Generation of Amazon OpenSearch Serverless: Built from the Ground Up for Agents." AWS Big Data Blog, 28 May 2026. aws.amazon.com.
Amazon Web Services. "Introducing the Next Generation of Amazon OpenSearch Serverless for Building Your Agentic AI Applications." AWS News Blog, 28 May 2026. aws.amazon.com.
Bellamkonda, Shashi. "Cisco's WAN Research Says the Internet Wasn't Built for Agents." shashi.co, May 2026. shashi.co.

Disclaimer: This blog reflects my personal views only. Content does not represent the views of my employer, Info-Tech Research Group. AI tools may have been used for brevity, structure, or research support. Please independently verify any information before relying on it.
