Google Publishes a Plain-Text Format for Feeding AI Agents Company Knowledge

Data Infrastructure
A new specification for describing company data to AI agents, published as plain text files.
v0.1 OKF release version
3 sample data sets published
0 software required to read it

Files you can open in any text editor. That is what Google Cloud published this week under the name Open Knowledge Format, a way of writing down what a database table means, where a metric comes from, or how to fix a recurring problem, in a format that both people and AI systems can read without special software.

Most companies that have invested in artificial intelligence agents run into the same wall. The agent can write code or summarize a document, but it does not know that "active customer" means something specific to your business, or which table in BigQuery actually holds last week's orders. That knowledge usually lives in a metadata catalog, a wiki, a spreadsheet someone maintains, or in the memory of a few engineers. Open Knowledge Format gives that information a standard shape: a folder of text files, one file per table or metric or process, with a few required labels at the top of each file.

What it actually is, in plain terms

Each file describes one thing, such as a table or a report. At the top of the file are a handful of fields: what type of thing it is, its name, a short description, and when it was last updated. Below that is a normal written explanation, the kind a person would put in a wiki page. Files link to each other the way wiki pages do, so an agent reading about an orders table can follow a link to the customers table it connects to.
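To make that layout concrete, here is a minimal Python sketch of reading one such file. It rests on assumptions the article does not confirm: that the labeled fields sit between two `---` lines, that the field names are `type`, `name`, `description`, and `updated`, and that links use a `[[name]]` wiki style. The published specification may use different names and syntax, so treat this as an illustration rather than a reference parser.

```python
import re

# Hypothetical sample file. Field names, the '---' delimiters, and the
# [[link]] syntax are assumptions for illustration, not taken from the spec.
SAMPLE = """---
type: table
name: orders
description: One row per customer order.
updated: 2026-06-01
---
Each order belongs to a customer; see [[customers]] for the customer record.
"""

def parse_knowledge_file(text):
    """Split a file into header fields, body text, and outgoing links."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        raise ValueError("missing header block")
    try:
        end = lines.index("---", 1)  # closing delimiter of the header
    except ValueError:
        raise ValueError("unterminated header block")
    fields = {}
    for line in lines[1:end]:
        if ":" in line:
            key, value = line.split(":", 1)
            fields[key.strip()] = value.strip()
    body = "\n".join(lines[end + 1:]).strip()
    links = re.findall(r"\[\[([^\]]+)\]\]", body)  # names of linked files
    return fields, body, links

fields, body, links = parse_knowledge_file(SAMPLE)
print(fields["type"], fields["name"], links)
```

Nothing here goes beyond the Python standard library, which is in keeping with the claim that no special software is needed to read the format.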

Picture an AI agent asked why bookings dropped last week. With Open Knowledge Format in place, the agent opens the file describing the bookings table, follows a link to the file defining what "active customer" means at this company, and follows another link from there to a playbook explaining how the metric is calculated and what usually causes swings in it. Three plain text files, read in sequence, take the agent from a vague question to a grounded answer, without a person walking it through any of that by hand.
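The three-file walk described above can be sketched as a simple link traversal. The file names, their contents, and the `[[name]]` link syntax below are all hypothetical stand-ins, and a real agent would read files from disk and reason over the text rather than just list it.

```python
import re
from collections import deque

# Hypothetical files held in memory; names and link syntax are assumptions.
FILES = {
    "bookings": "Bookings table. Counts depend on [[active-customer]].",
    "active-customer": "What an active customer is. Calculation: [[bookings-playbook]].",
    "bookings-playbook": "How the metric is calculated and what usually causes swings.",
}

LINK = re.compile(r"\[\[([^\]]+)\]\]")

def reading_order(start, files):
    """Return the files reachable from `start`, in the order they are read."""
    seen = {start}
    order = []
    queue = deque([start])
    while queue:
        name = queue.popleft()
        if name not in files:
            continue  # a link to a file that does not exist is skipped
        order.append(name)
        for target in LINK.findall(files[name]):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return order

print(reading_order("bookings", FILES))
```

The `seen` set matters in practice: wiki-style links often point back at each other, and without it the traversal would loop.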

Creating or reading these files requires no database, no vendor account, and no special tools. Google published the specification, three example data sets built with it, and a small program that generates these files automatically from a BigQuery data set.
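The article does not describe how Google's generator works, so the following is not a reconstruction of it. It is a rough sketch of the general idea: take a table's name, description, and columns, and write them out as a text file. The function name `render_table_file`, the header field names, the sample table, and the sample date are all assumptions, and no BigQuery calls are made.

```python
from datetime import date

def render_table_file(name, description, columns, updated):
    """Render one table description as text (hypothetical layout)."""
    header = [
        "---",
        "type: table",
        f"name: {name}",
        f"description: {description}",
        f"updated: {updated.isoformat()}",
        "---",
    ]
    body = [f"Columns in {name}:", ""]
    body += [f"- {col}: {col_type}" for col, col_type in columns]
    return "\n".join(header + body) + "\n"

# Sample values for illustration only.
text = render_table_file(
    "orders",
    "One row per customer order.",
    [("order_id", "INTEGER"), ("customer_id", "INTEGER")],
    date(2026, 6, 1),
)
print(text)
```

In a real pipeline the column list would come from the warehouse's schema metadata, and the written explanation would need a person or a model to fill in what the schema alone cannot say.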

This is not a replacement for the vector databases many AI projects already rely on. A vector database helps a system find content that is related to a question even when the wording does not match, which is useful for search across large amounts of text. Open Knowledge Format does something different: it gives a clear, written description of what a table, metric, or process is and how it connects to other things. The two can work together. A company's table and metric descriptions could be stored as Open Knowledge Format files, and a vector database could be used to help an agent find the right file quickly.
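As a rough illustration of the two working together, the sketch below picks the most relevant description for a question. It uses a toy word-count similarity in place of a real vector database: a real one would use learned embeddings and could match on meaning even when no words overlap, which this stand-in cannot do. The file names and descriptions are hypothetical.

```python
import math
import re
from collections import Counter

# Hypothetical descriptions, as they might appear in three files.
DESCRIPTIONS = {
    "orders": "One row per customer order with order date and total amount.",
    "customers": "One row per customer with signup date and region.",
    "active-customer": "Metric definition: a customer with at least one order in the last 30 days.",
}

def vectorize(text):
    """Toy stand-in for an embedding: counts of lowercase words."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0

def best_file(question, descriptions):
    """Return the name of the description most similar to the question."""
    q = vectorize(question)
    return max(descriptions, key=lambda name: cosine(q, vectorize(descriptions[name])))

print(best_file("which table has the signup region", DESCRIPTIONS))
```

The search step only finds the starting file. The description inside that file, and the links it carries, are what tell the agent what the table means.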

Why this matters for IT leaders

Companies such as Atlan, Alation, and Collate sell metadata catalogs that already do something similar: they organize descriptions of your data so people and systems can find and understand it. Those products are not going away, and they do more than store text files. They handle search, access controls, automated documentation, and tracking how data moves between systems.

What changes is the format underneath. If the descriptions of your data exist as plain Markdown files that follow a common standard, those descriptions can move between tools more easily. A description written for one AI agent could be read by an agent from a different vendor without conversion. That makes it fair to ask whether your current catalog can read and write this format, or whether it locks those descriptions inside its own system.

That portability is the actual contribution here. The file itself, not any particular software for creating or reading it, is what Google is making available.

Reason for caution

This is version 0.1, published by one company, and the example data sets so far were all built using Google's own tools. Publishing a specification is not the same as seeing it widely adopted by other vendors and open source projects. Whether other cloud providers or catalog vendors adopt this format, or treat it as a Google format with an open label, will take time to become clear.

For now, this is one company publishing a simple, readable way to describe data for AI agents, and inviting others to use it. The near-term action for most IT teams is small: find out whether the metadata tools you already use can produce or read files in this format.

CIO/CTO Viability Question

Ask your metadata or catalog vendor one question this quarter: can their platform export your table and metric descriptions as Open Knowledge Format files today, and if not, when?

Sources:
McVeety, Sam, and Amir Hormati. "Introducing the Open Knowledge Format." Google Cloud Blog, 12 June 2026, cloud.google.com.
Karpathy, Andrej. "LLM Wiki." GitHub Gist, gist.github.com.
Atlan. "Data Catalog for AI: Capabilities, Uses & Tooling in 2026." Atlan, atlan.com.
DataHub. "Context Platform vs. Data Catalog: What's the Difference?" DataHub, datahub.com.
Disclaimer: This blog reflects my personal views only. Content does not represent the views of my employer, Info-Tech Research Group. AI tools may have been used for brevity, structure, or research support. Please independently verify any information before relying on it.