Field Note

Thinking Tools and Infrastructure

Foundation models and a private RAG are not the same product, and the firms that treat them as competitors end up buying the wrong things in the wrong order. A field note on the reframe — and the hybrid stack that follows from it.

Figure: a tug-of-war between public AI models and a private AI, the framing this note sets out to replace.

A procurement question is working its way through every enterprise this year: "We already give our team access to a foundation model. Why do we need a private RAG on top of that?" The honest answer is that the question contains a category error. A general-purpose model an employee opens in a browser is a thinking tool. A retrieval system embedded in workflows, products, and decisions is infrastructure. Both can surface through a similar chat box, but they are not the same thing. The firms that get the most leverage out of both this year will be the ones that stop treating them as substitutes.

The reframe

A salesperson using a foundation model to brainstorm objection-handling is doing something fundamentally different from a system that produces a customer-facing answer from the firm's authoritative documents, under permissions, with citations, in a setup the compliance team has signed off on. The first is iteration. The second is operations. Treating them as the same purchase produces awkward outcomes: a "buy versus build" debate where the right answer is "both, for different jobs."

Foundation models will keep getting better at the iteration use case. That is not the part to fight. The cleaner posture is to concede that ground openly — the team should use the model for thinking — and lean hard into what a general assistant structurally cannot do.

What general-purpose models cannot do

Four capabilities separate a private retrieval system from a chat session, and none of them is about the model itself. They are about what sits around the model.

First, the corpus. A foundation model sees what someone uploads in a session. A private RAG sees the entire corporate knowledge base — millions of documents, continuously indexed, with permissions and access controls intact. For anything where the answer must come from authoritative internal sources, the model on its own is either guessing or asking the user to feed it context every time.
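A minimal sketch of what "permissions intact" can mean in practice: access control is applied before ranking, so text a user may not read never reaches the model. Everything here is an illustrative assumption, including the `Document` type, the group names, and the naive term-overlap scoring that stands in for a real index.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Document:
    doc_id: str
    text: str
    allowed_groups: frozenset[str]  # hypothetical: groups permitted to read this document


def retrieve(query: str, corpus: list[Document], user_groups: set[str], k: int = 3) -> list[Document]:
    """Return up to k documents the user may read, ranked by naive term overlap."""
    terms = set(query.lower().split())
    # Filter on permissions first, so restricted text is never scored or returned.
    visible = [d for d in corpus if d.allowed_groups & user_groups]
    scored = [(len(terms & set(d.text.lower().split())), d) for d in visible]
    # Keep only documents that share at least one term with the query.
    ranked = sorted((s for s in scored if s[0] > 0), key=lambda s: s[0], reverse=True)
    return [d for _, d in ranked[:k]]


# Hypothetical two-document corpus with different audiences.
corpus = [
    Document("hr-001", "parental leave policy for all staff", frozenset({"all-staff"})),
    Document("fin-007", "draft acquisition pricing policy", frozenset({"finance"})),
]

results = retrieve("what is the leave policy", corpus, user_groups={"all-staff"})
```

The same question asked by two people can legitimately return different sources, which is something a session-scoped upload cannot reproduce.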

Second, auditability. For regulated workflows, customer-facing answers, and anything that may later be cited back to the firm, a system has to produce traceable retrievals, version control, and source attribution that holds up under scrutiny. A consumer chat session has no equivalent record. The audit trail is not a UX feature; it is a procurement requirement.
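As a rough illustration of a traceable retrieval, the sketch below ties an answer to the exact document versions it drew on and adds a digest so later changes to the record can be detected. The field names are assumptions, and a digest stored alongside the record only catches accidental or naive edits; a production trail would also need append-only storage.

```python
import hashlib
import json
from datetime import datetime, timezone


def audit_record(user: str, question: str, sources: list[dict], answer: str) -> dict:
    """Build a record linking an answer to the source versions behind it."""
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "user": user,
        "question": question,
        "sources": [{"doc_id": s["doc_id"], "version": s["version"]} for s in sources],
        "answer": answer,
    }
    # Hash a canonical serialisation so any later change to the record is detectable.
    payload = json.dumps(record, sort_keys=True).encode("utf-8")
    record["digest"] = hashlib.sha256(payload).hexdigest()
    return record


def verify(record: dict) -> bool:
    """Recompute the digest over everything except the stored digest itself."""
    body = {key: value for key, value in record.items() if key != "digest"}
    payload = json.dumps(body, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest() == record["digest"]


# Hypothetical example values.
rec = audit_record(
    user="alice",
    question="What is the refund window?",
    sources=[{"doc_id": "policy-12", "version": "3"}],
    answer="Thirty days from delivery.",
)
```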

Third, embeddability. Retrieval infrastructure lives inside applications, support tools, agents, and customer experiences. It is not a tab someone opens — it powers things. That is a different conversation with a different buyer, and the comparison to a per-seat consumer license stops being meaningful.

Fourth, specialization and evaluation. Retrieval can be tuned to a firm's domain, evaluated on its specific tasks, and improved over time against metrics that matter to the business. A general-purpose model is general by design — useful for almost everything; optimal for almost nothing.
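One way to make "evaluated on its specific tasks" concrete is a small labelled set of questions with known relevant documents, scored with recall at k. The sketch below assumes such a set exists; the document identifiers are placeholders, and recall at k is only one of several reasonable metrics.

```python
def recall_at_k(retrieved: list[str], relevant: set[str], k: int) -> float:
    """Fraction of the relevant documents that appear in the top k results."""
    if not relevant:
        return 0.0
    hits = len(set(retrieved[:k]) & relevant)
    return hits / len(relevant)


def mean_recall_at_k(runs: list[tuple[list[str], set[str]]], k: int) -> float:
    """Average recall at k over (retrieved, relevant) pairs, one pair per question."""
    if not runs:
        return 0.0
    return sum(recall_at_k(retrieved, relevant, k) for retrieved, relevant in runs) / len(runs)


# Hypothetical evaluation set: ranked results and the documents a reviewer marked relevant.
eval_runs = [
    (["a", "b", "c"], {"a", "d"}),  # one of two relevant documents in the top 2
    (["x", "y"], {"y"}),            # the only relevant document is in the top 2
]

score = mean_recall_at_k(eval_runs, k=2)
```

Tracking a number like this across index and prompt changes is what turns "improved over time" into something the business can check.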

A foundation model is a tool an employee uses. A private RAG is a system the business runs on. The procurement question dissolves once the difference is on the table.

Map the jobs

The most useful artefact in this conversation is a two-column map. Left column: the jobs a foundation model should keep doing — open-ended ideation, drafting, exploring an unfamiliar topic, working through a problem with a colleague, synthesising a few documents an analyst already has on screen. Right column: the jobs that need a private RAG — anything customer-facing, anything that requires the firm's proprietary corpus, anything compliance-sensitive, anything embedded in a product or automated workflow, anything where the same question gets asked thousands of times by different people.

Once that map is on the table, the "which one do I buy?" question rarely survives the meeting. The columns answer different problems; the firm needs both, and stops trying to force one through the other.

Figure: the same two teams, now pulling on the same chain, captioned "Public + Private Cooperation: A Shared Future." The shared future is not one tool replacing the other; it is each doing what only it can.

The hybrid stack

The strongest positioning available right now is not "use us instead of the model." It is "use us to make the model safe to use on your data." When a private RAG is exposed as an MCP server — or any equivalent tool surface — employees can use the foundation model as their thinking environment and pull authoritative answers from the firm's system when an answer needs to come from somewhere defensible. The general-purpose model does what it is good at. The retrieval system does what it is good at. Neither pretends to be the other.
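The shape of such a tool surface can be sketched in plain Python. This is not the MCP SDK or its wire format; it only illustrates the idea that the retrieval system publishes a named, described tool and the model's host dispatches calls to it. The tool name `search_firm_knowledge`, the request shape, and the stubbed result are all hypothetical.

```python
import json
from typing import Callable

# Registry of tools the retrieval system chooses to expose.
TOOLS: dict[str, dict] = {}


def tool(name: str, description: str) -> Callable:
    """Register a function as a callable tool with a human-readable description."""
    def register(fn: Callable) -> Callable:
        TOOLS[name] = {"description": description, "handler": fn}
        return fn
    return register


@tool("search_firm_knowledge", "Return cited passages from the governed corpus.")
def search_firm_knowledge(query: str, user: str) -> list[dict]:
    # Stub standing in for the governed retrieval layer (permissions, logging, ranking).
    return [{"doc_id": "policy-12", "version": "3", "passage": f"Stub passage for: {query}"}]


def handle_call(request_json: str) -> str:
    """Dispatch a request of the assumed form {"tool": ..., "arguments": {...}}."""
    request = json.loads(request_json)
    entry = TOOLS.get(request["tool"])
    if entry is None:
        return json.dumps({"error": f"unknown tool: {request['tool']}"})
    result = entry["handler"](**request["arguments"])
    return json.dumps({"result": result})


response = handle_call(json.dumps({
    "tool": "search_firm_knowledge",
    "arguments": {"query": "refund window", "user": "alice"},
}))
```

The design point is that the model never touches the corpus directly: every answer grounded in firm data passes through one narrow, governed entry point.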

That hybrid posture also flips the IT and security conversation. A foundation model used directly on enterprise data is the unbounded surface; the same model paired with a governed retrieval layer is the controlled one. The argument with security stops being about whether to allow the model and starts being about how to channel its use through a system that produces an audit trail.

The honest version of the pitch is short. The team is going to use a foundation model — they should, it is a genuinely good thinking tool, and the model layer is going to keep improving on that. The moment an answer needs to come from the firm's data, be auditable, be embedded in a product, or be served at scale to customers, that capability needs a different system underneath it. Not a competitor. A complement. The layer that turns the model from a personal productivity gain into something the business can actually run on.
