This week I’m by the pool and it’s given me the chance to catch up on a lot of the latest in the Agentic space from the professional front line.
After watching Douwe Kiela's (CEO of Contextual AI and pioneer of RAG technology) presentation on RAG agents in production, I'm struck by how his insights align with what I'm observing in my own work: context is the key battleground for unlocking real enterprise value with AI.
The Context Paradox
Kiela points to what he calls the "context paradox" as the central issue. Language models excel at tasks most humans find challenging—generating code, solving complex math problems, producing creative content. Yet they struggle with what humans do effortlessly: putting problems in the right context based on our expertise and institutional knowledge.
This paradox explains why many enterprise AI deployments remain stuck in the "convenience" quadrant (the place where most of the “experts” on LinkedIn play) rather than delivering truly differentiated value.
As you climb the value ladder toward business transformation, the contextual demands increase exponentially, and the low-code/no-code platforms all the grifters are using become obsolete.
10 Critical Lessons for Deploying RAG Agents in Enterprise Settings
Let's examine the key insights from Kiela's experience deploying RAG systems at scale, and how they relate to the economics and practical realities of building viable AI agent systems:
1. Systems Trump Models
Language models may get all the attention, but they're only about 20% of an effective enterprise AI solution. The RAG pipeline around the model—how you retrieve, process, and apply information—matters far more than having the latest LLM.
A great system with a mediocre model will outperform a great model with a mediocre system every time.
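To make the point concrete, here's a minimal sketch of what "the system around the model" looks like in a RAG pipeline. The class and method names are mine and purely illustrative, not from Kiela's talk or any particular framework; the point is simply that the LLM call is one stage among several, and the stages around it carry most of the value.

```python
# Illustrative sketch only: names are hypothetical, not any vendor's API.
from dataclasses import dataclass

@dataclass
class Chunk:
    text: str
    source: str
    score: float = 0.0

class RAGPipeline:
    """Most of the value lives in the stages around the model, not the LLM call itself."""

    def __init__(self, retriever, reranker, llm):
        self.retriever = retriever   # dense/hybrid search over enterprise data; returns Chunk objects
        self.reranker = reranker     # domain-tuned relevance scoring
        self.llm = llm               # the language model: roughly the last 20% of the system

    def answer(self, question: str) -> str:
        # 1. Retrieval: pull candidate chunks from messy enterprise sources.
        candidates = self.retriever.search(question, top_k=50)
        # 2. Reranking: apply the domain context the base model doesn't have.
        ranked = self.reranker.rerank(question, candidates)[:8]
        # 3. Grounded generation: the model only ever sees curated context.
        context = "\n\n".join(f"[{c.source}] {c.text}" for c in ranked)
        prompt = f"Answer using only the context below.\n\n{context}\n\nQ: {question}"
        return self.llm.generate(prompt)
```

Swap the retriever or the reranker and the answers change far more than they do when you swap the model, which is exactly the point.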
The model-versus-system point remains the biggest problem with the constant shitposting on LinkedIn: nobody really cares which model you use once you look at the wider architecture and business objective of your solution. I continue to be stunned at how many people “like” such crap.
2. Specialise
While general-purpose AI assistants get the headlines, specialised systems that capture domain-specific expertise deliver more value in enterprise settings.
This specialisation creates a competitive moat that generic AI cannot match, particularly for domain-specific problems where the context is crucial.
This is exactly what I was talking about in this article, where I introduce the concept of the Repeatable Expertise Pattern for Agentic use cases, and it’s what I advocate for in my business.
3. Scale With Your Data
Your company's data isn't just an asset—it is your company in the long term. Why? Because people leave, but your data doesn’t.
The challenge isn't perfecting your data before AI can use it; it's building AI that can work effectively with your imperfect, messy data at scale. Companies that succeed here develop a significant competitive advantage.
4. Design for Production, Not Pilots
This lesson resonates deeply with my work on AI agent economics. Pilots are deceptively easy—building a demo RAG system with a few documents and ten users is trivial. But scaling to tens of thousands of documents, thousands of users, and dozens of use cases while meeting enterprise security and compliance requirements?
That's where most initiatives falter.
As I've documented in Part 4 of the Anatomy series, bridging this gap requires planning for production economics from day one—understanding the full stack of sovereignty costs, consumption patterns, and staffing requirements that real-world deployments demand.
5. Prioritise Speed Over Perfection
When deploying RAG agents, iteration speed trumps initial perfection. Getting your solution in front of real users early—even when it's barely functional—provides the feedback necessary to improve rapidly. This "hill-climbing" approach to improvement produces better results than attempting to design the perfect system before launch.
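As a rough illustration of what hill-climbing means in practice, here's a tiny sketch: ship a baseline, then only keep changes that measurably beat it on an evaluation set built from real user feedback. The function names and structure are mine, purely to show the shape of the loop.

```python
# Hypothetical sketch: a greedy hill-climbing loop over pipeline variants.
def hill_climb(baseline, variants, eval_set, score):
    """Keep a change (new retriever, reranker, prompt, chunking) only if it wins."""
    best, best_score = baseline, score(baseline, eval_set)
    for variant in variants:
        s = score(variant, eval_set)   # eval_set grows from real user feedback
        if s > best_score:
            best, best_score = variant, s
    return best, best_score
```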
6. Abstract Away the Boring Stuff
Don't waste engineering talent on mundane tasks like optimising chunking strategies or basic prompt engineering. These are solvable problems that can be abstracted away by platforms. Engineers should focus on delivering business value through differentiated features and applications, not reinventing RAG fundamentals.
While I agree with Kiela that this isn’t the best use of your talent’s time, the purpose of your project always needs to be considered carefully, and sometimes it’s difficult to pull yourself away from the boring bits when you’re trying to understand what actually works best for a given use case.
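For illustration, this is the kind of "boring" knob I mean: a fixed-size chunker with an overlap parameter. The defaults here are invented for the example; the point is that it's a commodity function a platform can own, even if a specific use case occasionally justifies experimenting with it yourself.

```python
# Illustrative sketch: parameter names and defaults are made up, not a recommendation.
def chunk_text(text: str, max_chars: int = 1200, overlap: int = 200) -> list[str]:
    """Fixed-size chunking with overlap, one of many interchangeable strategies."""
    chunks, start = [], 0
    while start < len(text):
        end = min(start + max_chars, len(text))
        chunks.append(text[start:end])
        if end == len(text):
            break
        start = end - overlap   # overlap preserves context across chunk boundaries
    return chunks
```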
7. Make AI Easy to Consume
Even the most powerful AI is worthless if nobody uses it. Successful RAG deployments integrate seamlessly into existing workflows rather than requiring users to adopt entirely new systems. The more frictionless the experience, the higher the adoption rate and subsequent value creation.
8. Design for the "Wow" Moment
The most successful AI deployments create that spark moment when users suddenly understand the technology's potential.
For RAG systems, this often happens when they surface valuable information users didn't know existed—like the Qualcomm engineer who discovered a seven-year-old document answering questions that had long gone unanswered.
9. Focus on Observability, Not Just Accuracy
Perfect accuracy is unattainable, so successful AI systems focus on managing inaccuracy through robust observability and attribution.
Proper audit trails showing exactly why an AI generated a particular answer become critical, especially in regulated industries. The ability to verify claims and trace them back to source documents is essential for trust. This is another reason why ontology and graph architectures are absolutely essential in Agentic solutions.
If you’re not using a graph with your Agent then it’s not an Agent. Period.
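To show what attribution-by-construction can look like, here's a minimal sketch in which every answer carries the exact passages and documents it was generated from, ready to be written to an audit trail. The structures and field names are my own, not a standard schema, and the graph/ontology layer is deliberately left out to keep the example short.

```python
# Illustrative sketch: field names are hypothetical, not a standard audit schema.
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class SourceRef:
    doc_id: str
    title: str
    span: str          # the exact passage supporting the claim

@dataclass
class AttributedAnswer:
    question: str
    answer: str
    sources: list[SourceRef]
    generated_at: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )

def audit_record(a: AttributedAnswer) -> dict:
    """A flat, queryable trail answering: why did the system say what it said?"""
    return {
        "question": a.question,
        "answer": a.answer,
        "evidence": [(s.doc_id, s.title, s.span) for s in a.sources],
        "generated_at": a.generated_at,
    }
```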
10. Be Ambitious
Perhaps most importantly, many AI projects fail not because they aim too high, but because they aim too low.
Building a chatbot to answer basic HR questions about vacation days isn't going to transform your business. The truly valuable use cases address complex problems where AI's capabilities can create exponential returns.
The Economics Reality Check
These insights align perfectly with what I've been documenting in my work on AI agent economics and what I’m doing in my own business with real customers.
The question isn't just "can we build this?"—it's "will this deliver ROI?"
As I've shown in my analyses of agent costs, there are dramatic differences between pilots and production systems in terms of:
Sovereignty costs: The infrastructure and expertise needed to run RAG at enterprise scale
Consumption costs: The variable expenses of running inference, generating embeddings, and operating retrieval systems
Labour costs: The human expertise required to build, maintain, and supervise these systems
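As a back-of-the-envelope illustration (every number below is invented for the example, not a figure from the Anatomy series or any client engagement), the pilot-versus-production gap is simple arithmetic over those three categories:

```python
# All figures are made up purely to show the shape of the calculation.
def monthly_cost(sovereignty: float, cost_per_query: float,
                 queries: int, labour: float) -> float:
    """Fixed infrastructure + variable inference/retrieval + human supervision."""
    return sovereignty + cost_per_query * queries + labour

pilot = monthly_cost(sovereignty=2_000, cost_per_query=0.05,
                     queries=5_000, labour=8_000)
production = monthly_cost(sovereignty=25_000, cost_per_query=0.03,
                          queries=500_000, labour=40_000)

print(f"Pilot:      ${pilot:,.0f}/month")       # $10,250/month
print(f"Production: ${production:,.0f}/month")  # $80,000/month
```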
The difference between a pilot that impresses executives and a production system that delivers millions in value comes down to understanding and managing these economic factors from the beginning.
The Industrial Revolution Parallel
Kiela makes a compelling comparison to the Industrial Revolution—a parallel I've explored in my own writing.
Just as machines transformed manual labour in the 18th century, displacing artisans but ultimately creating new roles and raising living standards, today's AI revolution is poised to reshape knowledge work.
The economics are undeniable. As I documented in my case study of StratEdge Consulting, replacing research analysts with AI agents led to a 99% reduction in production costs and the potential to add $3.6 million in annual revenue through increased output.
The question isn't whether this transformation will happen, but how organisations will manage it and what they do with the human capital they already have.
Bridging the Context Gap
What all of these lessons point to is a central truth: successful enterprise AI isn't just about technology—it's about bridging the context gap between general-purpose AI capabilities and the specific, domain-expert knowledge that organisations need to apply to their unique problems.
RAG is the bridge we're building between these worlds. By connecting language models to our enterprise knowledge, we create systems that combine AI's computational power with our institutional expertise. But as these ten lessons show, building this bridge requires much more than technical knowledge—it demands a systematic approach to production, a focus on real business value, and a clear-eyed view of the economics involved.
For those of us working in this space, the opportunity is enormous. But it will only be realised by those who approach AI as a system to be designed, deployed, and refined with business transformation—not just technical capability—as the north star.
The future isn't just coming. It's already here, hidden in the mist between what the grifters are posting and what executives actually want.
Until the next one, Chris.
🧰 Ready to Go Deeper?
The Agent Architect’s Toolkit is my full library of premium agent design patterns, prompts, scorecards, and case studies, built from real enterprise deployments.
If you’re serious about designing or deploying agents in production, this is where you start.