Artificial Intelligence (AI) is everywhere. It’s in our phones, in our homes, and (unfortunately) in the mouth of every LinkedIn "thought leader" who’s just discovered prompt engineering.
But one term that keeps cropping up—without much clarity—is "AI Agent."
What is an AI agent exactly, and why does it matter?
AI agents are not just chatbots with fancy names. They are autonomous systems that perceive, decide, and act in a given environment. Some are useful, some are overhyped, and some are downright pointless.
This is the first of what will (probably) be a 10-part series cutting through the fluff on what is arguably the most overhyped concept in tech.
By the end of the series, you’ll not only understand how AI agents work, but also what benefits they deliver in real-world applications. More importantly, you’ll have a good understanding of how to build an AI agent, and be clear that doing this successfully is about a lot more than coding: it’s about architecture, economics, governance and actual business impact.
So, if you want to learn about this subject from someone who’s actually building with this technology every day, stick around.
The Evolution of AI Agents
AI agents are nothing new. The idea dates back to the earliest rule-based systems, where software followed strict, pre-programmed instructions. Then came machine learning, allowing agents to make decisions based on data rather than hardcoded logic. Now, we’re in the age of LLM-powered agents, which let us build solutions that can reason, learn, and take action with minimal human oversight.
To understand the shift, let’s look at the evolution of this technology in levels.
Level 0 - Basic Automation
In the beginning, we had systems that were about as intelligent as a brick. Think of a vending machine: you pop in some coins, press a button, and out comes your snack. No thinking, no adapting—just a straightforward action-reaction setup.
Level 1 - Rule-Based Systems
Next up, we got a bit fancier with rule-based systems. These are like those old-school spam filters that check if an email contains certain keywords and then decide to toss it into the junk folder. They're following a set of "if-this-then-that" rules laid out by humans. Useful, but not exactly mind-blowing.
Level 2 - Machine Learning Models
Then came machine learning—a game-changer.
Instead of following rigid rules, these systems learn from data. It's like teaching a dog new tricks: the more you train it, the better it gets. Predictive text on your phone? That's machine learning at work, guessing your next word based on your typing habits.
Level 3 - Large Language Model (LLM)-Powered Agents
Now we're in the era of LLM-powered agents. These understand context, generate human-like text, and make decisions that seem almost... thoughtful. They're behind chatbots that can hold a decent conversation and tools that can write essays or code snippets.
Impressive, but they're still just predicting patterns based on vast amounts of data.
Levels 4 & 5 - Autonomous AI Agents
Looking ahead, we aim for fully autonomous AI agents—systems that can plan, adapt, and collaborate without human intervention. Imagine a personal assistant that not only schedules your meetings but also anticipates conflicts, reschedules on the fly, and even books your travel, all without bothering you.
This is where we are today. Kind of. This generation of agents can functionally accomplish a lot of the objectives I’ve just mentioned, but testing and scaling such technology in production is a big challenge.
So, as you can see, AI agents have come a long way from simple, rule-following programs to complex systems capable of learning and decision-making. However, despite the hype, true autonomy remains a work in progress. Today's AI agents excel in specific tasks but lack the general intelligence to operate across diverse domains without human guidance.
What AI Agents Can (and Can’t) Do
Many companies (and “experts” on LinkedIn) claim to have built "autonomous" AI agents that can do anything.
Reality check: the vast majority cannot and do not work “autonomously”.
Most current AI agents are task-specific—trained for one thing and one thing only. This video 👇 from Arseny Shatokhin does a great job of making this point - his clients want vertically integrated agentic solutions, not horizontal “Yoda runs my business” rubbish.
This shouldn't be some kind of shocking revelation to anyone in IT: an AI agent is a piece of software, and software is developed against requirements. If the person paying for it has a task-specific use case, why would they want something that can write blog posts, manage their calendar, and trade crypto at the same time?
For now, think of AI agents as being at their best when:
They operate in controlled environments. Take a customer service desk: tickets follow a defined workflow, there are established response templates, and SLAs provide clear boundaries. In such environments, AI agents can efficiently manage tasks like ticket routing, drafting initial responses, and following up on resolved issues. An agent can handle straightforward queries, escalate complex cases to human agents, and maintain consistent response times.
However, introduce the unpredictability of the real world—like a sudden crisis that changes customer needs, or complex emotional situations requiring deep empathy—and agents will need to defer to human judgment. The key is that the agent knows its operational boundaries and can work effectively within them while recognising when it needs human intervention.
They don’t require deep reasoning. When it comes to complex decision-making, such as strategic financial planning or nuanced business judgments, they're out of their depth. Your CFO's job is safe; agents aren't about to take over roles that require deep reasoning, experience, and intuition.
They have clear, quantifiable goals. AI performs best when objectives are well-defined and measurable. In fraud detection, for instance, the goal is clear: identify and flag fraudulent transactions.
However, in areas like creative work—be it writing, art, or design—success becomes subjective. Evaluating creativity isn't straightforward, and AI struggles to meet expectations in tasks where goals aren't easily quantifiable.
So What Exactly Does an AI Agent Look Like?
OK, I’ve warmed you up and made a few important points; now it’s time to get into the weeds a bit: what does the Anatomy of an AI Agent look like, and why?
Here’s my answer, based on the second generation of my own framework.
There’s a lot to unpack here, so let’s take our time with a bottom-up approach.
The problem with most AI agent implementations is that they’re not very well thought out and way too many of them use low-code no-code technology, which is about as useful as a screen door on a battleship.
In reality, a proper AI agent needs three fundamental pillars of capability, all working together like a well-oiled machine.
#1 Core Components - The Engine Room
Think of this as the agent's nervous system. Here's what's actually happening under the hood:
The Central Runtime is the conductor, but it's not just calling the shots - it's orchestrating the relationships between memory, tools, and LLM responses.
The Guardrail Manager is the bouncer - and trust me, you need one. It's not just about keeping the agent from going rogue; it's about ensuring every action follows the correct rules and governance. I’ll be writing a dedicated piece on this subject as part of this series; it’s arguably the most important element of agent development.
The Tool Manager isn't just a fancy name for a function caller; it's handling dynamic tool registration, managing execution contexts, and dealing with all the faff around argument sanitisation and dependency resolution. It's more like a workshop manager than a simple set of hands.
You can think of the Memory Manager as the Borg Hive Mind - it's not just remembering stuff; it's building up a proper knowledge base that helps the agent learn from past experiences and make better decisions. More on this in a minute.
The final component is the LLM Manager - the thinking engine that’s integrated with the rest of the system. It's not just making API calls to OpenAI or Anthropic - it's managing prompt contexts, handling different model temperatures for different types of tasks, and importantly, it knows its place in the wider architecture.
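To make this concrete, here’s a stripped-down sketch of how those five components hang together. Treat it as illustrative rather than my production code: the names and the allow-list are placeholders, and the LLM call is stubbed out.

```python
from typing import Callable

class GuardrailManager:
    """The bouncer: every proposed action is checked against policy first."""
    def approve(self, action: str) -> bool:
        return action in {"web_search", "build_document"}  # illustrative allow-list

class MemoryManager:
    """The hive mind: one knowledge base that every component reads and writes."""
    def __init__(self):
        self._records: list[tuple[str, str]] = []
    def remember(self, kind: str, content: str) -> None:
        self._records.append((kind, content))
    def recall(self, kind: str) -> list[str]:
        return [c for k, c in self._records if k == kind]

class ToolManager:
    """The workshop manager: dynamic registration, controlled execution."""
    def __init__(self):
        self._tools: dict[str, Callable] = {}
    def register(self, name: str, fn: Callable) -> None:
        self._tools[name] = fn
    def execute(self, name: str, **kwargs):
        return self._tools[name](**kwargs)

class LLMManager:
    """The thinking engine: owns prompt context and per-task model settings."""
    def plan(self, goal: str, history: list[str]) -> dict:
        # Stub: in production this is an OpenAI/Anthropic call.
        return {"action": "web_search", "args": {"query": goal}}

class CentralRuntime:
    """The conductor: orchestrates memory, guardrails, tools and the LLM."""
    def __init__(self, memory, guardrails, tools, llm):
        self.memory, self.guardrails = memory, guardrails
        self.tools, self.llm = tools, llm
    def step(self, goal: str):
        plan = self.llm.plan(goal, self.memory.recall("journal"))
        if not self.guardrails.approve(plan["action"]):
            raise PermissionError(f"Guardrail blocked: {plan['action']}")
        result = self.tools.execute(plan["action"], **plan["args"])
        self.memory.remember("journal", f"{plan['action']} -> {result}")
        return result
```

Notice that every action passes through the guardrail before the tool runs, and every result lands in shared memory. That ordering is the whole point of the architecture.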
#2 Storage - The Filing Cabinet
Your agent needs more than just a good memory. I wrote a whole piece about this recently here. Agents need skills (learned behaviours or “training” as some might call it), memories (past experiences), reflections (learning from those experiences), journals (tracking what it's done), and to-do lists (because even AI needs to stay organised).
Without this, you've just got a fancy chatbot with delusions of grandeur.
The design principle that makes all the difference with agent memory is how you make it available to all the surrounding components. Like the Borg, agents work best when they have access to a Hive Mind, with a central planner overseeing the objective. The Drones execute tasks related to the goal and report back what they’ve done, keeping everyone in sync.
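In code terms, the principle is simple: one shared store, five record types, visible to every component. A toy version (the field names are mine, purely for illustration):

```python
from dataclasses import dataclass, field

@dataclass
class AgentStore:
    """The filing cabinet: five record types, one shared store."""
    skills: list[str] = field(default_factory=list)       # learned behaviours
    memories: list[str] = field(default_factory=list)     # past experiences
    reflections: list[str] = field(default_factory=list)  # lessons drawn from them
    journal: list[str] = field(default_factory=list)      # what's actually been done
    todos: list[str] = field(default_factory=list)        # what still needs doing

hive = AgentStore()  # the central planner owns this; every drone reads and writes it
hive.todos.append("Research competitor pricing")
hive.journal.append("Searched 'competitor pricing'; 12 results kept")
hive.reflections.append("Two searches overlapped; consolidate queries first next time")
```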
Part 2 of this series will be on How an AI Agent Thinks so I’ll park this topic here for now.
#3 Tools - The Swiss Army Knife
This is where the rubber meets the road.
Some frameworks provide tools to an agent as dumb API wrappers. That works, but I’ve not designed mine like this. My tools are fully-fledged components with their own memory management, guardrails, and decision-making capabilities. This is essential to make the system work with the Hive Mind concept.
Take the Web Searcher tool. It's not just firing off Google queries - it's decomposing goals into optimal search strategies, avoiding duplicate searches by checking its memory, cleaning and consolidating content, and even reflecting on whether it's found enough information to meet the goal. Each tool is more like a skilled specialist than a simple utility.
The Document Builder isn't just spitting out Word docs either. It's taking the agent's "thoughts," applying proper enterprise formatting, managing templates, and even handling things like headers, footers, and table of contents. More importantly, it's keeping track of what it's created in the journal system, making sure we don't duplicate work and can build on previous outputs.
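Here’s the shape of that idea as a sketch, reusing the shared memory from the earlier snippet. The goal decomposition and the “have we got enough?” reflection are stubbed here; in a real build they’d be LLM calls:

```python
from typing import Callable

class WebSearcher:
    """A tool as a specialist: it has memory, guardrails and judgement of its own."""
    def __init__(self, memory, search_fn: Callable[[str], list[str]], max_queries: int = 5):
        self.memory = memory            # the shared hive-mind store
        self.search_fn = search_fn      # injected search client
        self.max_queries = max_queries  # a guardrail: bounded effort and spend

    def run(self, goal: str) -> list[str]:
        findings: list[str] = []
        for query in self.decompose(goal)[: self.max_queries]:
            if query in self.memory.recall("searched"):  # no duplicate searches
                continue
            findings.extend(self.search_fn(query))
            self.memory.remember("searched", query)
            if self.sufficient(goal, findings):  # reflect: enough to meet the goal?
                break
        return findings

    def decompose(self, goal: str) -> list[str]:
        return [goal]  # stub: an LLM call breaks the goal into search strategies

    def sufficient(self, goal: str, findings: list[str]) -> bool:
        return len(findings) >= 10  # stub: an LLM call judges coverage against the goal
```

Injecting the search client keeps the tool testable, and the query cap is a cheap guardrail against runaway API spend.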
I’ve Not Seen it Done Like this Before…
No, I’m sure you haven’t…
The simpler definitions of AI agents usually characterise them as being decision-making LLM systems, the LLM being the star of the show. I don’t see it that way at all. I see agents as comprehensive software ecosystems with UI options, guardrails (bounded agency), memory, learning, reflection, and tool manipulation capabilities.
What’s more, by breaking the agent architecture into microservices along these three pillars, we're not just following some trendy architectural pattern - we're solving actual problems. Each component can scale independently, security can be properly managed, and Bob's your uncle: you've got a system that actually works in the real world, not just in some consultant's PowerPoint deck.
The beauty of this approach is that it acknowledges something most vendors won't tell you: you don't need (and probably can't afford) to train your own LLM(s) for your needs. Instead, use the LLM for inference, and build smart cognitive architecture around it.
But Wait - There’s More…
Looking at the top of our diagram, you'll spot something that separates proper enterprise AI agents from the chatbot brigade. This is where we get into the real meat of human-AI collaboration.
Direct User Interface - Keeping Humans in the Loop
Let's talk about Slack for a minute, because it's an absolute game-changer for AI agent interfaces.
First off, it's where teams of people already live. Over 750,000 companies use it. But more importantly, it's perfect for that back-and-forth between humans and AI that you actually need in the real world:
Agents can ping you when they're stuck or need clarification
You can monitor task progress in real-time through updates
You can jump in and course-correct if something's going sideways
Multiple humans can collaborate with a single agent at the same time
And because it's all happening in Slack, you've got a complete audit trail of every interaction, decision, and adjustment.
There’s one particular Slack superpower I use with my framework that really sets it apart: Slash Commands. These little shortcuts are brilliant for interacting with an agent to see what it’s working on and how far it’s got with an objective, as shown below.
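To show how little plumbing this takes, here’s a minimal sketch using Slack’s Bolt for Python. The command name and the progress lookup are placeholders, not my framework’s actual API:

```python
from slack_bolt import App  # pip install slack-bolt

app = App(token="xoxb-...", signing_secret="...")  # your Slack app credentials

def current_objective() -> dict:
    # Placeholder: the real lookup reads the agent's journal and to-do store.
    return {"goal": "Research competitor pricing", "progress": 60}

@app.command("/agent-status")  # placeholder command name
def agent_status(ack, respond):
    ack()  # Slack expects an acknowledgement within three seconds
    obj = current_objective()
    respond(f"Working on: {obj['goal']} ({obj['progress']}% complete)")

if __name__ == "__main__":
    app.start(port=3000)
```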
Scheduling - The Autonomous Bit
Your agent isn't just sitting there waiting for commands like some over-caffeinated intern. Being able to operate an agent via scheduling means it can:
Run automated processes at specific times
Handle recurring tasks without human prompting
Manage its own workload
Coordinate with external systems
And crucially, when something goes wrong (because something always goes wrong), it knows it can pop up in a Slack channel and ask for help. None of the silent-failure nonsense that plagues most automated systems.
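The scheduling layer itself needn’t be exotic. Here’s a minimal sketch using APScheduler, with the agent entry point and the Slack escalation stubbed out:

```python
from apscheduler.schedulers.blocking import BlockingScheduler  # pip install apscheduler

scheduler = BlockingScheduler()

def run_agent_objective(goal: str) -> None:
    ...  # stub: hand the goal to the agent's central runtime

def post_to_slack(channel: str, text: str) -> None:
    ...  # stub: chat_postMessage via the Slack SDK in practice

@scheduler.scheduled_job("cron", hour=2)  # illustrative schedule: 02:00 every night
def overnight_run() -> None:
    try:
        run_agent_objective("Compile the overnight ticket digest")
    except Exception as exc:
        # No silent failures: escalate to a human in Slack.
        post_to_slack("#agent-ops", f"Overnight run failed and needs a human: {exc}")

if __name__ == "__main__":
    scheduler.start()
```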
External Systems Integration - Making Things Useful
Finally, we've got the integration with your existing enterprise systems. This isn't just about having an API - it's about proper, robust integration that works in the real world. Your agent needs to play nice with your CRMs, ERPs, and whatever other acronyms are keeping your lights on.
It also means working with external systems as a human would. I like to schedule agent tasks that run periodically and give me information that fits my workflow. Sometimes that means something as boring as communicating with me via email.
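Boring, but trivially cheap to wire up. A sketch using nothing but Python’s standard library (the addresses, host and credentials are all placeholders):

```python
import smtplib
from email.message import EmailMessage

def email_digest(summary: str, to_addr: str) -> None:
    """Deliver the agent's periodic findings the old-fashioned way."""
    msg = EmailMessage()
    msg["Subject"] = "Agent digest"
    msg["From"] = "agent@example.com"  # placeholder sender
    msg["To"] = to_addr
    msg.set_content(summary)
    with smtplib.SMTP("smtp.example.com", 587) as smtp:  # placeholder SMTP host
        smtp.starttls()
        smtp.login("agent@example.com", "app-password")  # placeholder credentials
        smtp.send_message(msg)
```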
Human-AI Collaboration Done Right
The beauty of this three-pronged approach is that it creates a proper collaborative environment. Your agent isn't just a black box doing its own thing - it's a team member that knows when to work autonomously and when to ask for help. It's like having a really efficient colleague who never sleeps, doesn't mind doing the repetitive bits, but knows their limitations and isn't too proud to ask for clarification.
This is what separates proper enterprise AI agents from the toys you see being hawked on Twitter. It's not just about having a clever conversation - it's about getting actual work done, whether that's autonomous overnight processing or active collaboration with your team during the day.
Curious About What GenAI Could Do for You?
If this article got you thinking about AI agents and their real impact, you’re not alone. Many readers are exploring this new frontier but struggle to separate reality from hype.
That’s exactly why I built ProtoNomics™—a risk-free way to validate GenAI feasibility before you commit resources. No hype. No sales pitch. Just data-driven insights to help you make an informed decision.
If you’re interested, I now run a limited number of GenAI Readiness Assessments each month. If you'd like to see what this technology could do for your business, you can Learn More Here
Or, if you're “just here for the tech” the next article in the series is 👇
Next Time on The Anatomy of an AI Agent
Part 2: How AI Agents Think
Next week we'll rip apart my agent's cognitive stack. We'll look at how short-term memory works alongside long-term vector storage to create an actual working memory system. We'll explore how agents use reflection to improve their performance, and how they maintain their own journal of experiences. Plus, we'll dive into the guardrails system that keeps them from going off the rails (pun absolutely intended).
If you think building an agent is just about chaining some LLM calls together, you're in for a proper wake-up call because I’ll show you that the real magic isn't in the LLM at all. It's in everything that goes on around it.
Until the next one, Chris.
Enjoyed this post? Please share your thoughts in the comments or spread the word by hitting that Restack button.