AI agents are powerful—but often overused. This piece explains the real architectural differences between single-step AI, workflows, and agents, and shows why most production systems don’t need autonomy to succeed.
If you listen to industry commentary, you would think that the answer to every technical problem your business might run into is to write an AI agent. Product launches, demos, vendor roadmaps, they all emphasize autonomy. Conferences highlight agents coordinating tools, agents coordinating other agents, and agents coordinating entire workflows.
Thing is, most organizations aren't building agents.
Most organizations shouldn't be building agents.
That's not about ambition or sophistication. It’s about architecture, risk, and, frankly, purpose. Understanding what you are actually building, and what you are not, is one of the highest-leverage decisions in modern AI engineering.
Why everything suddenly looks like an agent
“Agent” has become shorthand for “real AI.” In marketing and research, it signals autonomy, reasoning, and long-term planning. In practice, it has become an overloaded term used to describe:
- any system that calls an LLM more than once (think chatbot)
- any system that uses tools
- any system that chains prompts
- any system that feels “smart”
Teams assume that moving from a demo to production requires building an agent, and vendors encourage that assumption because agents sound strategic, differentiated, and defensible.
But most production systems don't need autonomy. They need structured reasoning inside deterministic boundaries.
Let's look at what this has to do with agents (and what it doesn't).
What an AI agent actually is (in architectural terms)
An AI agent is not defined by how many model calls it makes. It is defined by who owns control flow.
An agent is a system where the model is allowed to:
- decide which step comes next
- select tools dynamically
- loop until it decides the task is complete
- change strategy mid-execution
That is a fundamentally different architectural pattern from a pipeline or workflow.
In traditional software, control flow lives in code. In agents, control flow is partially delegated to a probabilistic system.
That delegation is the entire point, and the entire risk.
Three shapes of AI systems (and why the distinction matters)
Once you strip away marketing language, most AI-enabled systems fall into one of three architectural shapes. The confusion around “agents” comes from the fact that these shapes often get lumped together, even though they behave very differently once they’re deployed and maintained.
Understanding the distinction isn’t academic. Each shape carries different assumptions about control, risk, and responsibility — and choosing the wrong one early can quietly make a system harder to ship and harder to trust.
1. Single-step AI systems (function calls with uncertainty)
The simplest shape is also the most common: a single model call that produces a result and returns it.
This includes things like:
- summarization
- classification
- extraction
- rewriting
- scoring
From an architectural perspective, these systems behave much like traditional function calls — except the function is probabilistic. Control flow is deterministic. The uncertainty lives entirely inside the output.
That distinction is important. Because control flow remains in code, these systems integrate cleanly with existing software practices. You can test them, monitor them, and roll them back without inventing new operational models. Failures tend to be localized. When something goes wrong, you know which call produced the output.
This is why single-step AI is often the fastest way to deliver real value. It augments existing systems instead of reshaping them. And for many use cases, it’s not a stepping stone — it’s the right end state.
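To make the shape concrete, here is a minimal sketch in Python. The `call_model` helper is a hypothetical stand-in for whatever LLM client you use; everything around it is ordinary, testable code.

```python
from dataclasses import dataclass


# Hypothetical stand-in for an LLM client call; swap in your provider's SDK.
def call_model(prompt: str) -> str:
    return "Stubbed summary of the ticket."


@dataclass
class TicketSummary:
    text: str
    model_generated: bool = True


def summarize_ticket(ticket_body: str) -> TicketSummary:
    """Single-step AI: one probabilistic call, deterministic control flow around it."""
    raw = call_model(f"Summarize this support ticket in two sentences:\n{ticket_body}")
    summary = raw.strip()
    if not summary:
        # The probabilistic output failed a deterministic check: fall back, don't loop.
        return TicketSummary(text=ticket_body[:200], model_generated=False)
    return TicketSummary(text=summary)
```

The uncertainty is confined to the string that comes back from the model; the fallback, the return type, and the control flow are all plain code you can unit test.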
Problems start when teams assume this shape is “too simple” and prematurely reach for something more complex.
2. Multi-step workflows (deterministic orchestration)
The second shape introduces coordination, not autonomy.
Multi-step workflows are used when a single model call isn’t sufficient, but the sequence of steps is still known in advance. A typical sequence looks like this:
- retrieve context
- call model
- validate output
- call downstream tool
- aggregate results
The key characteristic in this case is that the structure of the process is explicit and control flow still lives in code. The model contributes content, not decisions about what happens next.
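A rough sketch of that shape, again assuming a hypothetical `call_model` stand-in plus placeholder helpers (`retrieve_context`, `passes_policy_checks`, `update_crm`). The point is that every branch is written down in code, and the model only fills in content:

```python
# Placeholder helpers so the sketch runs; the names are illustrative, not a real SDK.
def retrieve_context(ticket: str) -> str:
    return "relevant knowledge-base articles would go here"

def call_model(prompt: str) -> str:
    return "drafted reply (stubbed model output)"

def passes_policy_checks(draft: str) -> bool:
    return bool(draft.strip())

def update_crm(ticket: str, draft: str) -> dict:
    return {"ticket": ticket, "updated": True}


def run_support_workflow(ticket: str) -> dict:
    """Deterministic orchestration: the steps and their order are fixed in code.

    The model contributes content at specific points; it never chooses which
    step runs next.
    """
    context = retrieve_context(ticket)                                 # 1. retrieve context
    draft = call_model(                                                # 2. call the model
        f"Context:\n{context}\n\nTicket:\n{ticket}\n\nDraft a reply."
    )
    if not passes_policy_checks(draft):                                # 3. validate output
        # Validation failure is an explicit, code-owned branch, not a model decision.
        return {"status": "escalated", "reason": "failed policy check"}
    crm_result = update_crm(ticket, draft)                             # 4. call a downstream tool
    return {"status": "resolved", "reply": draft, "crm": crm_result}   # 5. aggregate results
```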
This distinction makes workflows much easier to reason about than agents. You can trace execution paths. You can define clear failure modes. You can decide exactly where validation happens and what triggers retries or escalation. From an operational standpoint, workflows still fit comfortably inside traditional SDLC practices.
In practice, this is where the majority of production AI systems belong — even ones that are often described as “agentic.” They may use multiple model calls and tools, but the system itself is not autonomous. It is orchestrated.
Teams often underestimate how far workflows can go. They reach for agents because the system feels complex, when what they actually need is clearer structure.
3. Agents (delegated control flow)
Agents are different not because they use more AI, but because they change who is allowed to decide.
In an agent-based system, the model is allowed to determine what happens next at runtime. It may choose which tool to call, whether to retry, whether to loop, or when a task is “done.” The execution path is not fully knowable in advance.
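Contrast that with a stripped-down agent loop. This is a sketch, not a production pattern, and the names (`call_model`, `TOOLS`) are placeholders; the detail that matters is that the model's output, not your code, decides which branch runs next and when the loop ends.

```python
# Placeholder tool registry and model call; the names and shapes are illustrative only.
TOOLS = {
    "search_logs": lambda query: f"log lines matching {query!r}",
    "read_ticket": lambda ticket_id: f"contents of ticket {ticket_id}",
}

def call_model(history: list) -> dict:
    # Stub: a real implementation would ask the LLM to choose its next action.
    return {"action": "finish", "answer": "stubbed final answer"}


def run_agent(task: str, max_steps: int = 10) -> str:
    """Delegated control flow: the model picks the next action at runtime."""
    history = [{"role": "user", "content": task}]
    for _ in range(max_steps):                 # a hard step cap is one of the few code-owned limits
        decision = call_model(history)         # the model decides: which tool, or whether to stop
        if decision["action"] == "finish":
            return decision["answer"]
        tool = TOOLS.get(decision["action"])
        if tool is None:
            history.append({"role": "system", "content": "unknown tool, try again"})
            continue
        result = tool(decision.get("input", ""))
        history.append({"role": "tool", "content": result})
    return "stopped: step budget exhausted"
```

Even in this toy version, the only guarantees the code can make are the step budget and the contents of the tool registry. Everything else is up to the model.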
That autonomy is powerful — and expensive.
When you delegate control flow to a probabilistic system, you accept non-deterministic behavior by design. Failures become harder to reproduce. Testing shifts from deterministic correctness to statistical confidence. Observability stops being optional. Governance moves from pre-approval to monitoring and intervention.
None of this makes agents bad. But it makes them organizationally demanding.
Agents are appropriate when the problem space is genuinely open-ended — when the set of possible next steps can’t be enumerated ahead of time, and when human-like exploration or adaptation is required. That’s a much narrower category than most teams assume.
When agents are appropriate — and when they aren’t
So what kinds of problems actually justify an agent? We can talk about all kinds of impressive-sounding scenarios, but it all comes down to the underlying requirement that makes an agent necessary: runtime decision-making in an open-ended space.
If you don't need that, an agent is usually the wrong abstraction, no matter how sophisticated the use case sounds.
Use cases that actually justify agents
Agents are appropriate when the system has to explore, adapt, or decide between many possible next steps without those steps being fully specifiable in advance.
A classic example is complex task execution across heterogeneous systems. Imagine a system responsible for investigating an operational issue across logs, metrics, tickets, documentation, and live systems. The exact sequence of actions depends on what it discovers along the way. Hard-coding that logic would be brittle and incomplete. An agent can reason about partial information, decide what to check next, and adjust its strategy as new signals appear.
Another valid case is research-style problem solving, where the goal is not a single deterministic output but a synthesis of information across many different sources. Market analysis, competitive research, or technical investigation often fall into this category. The agent’s value comes from deciding which thread to pull next, not just generating text.
Agents can also make sense in long-running, human-in-the-loop workflows. For example, drafting and refining a complex proposal where requirements change mid-process, or coordinating actions across teams and systems over time. In these cases, autonomy is constrained but real: the agent manages progress, context, and next steps while humans remain in control of high-risk decisions.
In all of these scenarios, the defining feature is not sophistication, it’s uncertainty about the process itself.
Use cases that are often mistaken for agent problems
What we see in the wild is that a lot of common AI use cases look “agentic” on the surface but don’t actually require autonomy.
Take customer support automation. Most support workflows follow known patterns: retrieve relevant information, generate a response, possibly escalate. The branching logic can be expressed deterministically. Introducing an agent often makes these systems harder to test and reason about without improving outcomes.
The same is true for document processing pipelines, most of which involve summarization, extraction, classification, and compliance checks. Yes, these use an LLM, and yes, these are multi-step problems. But each LLM call is isolated, and the steps are known. A workflow is sufficient, and very often superior.
Another frequent mismatch appears in internal productivity tools. Teams build “agents” to write emails, generate reports, or prepare presentations. In reality, these systems are performing structured transformations with light iteration. The agent framing adds complexity without adding meaningful capability.
In these cases, what teams usually need is better orchestration, validation, and integration — not autonomy.
A useful litmus test for real-world scenarios
A simple way to evaluate a proposed use case is to ask:
If we had to write down all possible next steps in advance, could we do it without contorting ourselves?
If the answer is yes, you’re probably looking at a workflow problem.
Another helpful question is:
Would we feel comfortable explaining exactly what this system can do, and under what conditions, to a skeptical stakeholder?
If the answer is no, the issue is rarely “we need an agent.” More often, it’s that the problem hasn’t been constrained enough yet.
Agents don’t replace clarity. They demand more of it.
Why misclassifying use cases causes real harm
Treating non-agent problems as agent problems has predictable consequences.
Systems become harder to test because execution paths aren’t explicit. Failures become harder to reproduce. Teams spend time debugging “reasoning” when the real issue is missing structure. Over time, confidence erodes: not in the model, but in the system as a whole.
Worse, autonomy starts to feel risky rather than empowering. Teams respond by locking systems down, freezing behavior, or abandoning them entirely.
This is not a failure of AI. It’s a failure of matching the abstraction to the problem.
The inverse mistake: avoiding agents when they’re warranted
The opposite mistake also happens, though less often: teams avoid agents even when the problem clearly requires adaptive behavior.
This usually shows up as sprawling, fragile workflows with dozens of conditionals, retries, and special cases. The code becomes a poor imitation of reasoning. Each new edge case adds complexity, and the system becomes harder to evolve.
In these situations, the resistance to agents is often cultural rather than technical — fear of unpredictability, lack of observability, or unclear accountability. Those concerns are valid. But the answer is not to contort workflows beyond recognition. It’s to introduce constrained autonomy with the right guardrails.
Matching use cases to system shape
Seen clearly, the alignment looks like this:
- Known steps, known outcomes → single-step AI or workflows
- Known goal, unknown path → candidate for an agent
- High-risk actions, strict boundaries → workflows with validation
- Exploratory or investigative tasks → agents with constraints
The important thing is not memorizing categories. It’s recognizing that autonomy is a cost you pay to handle uncertainty. If the uncertainty isn’t real, the cost isn’t worth it.
I'm not trying to discourage you from using agents. Agents are powerful when they’re solving the right kind of problem. They are liabilities when they’re used as a shortcut around design work or as a proxy for sophistication.
A safer default: deterministic structure first
For most teams, the safest and most effective path is not to start with autonomy, but to earn it gradually. AI systems don’t need to be fully agentic to deliver value. In fact, the majority of successful systems never become agents at all.
A more realistic progression begins with single-step AI, where a model augments a specific function (e.g., summarization, extraction, classification) without altering the surrounding system. When a single call is no longer sufficient, teams move to explicit workflows, coordinating multiple steps with clear control flow and well-defined failure modes. Only when those workflows begin to break down, because the system genuinely can't know the next step in advance, does autonomy become a reasonable consideration.
Think of it as an incremental delegation of control.
You wouldn’t give an intern production root access on their first day. (Or at least, I hope you wouldn't.) You’d start with bounded responsibility, observe behavior, and expand access over time. A probabilistic system deserves the same discipline. Autonomy is something you grant deliberately, not something you assume by default.
What constrained agents actually look like
Even when autonomy is warranted, unconstrained agents are rarely appropriate. The agents that succeed in production are not free-form problem solvers. They are carefully bounded systems designed to operate within explicit limits.
In practice, constrained agents tend to share a few characteristics. They operate inside known state machines, so while they may decide what to do next, the space of possible actions is controlled. Their access to tools is limited and intentional, rather than open-ended. They validate outputs before triggering side effects, and they escalate to humans when decisions cross predefined risk thresholds. Crucially, they log every decision path so behavior can be inspected, understood, and corrected.
What this means architecturally is that these agents are not replacements for workflows. They are agents embedded within workflows. The surrounding structure provides guardrails; the agent provides adaptability where deterministic logic would become brittle.
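A rough sketch of what that embedding can look like. The action names, risk thresholds, and `propose_next_action` helper are all assumptions for illustration, not a reference design:

```python
import logging

logger = logging.getLogger("constrained_agent")

# Explicit allowlist: the agent may only propose these actions.
ALLOWED_ACTIONS = {"lookup_order", "draft_refund", "escalate_to_human"}
HIGH_RISK_ACTIONS = {"draft_refund"}  # anything here requires human sign-off

def propose_next_action(state: dict) -> dict:
    # Stub for the model call that proposes the next action; illustrative only.
    return {"action": "escalate_to_human", "reason": "stubbed proposal"}


def run_constrained_agent(state: dict, max_steps: int = 5) -> dict:
    """An agent embedded in a workflow: it adapts, but only within limits the code defines."""
    for step in range(max_steps):
        proposal = propose_next_action(state)
        logger.info("step=%d proposal=%s", step, proposal)   # every decision path is logged

        action = proposal.get("action")
        if action not in ALLOWED_ACTIONS:
            # The action space is closed; anything outside it is rejected, never executed.
            logger.warning("rejected out-of-bounds action: %s", action)
            continue
        if action == "escalate_to_human":
            return {"status": "escalated", "reason": proposal.get("reason")}
        if action in HIGH_RISK_ACTIONS:
            # Validate before side effects: high-risk steps pause for human approval.
            return {"status": "pending_approval", "proposal": proposal}

        # Low-risk actions execute inside the surrounding workflow's guardrails.
        state[action] = f"executed {action} (stubbed)"
    return {"status": "stopped", "reason": "step budget exhausted", "state": state}
```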
That distinction is easy to miss, and expensive to ignore.
How this affects your SDLC
Choosing whether or not to build an agent has real consequences for how your software development lifecycle functions.
With workflows, familiar practices still apply. Unit and integration tests remain meaningful because execution paths are deterministic. Staging environments behave similarly to production. Rollbacks are straightforward because behavior changes can be reversed in predictable ways.
Agents change that equation. Testing becomes distributional; you’re no longer validating a single expected output, but a range of acceptable behaviors across many runs. Staging environments have to simulate probabilistic behavior to be useful. Rollback is no longer just about reverting code; it’s about reverting behavior, which may involve prompts, state, or model configuration.
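As an illustration of what distributional testing can look like, here is a pytest-style sketch: it runs the behavior many times and asserts a pass rate rather than a single expected output. The `run_agent_once` helper and both thresholds are assumptions, not a standard.

```python
# Sketch of a distributional test: assert a pass rate across many runs rather
# than one exact output. run_agent_once and the thresholds are placeholders.

def run_agent_once(task: str) -> dict:
    # Stub standing in for one end-to-end run of the system under test.
    return {"resolved": True, "used_forbidden_tool": False}


def test_refund_requests_resolve_often_enough():
    runs = [run_agent_once("customer requests a refund for a delayed order") for _ in range(50)]

    success_rate = sum(r["resolved"] for r in runs) / len(runs)
    violations = sum(r["used_forbidden_tool"] for r in runs)

    assert success_rate >= 0.95   # statistical confidence, not exact-match correctness
    assert violations == 0        # hard safety properties still get deterministic assertions
```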
This isn't better or worse, but it is fundamentally different. And organizations that adopt agents without acknowledging this shift often find themselves with systems that don’t fit their existing delivery and governance practices.
That mismatch is where many AI initiatives quietly stall.
The psychological pressure to build agents
There’s also a social dimension to this decision that technical discussions often ignore.
“Agent” has become a status signal. It implies modernity, sophistication, and strategic relevance. Saying “we’re building workflows” can feel behind the curve, even when workflows are the correct engineering choice.
That pressure is real, and resisting it is a strategic act.
Choosing reliability over narrative is not a lack of ambition. It’s a commitment to shipping systems that can be understood, trusted, and evolved. The most effective AI teams are often the least performative about how “agentic” their systems are.
What this unlocks when you admit you’re not building an agent
Once teams stop defaulting to agents, a lot of friction disappears.
Architecture simplifies because control flow is explicit. Testability improves because behavior is easier to reason about. Governance becomes tractable because risk boundaries are clearer. Tool selection becomes constrained instead of overwhelming, because entire categories of tools are no longer relevant.
Most importantly, you regain deterministic leverage over probabilistic components. You decide where uncertainty can exist, and where it can't.
That, more than any specific framework or tool, is the core engineering challenge of AI systems.

