The agent control plane is the product
Most agent rollouts start with a conversation about models. Which model is smarter? Which coding agent feels faster? Which prompt gets a cleaner diff?
Those questions matter, but they are not the production question.
The production question is uglier and more useful: what surrounds the agent when it acts?
If the answer is “a prompt and a hopeful engineer”, the team is still at demo maturity. A production agent needs a control plane: scope, identity, tool boundaries, evals, observability, approvals, and rollback. Without that, the agent may still be impressive. It is just not something you can scale with confidence.
This is the overlap between the two books I am selling now. Claude Code: Building Production Agents That Actually Scale deals with the engineering loop: task contracts, MCP, evals, observability, cost controls, review packets, and rollback. Securing Enterprise AI Agents deals with the authority loop: identity, delegated permissions, RAG and MCP boundaries, audit evidence, policy gates, and incident readiness. If you own both problems, the Enterprise AI Agents in Production bundle is the simplest place to start.
Prompts do not carry production responsibility
A strong prompt can make an agent behave better for one run. It can ask for a plan, limit the file list, require tests, and tell the agent to stop before risky actions.
That is useful. It is not enough.
A prompt is an instruction inside the run. A control plane is the system around the run. The difference matters when the agent forgets, improvises, hits a tool error, or finds a tempting shortcut. Production systems cannot rely on the agent remembering the boundary. They need the boundary to exist outside the agent.
For Claude Code, that means the task contract is written down before the run starts. The working directory is deliberate. MCP tools are scoped. Expensive loops have stop rules. The output includes a review packet alongside the diff.
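To make "the boundary exists outside the agent" concrete, here is a minimal sketch of a task contract checked by a guard that sits between the agent and its tools. Every name in it (TaskContract, ContractGuard, the tool names, the paths) is hypothetical and chosen for illustration. It is not part of Claude Code or MCP. It only shows the shape of the idea.

```python
from dataclasses import dataclass
from pathlib import PurePosixPath
from typing import FrozenSet, Optional


@dataclass(frozen=True)
class TaskContract:
    """Hypothetical contract, written down before the run starts."""
    task_id: str
    working_dir: str               # the only directory the run may write to
    allowed_tools: FrozenSet[str]  # scoped tool names (illustrative)
    max_tool_calls: int            # stop rule for expensive loops


class ContractViolation(Exception):
    """Raised when a tool call falls outside the contract."""


class ContractGuard:
    """Checks each tool call against the contract, outside the agent."""

    def __init__(self, contract: TaskContract) -> None:
        self.contract = contract
        self.calls_made = 0

    def check_tool_call(self, tool: str, write_path: Optional[str] = None) -> None:
        if self.calls_made >= self.contract.max_tool_calls:
            raise ContractViolation("stop rule hit: tool-call budget exhausted")
        if tool not in self.contract.allowed_tools:
            raise ContractViolation(f"tool not in contract: {tool}")
        if write_path is not None and not self._inside_working_dir(write_path):
            raise ContractViolation(f"write outside working dir: {write_path}")
        # Only calls that pass every check count against the budget.
        self.calls_made += 1

    def _inside_working_dir(self, path: str) -> bool:
        root = PurePosixPath(self.contract.working_dir)
        target = PurePosixPath(path)
        # Refuse '..' segments instead of trying to resolve them.
        if ".." in target.parts:
            return False
        return target == root or root in target.parents
```

The design choice that matters is where the check lives. The agent can forget the prompt. It cannot forget a guard it never controlled.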
For enterprise agents, the same principle becomes security language. The agent has a named identity. Its permissions are delegated, not borrowed from a human. Retrieval respects source permissions. Tool calls leave evidence. Policy decides what the agent may do without a person.
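One way to picture "retrieval respects source permissions" is a filter that only returns documents the agent's delegated groups are allowed to read. This is a sketch under assumptions: AgentIdentity, Document, and the group names are hypothetical, and a real system would enforce this in the retrieval layer or the source system itself.

```python
from dataclasses import dataclass
from typing import FrozenSet, List


@dataclass(frozen=True)
class AgentIdentity:
    """Hypothetical named identity with explicitly delegated access."""
    name: str                        # how the agent appears in the audit trail
    delegated_by: str                # the human or service that delegated access
    delegated_groups: FrozenSet[str]


@dataclass(frozen=True)
class Document:
    """Hypothetical retrievable document carrying its source permissions."""
    doc_id: str
    allowed_groups: FrozenSet[str]


def retrievable(agent: AgentIdentity, docs: List[Document]) -> List[Document]:
    """Return only documents whose source permissions overlap the delegation."""
    return [d for d in docs if d.allowed_groups & agent.delegated_groups]
```

The agent never sees a document because the delegating human could. It sees a document because its own delegation covers it.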
Different vocabulary. Same problem.
The seven checks I would run before scaling an agent
Before giving an agent more autonomy, I would ask seven questions. Not because seven is magic, but because these are the places agent programs usually leak risk.
Scope: What may this run read, write, call, or change?
Identity: Who does the agent act as in the audit trail?
Tools: Which MCP methods, APIs, and shell commands are allowed?
Evals: What catches a bad plan or bad output before it ships?
Observability: What record explains the run after the fact?
Approvals: Which decisions require a human gate?
Rollback: How do we undo the change if the agent was wrong?
If you cannot answer those, do not scale the agent yet. You can still experiment. You can still learn. But do not pretend you have an operating model.
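The seven checks can be turned into a gate rather than a discussion. A minimal sketch, assuming each workflow has a dictionary of written answers keyed by check name. The function names are mine, and an empty or missing answer counts as unanswered.

```python
from typing import Dict, List

# The seven checks from the list above, in the same order.
SEVEN_CHECKS = (
    "scope",
    "identity",
    "tools",
    "evals",
    "observability",
    "approvals",
    "rollback",
)


def unanswered_checks(answers: Dict[str, str]) -> List[str]:
    """Return the checks that have no written answer yet."""
    return [check for check in SEVEN_CHECKS if not answers.get(check, "").strip()]


def ready_to_scale(answers: Dict[str, str]) -> bool:
    """A workflow is ready only when every check has an answer."""
    return not unanswered_checks(answers)
```

This does not judge whether an answer is any good. It only makes a missing answer impossible to ignore.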
The weak answer is usually obvious. “It uses my credentials.” “It can call all the tools because we have not split the permissions yet.” “We look at the logs if something goes wrong.” “Rollback is just git revert.” Those answers may work for a small code edit. They do not hold when the agent touches customer data, release pipelines, security policy, billing, or production operations.
The control plane should be boring
The best agent control plane does not feel futuristic. It feels like disciplined engineering.
A task has an owner. The agent has a scoped identity. Tools have names, permissions, and reasons. Risky actions stop for approval. Evals run before merge. The run record captures what the agent saw, what it changed, what it tried, what failed, and how a human reviewed it. Rollback is part of the plan before anyone needs it.
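A run record along these lines could be as plain as one data structure per run. The sketch below is an assumption about shape, not a standard schema. The field names are hypothetical and map one-to-one onto the sentence above.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List


@dataclass
class RunRecord:
    """Hypothetical record of one agent run, kept outside the agent."""
    task_owner: str
    agent_identity: str
    rollback_plan: str                                      # written before the run
    inputs_seen: List[str] = field(default_factory=list)    # what the agent saw
    changes_made: List[str] = field(default_factory=list)   # what it changed
    attempts: List[str] = field(default_factory=list)       # what it tried
    failures: List[str] = field(default_factory=list)       # what failed
    human_review: str = ""                                  # how a human reviewed it

    def is_reviewable(self) -> bool:
        """True once an owner, a rollback plan, and a review all exist."""
        return bool(self.task_owner and self.rollback_plan and self.human_review)

    def to_json(self) -> str:
        """Serialize with stable key order so records are easy to diff."""
        return json.dumps(asdict(self), sort_keys=True)
```

Making the rollback plan a required field is deliberate. The record cannot be created without it.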
That sounds boring because it is. Boring is good here. I have spent enough time around financial-services systems to trust boring controls more than clever demos. When something breaks, nobody asks whether the prompt was elegant. They ask who approved the action, what the agent was allowed to touch, what evidence exists, and how quickly the team can reverse the damage.
A control plane answers those questions before the incident review starts.
Why this becomes a buying problem
Teams often buy or adopt the agent first, then discover the operating work afterward. That order creates pressure. The demo is already popular. Engineers want more access. Leaders want speed. Security arrives with uncomfortable questions and gets framed as the blocker.
That is backwards. The control plane is not a brake on the agent. It is what lets the agent leave the playground.
If the control plane is missing, every increase in autonomy increases anxiety. If the control plane is visible, autonomy can grow in smaller, safer steps. The team can say, “This class of change is safe to automate because the evidence says so.” It can also say, “This class of change still needs approval because the blast radius is too high.”
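Those two sentences describe a policy table. Here is a minimal sketch, assuming changes are tagged with a class before the agent acts. The class names and the table contents are hypothetical examples. The one deliberate choice is that anything not listed defaults to approval.

```python
from enum import Enum
from typing import Dict


class Decision(Enum):
    AUTO = "auto"
    NEEDS_APPROVAL = "needs_approval"


# Hypothetical policy: classes move to AUTO only once the evidence supports it.
POLICY: Dict[str, Decision] = {
    "docs_edit": Decision.AUTO,
    "test_only_change": Decision.AUTO,
    "release_pipeline": Decision.NEEDS_APPROVAL,
    "customer_data": Decision.NEEDS_APPROVAL,
}


def decide(change_class: str) -> Decision:
    """Unknown change classes are treated as high blast radius."""
    return POLICY.get(change_class, Decision.NEEDS_APPROVAL)
```

Growing autonomy then means moving one entry from NEEDS_APPROVAL to AUTO, with a reason attached, instead of widening the agent's access all at once.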
That is the line between agent enthusiasm and agent operations.
A practical next step
Take one agent workflow you already run. Write down the seven checks from this post. Do not redesign the whole platform. Just answer them for that workflow.
If the gap is mainly in the engineering loop, start with Claude Code: Building Production Agents That Actually Scale. Kindle readers can get it directly on Amazon.
If the gap is identity, governance, MCP security, RAG boundaries, audit evidence, or policy, read Securing Enterprise AI Agents or go directly to Leanpub.
If your team owns both the delivery loop and the security loop, get the Enterprise AI Agents in Production bundle. It connects the work engineers need to ship useful agents with the evidence security and risk teams need before autonomy becomes liability.