The agent rollout needs one operating model

Most agent rollouts split too early.

Engineering talks about speed. Security talks about control. Risk talks about evidence. Leadership wants to know when the pilot becomes useful enough to justify the cost. Everyone is looking at the same system, but each group ends up with a different playbook.

That split is where production agent work gets messy.

Claude Code can help a team move faster through real engineering work: task contracts, scoped repo changes, tests, tool receipts, review packets, evals, and rollback notes. Enterprise AI agents bring a different set of questions: identity, delegated authority, MCP boundaries, RAG access, approval gates, audit trails, and incident response.

Those are not separate conversations for long. If the same agent can read a repository, call tools, inspect tickets, summarize incidents, or influence a pull request, the delivery loop and the security loop are touching the same thing.

That is the buying reason for Thomas De Vos’s two books. Claude Code: Building Production Agents That Actually Scale gives the engineering operating loop. For most public readers, the Claude Code path starts with the Kindle edition through the book page. LeanPub stays useful for readers who prefer update-friendly formats. Securing Enterprise AI Agents gives the control model around agent authority. The Enterprise AI Agents in Production bundle is for teams that need both at once.


The dangerous handoff is between “useful” and “allowed”

A small Claude Code trial can be informal. A developer asks for a test, a refactor, or a narrow bug fix. The agent proposes a patch. The human reviews it. Fine.

The handoff changes when the workflow becomes normal team behavior.

Now the agent may have access to internal docs, issue trackers, build logs, MCP tools, local commands, package managers, browser sessions, or pull request automation. The question is no longer only whether the agent produced a good diff. The question is whether the organization knows what it allowed the agent to do.

A useful run needs answers like these:

Who delegated the task?
What was the task contract?
Which files, tools, and data sources were in scope?
Which MCP servers could the agent call?
Which actions required human approval?
What evidence did the run leave behind?
Who owned the final decision?
How would the change be rolled back?

That list feels half engineering and half security because production agent work is both. A prompt that can call tools is not only a prompt. It is a small delegation of authority.

Do not make engineering and security maintain separate truths

Separate playbooks create bad habits.

Engineering may define the agent run like this:

Task: fix the invoice export bug.
Scope: billing/export/** and matching tests.
Checks: targeted unit tests and lint.
Deliverable: patch, summary, review packet, rollback note.

Security may define the same run like this:

Requester: named engineer.
Agent identity: local approved tool profile.
Allowed data: synthetic examples and repo content only.
Blocked: production records, secrets, migrations, deploy config, write-capable MCP tools.
Evidence: files touched, commands run, tools called, approvals requested, residual risk.

Both views are useful. The problem starts when they live in different documents, use different language, and get reviewed by different people after the rollout is already under pressure.

Put them in one operating model instead:

Task contract
Permission boundary
Tool and MCP scope
Evidence capture
Patch and tests
Security exceptions
Review packet
Rollback note
Human approval

That is the loop a team can actually run. It gives engineers enough room to use the agent well, while giving security and risk enough evidence to trust or reject the result.
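One way to keep the two views from drifting apart is to store them in a single record. The sketch below is only an illustration: the RunContract class, its field names, and the tests glob are assumptions made for this example, not a schema from either book. It folds the engineering and security definitions of the invoice export run into one object and flags any touched file that falls outside the agreed scope.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase


@dataclass(frozen=True)
class RunContract:
    """One record for both views. Hypothetical schema for illustration."""
    # Engineering view
    task: str
    scope_globs: tuple[str, ...]
    checks: tuple[str, ...]
    deliverables: tuple[str, ...]
    # Security view
    requester: str
    agent_identity: str
    allowed_data: tuple[str, ...]
    blocked: tuple[str, ...]

    def out_of_scope(self, touched_files: list[str]) -> list[str]:
        """Return the touched files that match none of the scope globs."""
        return [
            path for path in touched_files
            if not any(fnmatchcase(path, glob) for glob in self.scope_globs)
        ]


contract = RunContract(
    task="Fix the invoice export bug",
    # The tests glob is an assumed location for "matching tests".
    scope_globs=("billing/export/**", "tests/billing/export/**"),
    checks=("targeted unit tests", "lint"),
    deliverables=("patch", "summary", "review packet", "rollback note"),
    requester="named engineer",
    agent_identity="local approved tool profile",
    allowed_data=("synthetic examples", "repo content"),
    blocked=("production records", "secrets", "migrations",
             "deploy config", "write-capable MCP tools"),
)

# File names here are made up for the example.
touched = ["billing/export/csv_writer.py", "deploy/config.yaml"]
print(contract.out_of_scope(touched))  # ['deploy/config.yaml']
```

The point of the single record is that a reviewer from either side reads the same fields. An out-of-scope file becomes a visible exception instead of a surprise in someone else's document.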

The bundle view is practical, not ceremonial

The word “governance” can make this sound heavier than it is. I do not think a team needs a giant committee before an agent writes a unit test.

I do think the team needs a clear scale of seriousness.

For a docs typo, keep it lightweight. For a small test helper, ask for a short summary and targeted checks. For anything touching auth, billing, customer data, concurrency, permissions, migrations, deployment, internal tools, regulated records, or MCP servers with real business access, require a full review packet and a named approver.

The operating model should make that escalation boring:

Low risk: small local change, no sensitive tools, normal review.
Medium risk: scoped repo change, targeted checks, review packet.
High risk: data, auth, payments, MCP, or production-like scripts; named approval and rollback.
Stop: secrets, live customer data, schema change, deploy path, or boundary expansion without approval.
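The scale above can be written down as a small lookup so the escalation does not depend on anyone's mood that day. This is a minimal sketch under assumptions: the tag names are invented for the example, and it reads "without approval" as applying to every stop condition, which is one interpretation of the list rather than a rule stated in it.

```python
from enum import Enum


class Tier(Enum):
    LOW = "low"
    MEDIUM = "medium"
    HIGH = "high"
    STOP = "stop"


# Tag names are hypothetical labels a team might attach to a proposed run.
STOP_TAGS = frozenset({"secrets", "live_customer_data", "schema_change",
                       "deploy_path", "boundary_expansion"})
HIGH_TAGS = frozenset({"data", "auth", "payments", "mcp",
                       "production_like_script"})
MEDIUM_TAGS = frozenset({"scoped_repo_change"})

# What each tier asks for, paraphrased from the scale above.
REQUIRED = {
    Tier.LOW: ("normal review",),
    Tier.MEDIUM: ("targeted checks", "review packet"),
    Tier.HIGH: ("named approval", "rollback note"),
    Tier.STOP: ("halt until approved",),
}


def classify(tags: set[str], approved: bool = False) -> Tier:
    """Return the highest tier triggered by the run's tags."""
    if tags & STOP_TAGS and not approved:
        return Tier.STOP
    # Assumption: an approved stop condition is still handled as high risk.
    if tags & (HIGH_TAGS | STOP_TAGS):
        return Tier.HIGH
    if tags & MEDIUM_TAGS:
        return Tier.MEDIUM
    return Tier.LOW


print(classify({"docs_typo"}))                    # Tier.LOW
print(classify({"scoped_repo_change", "auth"}))   # Tier.HIGH
print(classify({"schema_change"}))                # Tier.STOP
```

The highest matching tier wins, so a scoped repo change that also touches auth is treated as high risk, not medium.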

This is where the two books fit together without turning into a sales poster.

The Claude Code book is about making the run work: better task contracts, permission budgets, tool receipts, evals, observability, cost controls, review packets, and rollback habits. The security book is about deciding what the run is allowed to do: identity, delegated authority, RAG and MCP boundaries, policy gates, audit evidence, and incident response.

If your rollout is serious, you need the connection between those two.

MCP forces the issue

MCP is often the moment the operating model stops being optional.

A coding agent with repo access can still cause trouble, but the blast radius is usually visible in the diff. An agent with MCP access may read ticket context, query observability, inspect internal documentation, touch workflow systems, or use tools that feel harmless because their names sound read-only.

For each MCP server, ask a blunt question:

What business capability does this give the agent?

Do not settle for connector names. “Read Jira” might mean customer escalation detail. “Query logs” might mean production-like telemetry. “Open a pull request” might mean a path into CI and release flow. “Search docs” might mean access to architecture decisions, incident notes, or regulated process detail.

Once you phrase tools as business capabilities, the operating model gets clearer. Some tools can be default read scope. Some need approval. Some need synthetic data only. Some should be blocked until the team has logging, owner review, and a kill switch.
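A registry that records the business capability next to the connector name makes that sorting explicit. The sketch below is an assumption-heavy illustration: the McpTool class, the access assignments, and the rule that anything beyond default read stays blocked until logging, owner review, and a kill switch exist are choices made for this example, not a policy from the books or from the MCP specification.

```python
from dataclasses import dataclass
from enum import Enum


class Access(Enum):
    DEFAULT_READ = "default read scope"
    NEEDS_APPROVAL = "needs approval"
    SYNTHETIC_ONLY = "synthetic data only"
    BLOCKED = "blocked"


@dataclass(frozen=True)
class McpTool:
    """Hypothetical registry entry: connector name plus what it really grants."""
    connector_name: str
    business_capability: str
    intended_access: Access
    has_logging: bool = False
    has_owner_review: bool = False
    has_kill_switch: bool = False

    def effective_access(self) -> Access:
        controls_ready = (self.has_logging and self.has_owner_review
                          and self.has_kill_switch)
        # Assumed rule: anything beyond default read waits for all three controls.
        if self.intended_access is not Access.DEFAULT_READ and not controls_ready:
            return Access.BLOCKED
        return self.intended_access


# Access levels below are illustrative choices, not recommendations.
registry = [
    McpTool("Search docs", "architecture decisions and incident notes",
            Access.DEFAULT_READ),
    McpTool("Read Jira", "customer escalation detail",
            Access.NEEDS_APPROVAL, has_logging=True, has_owner_review=True,
            has_kill_switch=True),
    McpTool("Query logs", "production-like telemetry",
            Access.SYNTHETIC_ONLY, has_logging=True),
    McpTool("Open a pull request", "a path into CI and release flow",
            Access.NEEDS_APPROVAL),
]

for tool in registry:
    print(f"{tool.connector_name}: {tool.business_capability} "
          f"-> {tool.effective_access().value}")
```

In this example, "Query logs" and "Open a pull request" both come out blocked because their controls are incomplete. That is how the registry shows a tool earning wider access over time.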

That is not anti-agent. It is how the agent earns wider access.

What a team should buy, depending on the pain

If the pain is engineering delivery, start with the Claude Code book. It is the right fit when the team is asking questions like:

How do we write task contracts that keep the agent inside scope?
How do we ask for review packets that reviewers will actually use?
How do we design evals from scary runs?
How do we make rollback part of the prompt?
How do we control cost, context, tools, and approval?

Start with the Claude Code book. The Kindle edition is the primary public purchase path through the book page, with other formats linked where useful.

If the pain is security, risk, or enterprise rollout, start with Securing Enterprise AI Agents. It is the right fit when the team is asking:

Which authority did we delegate to the agent?
How should identity and permissions work?
Where do MCP and RAG access need boundaries?
What evidence belongs in the audit trail?
Which actions require approval?
How do we respond when an agent crosses scope?

Start with Securing Enterprise AI Agents.

If the same rollout has both problems, get the bundle. That is the most honest answer for platform teams, enterprise architects, security leaders, and engineering managers who have to make agents useful and defensible at the same time.

Get the Enterprise AI Agents in Production bundle.

The operating model is the product decision

A team does not adopt production agents by choosing a model and hoping the process catches up later. The process is part of the product decision.

Before the rollout expands, decide the loop:

Prompt with a task contract.
Bind the tools and permissions.
Capture evidence during the run.
Review the patch and the route to the patch.
Name the remaining risk.
Write the rollback note.
Require human approval where the boundary says so.
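The steps above can be checked mechanically before a run is accepted. What follows is a rough sketch, not a prescribed implementation: the RunEvidence fields and the missing_steps helper are hypothetical names, and a real team would likely pull this evidence from its own logs and review tooling.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RunEvidence:
    """Hypothetical record of what one agent run left behind."""
    task_contract: str = ""
    allowed_tools: frozenset = frozenset()
    tools_called: tuple = ()
    patch_reviewed: bool = False
    residual_risk: str = ""
    rollback_note: str = ""
    approval_required: bool = False
    approver: Optional[str] = None


def missing_steps(run: RunEvidence) -> list[str]:
    """List the loop steps that have no evidence yet, in loop order."""
    gaps = []
    if not run.task_contract:
        gaps.append("task contract")
    unexpected = sorted(set(run.tools_called) - run.allowed_tools)
    if unexpected:
        gaps.append("tools outside the bound set: " + ", ".join(unexpected))
    if not run.patch_reviewed:
        gaps.append("patch review")
    if not run.residual_risk:
        gaps.append("remaining risk")
    if not run.rollback_note:
        gaps.append("rollback note")
    if run.approval_required and not run.approver:
        gaps.append("human approval")
    return gaps


run = RunEvidence(
    task_contract="Fix the invoice export bug",
    allowed_tools=frozenset({"run_tests", "lint"}),  # made-up tool names
    tools_called=("run_tests", "query_logs"),
    patch_reviewed=True,
    residual_risk="export format untested on large invoices",
    approval_required=True,
)
print(missing_steps(run))
# ['tools outside the bound set: query_logs', 'rollback note', 'human approval']
```

An empty list means every step in the loop left evidence. Anything else is the review conversation the team still needs to have.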

That loop is small enough to use and strict enough to matter. It keeps Claude Code from becoming unchecked speed. It keeps security from becoming paperwork after the fact. It gives leadership a cleaner question than “are we using agents yet?”

The better question is: “Can we show what the agent was allowed to do, what it actually did, and why a human trusted the result?”

If the answer is yes, the rollout has a chance. If the answer is no, buy the operating model before you buy more automation.