What security should ask before a Claude Code rollout
Claude Code often enters a company through the engineering side door.
A developer tries it on a branch. A staff engineer uses it to clean up tests. A platform team starts wiring it into repo workflows. The results can be genuinely useful. Then one day the question changes from “does this help me code faster?” to “what exactly are we allowing this agent to do inside the business?”
That is the point where security needs to be in the room. Not to slow everything down. To make the rollout survivable.
If Claude Code can read files, call MCP tools, run shell commands, change code, inspect logs, or influence CI, the rollout is no longer a tooling preference. It is delegated authority. The team needs to know who granted that authority, where it stops, what evidence it leaves behind, and how to recover when the agent is confidently wrong.
Start with authority, not prompts
Most Claude Code conversations start with prompt quality. That makes sense for an individual developer, but it is too small for a team rollout.
Security should start with authority:
Who is allowed to delegate work to Claude Code?
Which account, token, or identity does the agent use?
Which tools can it call without human approval?
Which actions need a named reviewer?
Who owns the risk when the run crosses the original boundary?
Those questions are boring in the best way. They turn a vague agent workflow into something a team can inspect.
A prompt can say “fix the invoice export bug.” Authority decides whether the agent may read customer examples, edit billing code, run a migration, call an internal MCP server, or touch deploy configuration. Without that boundary, the prompt is doing work it cannot carry.
This is why I argue in the Claude Code book that the operating loop matters more than the model. A good model inside a vague permission model is still a risk.
Ask what the agent can touch
The simplest useful security review is a touch map. Before a team rollout, write down what Claude Code can read, edit, execute, call, and publish.
I would separate it like this:
Read: repos, tickets, logs, docs, source systems, RAG indexes
Write: code, tests, config, docs, tickets, generated files
Execute: test commands, shell commands, package managers, deploy scripts
Call: MCP servers, internal APIs, browser sessions, CI systems
Expose: summaries, diffs, logs, screenshots, external posts, artifacts
That list often reveals the real risk. The scary part is rarely “Claude Code writes code.” The scary part is that a coding task quietly becomes a permissions task, then a data task, then a deployment task.
A team can avoid a lot of pain by saying this before the run starts:
Allowed: read and edit application code under billing/export/** and tests under billing/export/tests/**.
Allowed commands: targeted unit tests and lint only.
Blocked: migrations, customer data, production logs, secrets, deploy config, and write-capable MCP tools.
Stop if: the fix requires a schema change, a wider service boundary, or access to production examples.
That is not theatre. It gives the agent a useful box. It also gives the reviewer a way to spot when the box moved.
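The same box can be written in a form that a wrapper script or hook could check. The Python sketch below is only an illustration, not a Claude Code feature. The glob patterns, the command prefixes, and the function names are assumptions invented for the invoice export example. A real rollout would enforce this in whatever permission layer the team already uses.

```python
import posixpath
import shlex
from fnmatch import fnmatchcase

# Hypothetical scope for the invoice export task. fnmatch's "*" also matches
# "/", so "billing/export/*" covers nested paths such as the tests directory.
ALLOWED_GLOBS = ("billing/export/*",)
BLOCKED_GLOBS = ("*/migrations/*", "deploy/*", "secrets/*", "*.env")

# Assumed test and lint commands; swap in whatever the repo really uses.
ALLOWED_COMMAND_PREFIXES = (
    ("pytest", "billing/export/tests"),
    ("ruff", "check"),
)


def path_decision(path: str) -> str:
    """Return "allow", "blocked", or "stop" for a repo-relative path."""
    clean = posixpath.normpath(path)  # collapse "a/../b" tricks before matching
    if any(fnmatchcase(clean, glob) for glob in BLOCKED_GLOBS):
        return "blocked"  # blocked areas win over allowed ones
    if any(fnmatchcase(clean, glob) for glob in ALLOWED_GLOBS):
        return "allow"
    return "stop"  # outside the box: pause and ask a human


def command_allowed(command: str) -> bool:
    """Allow only commands whose leading arguments match an approved prefix.

    Assumes the runner executes the argument list directly, not through a shell.
    """
    argv = tuple(shlex.split(command))
    return any(argv[: len(prefix)] == prefix for prefix in ALLOWED_COMMAND_PREFIXES)
```

With these assumed patterns, billing/export/filter.py is allowed, billing/export/migrations/0007_add_column.py is blocked, and payments/adapter.py returns "stop", which is the signal for the agent to pause rather than widen its own scope.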
MCP changes the security conversation
MCP is where many Claude Code rollouts become more powerful and more dangerous.
A local agent with repo access is one thing. An agent with MCP access to tickets, browsers, databases, internal docs, build systems, observability tools, or workflow APIs is a different class of system. It can gather context faster. It can also cross a boundary faster.
Security should ask a blunt question for every MCP server:
What business capability does this tool give the agent?
Do not stop at the technical description. “Read Jira” is not the risk. “Read customer escalation tickets that include account names and incident detail” is closer. “Open pull requests” is not the full risk. “Create code changes that another automation may merge or deploy” is closer.
For each MCP tool, I would want:
- an owner
- a purpose
- the data it can expose
- the actions it can take
- whether it is read-only or write-capable
- the approval rule for sensitive operations
- logs that connect tool calls back to the agent run
- a way to disable it quickly during an incident
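One way to keep that list honest is to hold it as data and let a small check flag the gaps. This is a minimal sketch under my own assumptions: the McpToolEntry fields and the review_gaps function are hypothetical names that mirror the list above, not part of MCP or Claude Code.

```python
from dataclasses import dataclass


@dataclass
class McpToolEntry:
    """One inventory row per MCP tool; the fields mirror the review list."""
    name: str
    owner: str
    purpose: str
    data_exposed: list[str]
    actions: list[str]
    write_capable: bool
    approval_rule: str       # empty string means no rule has been agreed
    logs_linked_to_run: bool
    disable_procedure: str   # how to switch the tool off during an incident


def review_gaps(entry: McpToolEntry) -> list[str]:
    """Return the reasons this tool is not ready for a team rollout."""
    gaps = []
    if not entry.owner:
        gaps.append("no owner")
    if not entry.purpose:
        gaps.append("no stated purpose")
    if entry.write_capable and not entry.approval_rule:
        gaps.append("write-capable tool has no approval rule")
    if not entry.logs_linked_to_run:
        gaps.append("tool calls cannot be tied to an agent run")
    if not entry.disable_procedure:
        gaps.append("no fast way to disable")
    return gaps
```

An empty result does not mean the tool is safe. It only means the basic questions have an answer that someone wrote down.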
This is where the two books connect. Claude Code: Building Production Agents That Actually Scale covers the engineering loop around MCP, permissions, hooks, evals, observability, and rollback. Securing Enterprise AI Agents widens that into delegated authority, AgentSecOps, RAG governance, identity, audit evidence, and policy gates.
If engineering and security both own the rollout, you probably need both halves.
Demand a run record, not a cheerful summary
A Claude Code run summary can sound reassuring while still being useless for review.
Updated the export filter, added tests, and verified the suite passes.
That may be true. It is not enough for production-adjacent work.
Security and engineering should agree on the minimum run record before the rollout becomes normal. For serious tasks, I want the record to include:
Original task request
Delegating user
Agent identity and permissions
Files inspected
Files changed
Tools and MCP servers used
Commands run
Sensitive actions denied
Approvals requested and granted
Tests run and test gaps
Assumptions
Residual risks
Rollback note
Reviewer decision
The point is not to drown people in logs. The point is to leave enough evidence that someone can reconstruct the run later. If an incident lead asks why a file changed, which tool supplied the context, or whether the agent touched a blocked area, the answer should not depend on one developer remembering Tuesday afternoon.
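If the record is structured, a reviewer or a CI step can refuse a run that arrives without its evidence. The sketch below is one possible shape, not a format Claude Code emits. The RunRecord field names and the choice of which fields may stay empty on a clean run are my assumptions.

```python
from dataclasses import dataclass, field, fields


@dataclass
class RunRecord:
    """Minimum evidence for one agent run; the fields follow the list above."""
    task_request: str = ""
    delegating_user: str = ""
    agent_identity: str = ""  # identity plus the permissions it ran with
    files_inspected: list[str] = field(default_factory=list)
    files_changed: list[str] = field(default_factory=list)
    tools_used: list[str] = field(default_factory=list)
    commands_run: list[str] = field(default_factory=list)
    denied_actions: list[str] = field(default_factory=list)
    approvals: list[str] = field(default_factory=list)
    tests_run: list[str] = field(default_factory=list)
    test_gaps: list[str] = field(default_factory=list)
    assumptions: list[str] = field(default_factory=list)
    residual_risks: list[str] = field(default_factory=list)
    rollback_note: str = ""
    reviewer_decision: str = ""


# Assumption: these can legitimately be empty when nothing of that kind happened.
MAY_BE_EMPTY = {
    "tools_used", "denied_actions", "approvals",
    "test_gaps", "assumptions", "residual_risks",
}


def missing_evidence(record: RunRecord) -> list[str]:
    """List required fields that are still empty, in declaration order."""
    return [
        f.name
        for f in fields(record)
        if f.name not in MAY_BE_EMPTY and not getattr(record, f.name)
    ]
```

A run that changed code and ran tests but has no rollback note or reviewer decision would come back with those two names, which is a more useful signal than a cheerful summary.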
I wrote about this from the Claude Code side in “Claude Code needs a flight recorder.” The security version is stricter: if the run changes code, uses business tools, or touches regulated context, the run record is audit evidence.
Make rollback part of permission design
Rollback is often treated as release management. For agentic coding, it should show up earlier.
Before Claude Code edits production-adjacent code, ask for the rollback note. If the agent cannot explain how the change would be undone or contained, it has not earned wider authority yet.
A useful rollback note says:
Change: tighten the export filter for EU invoice rows.
Undo path: revert the patch in billing/export/** and redeploy the worker.
Data risk: exports generated during the bad window may need replay.
Trigger: reconciliation mismatch, missing rows, or customer report.
Owner: billing on-call decides whether to revert or pause the job.
Stop before: schema changes, queue config, payment adapters, or production data.
That note does two things. It helps the reviewer understand operational risk. It also limits the agent. If the work now requires a schema change, the run should stop and ask for a new approval instead of quietly expanding the mission.
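The link between the note and the agent's authority can be made explicit. The following is a rough sketch, assuming the run can label the areas it is about to touch. RollbackNote and authority_check are hypothetical names, and the area labels are taken from the example note above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RollbackNote:
    change: str
    undo_path: str
    data_risk: str
    trigger: str
    owner: str
    stop_before: frozenset[str]  # areas that require a fresh approval


def authority_check(note: RollbackNote, touched_areas: set[str]) -> str:
    """Decide whether the run may continue under its current approval."""
    if not (note.undo_path and note.owner):
        return "reject: rollback note has no undo path or owner"
    crossed = sorted(note.stop_before & touched_areas)
    if crossed:
        return "stop: new approval needed for " + ", ".join(crossed)
    return "proceed"
```

The design choice is that the check fails closed: a missing undo path rejects the run outright, and any overlap with the stop list pauses it instead of letting the agent reinterpret the task.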
This is a pattern I covered in “Claude Code needs a rollback note before code.” Security should care because rollback and permission are connected. If the undo story changes, the authority story changes too.
The rollout questions I would use
If I were reviewing a Claude Code rollout with a security team, I would start with these questions:
- Who can delegate work to the agent?
- Which identity does the agent use for repo, tool, and MCP access?
- Which files, directories, and services are in scope by default?
- Which areas are blocked unless a human approves them?
- Which MCP tools are read-only, and which can take action?
- What data can the agent retrieve, summarize, store, or expose?
- Which commands can it run without approval?
- What happens when tests need credentials or production-like data?
- What evidence must the agent return before review?
- Where are run records stored, and how long are they retained?
- Who can disable a tool or permission quickly?
- What incident process covers agent-caused changes?
- What rollback note is required before high-risk edits?
- Which metrics show the rollout is working or getting risky?
- Who accepts residual risk when the agent needs more authority?
The answers do not need to be perfect on day one. They do need to exist. If a team cannot answer them, the rollout is still an experiment, even if the agents are already touching real work.
Buy the books if this is becoming real inside your team
If your immediate problem is Claude Code inside engineering, start with the Kindle edition of Claude Code: Building Production Agents That Actually Scale. It is the practical operating model for task contracts, scoped permissions, MCP, hooks, evals, observability, review packets, cost controls, and rollback.
If your problem is enterprise risk, security review, governance, audit evidence, RAG control, identity, and bounded autonomy, read Securing Enterprise AI Agents.
If the same rollout has to satisfy engineering delivery and security review, get the Enterprise AI Agents in Production bundle. Build the agent loop, secure the authority around it, and keep the evidence before the demo quietly becomes production.