Claude Code review is too late if permissions are wrong

The most dangerous Claude Code setup does not always look reckless.

Sometimes it looks responsible. There is a detailed instruction file. The team has a few useful MCP servers. The whole repo is in scope. The agent asks before the scarier commands. Someone says, “It is fine, we review everything.”

That sentence sounds sensible until you notice the timing problem.

Review happens after the agent has already acted inside the boundary you gave it. If the boundary was too wide, the reviewer is no longer checking a neat patch. They are investigating a larger system action after the fact.

That is why I treat Claude Code permissions as part of the product design, not setup plumbing.

[Figure: the Claude Code permission boundary loop]

Human review cannot repair every boundary mistake

Review is still needed. I am not arguing for hands-off agents near production code.

The problem is pretending review can compensate for a vague operating envelope.

If Claude Code was allowed to edit the whole repo, call broad tools, inspect unrelated files, touch configuration, and retry until something went green, the reviewer inherits all of that ambiguity. They have to work out what the agent meant, what it ignored, whether the task widened, and whether some unrelated change is now hiding in the diff.

That is a bad trade. The agent saved time during the run, then pushed the cost into review.

I have seen the same shape in financial-services engineering without agents: weak change boundaries create expensive reviews. Coding agents make the loop faster, but they do not make the control problem disappear.

A better rule is blunt:

If a reviewer cannot understand the boundary, the run was too broad.

Permissions decide the work before the prompt does

A prompt can ask for discipline. Permissions enforce some of it.

Before a run starts, the team should know:

  • which files and directories are in scope
  • which areas are read-only context
  • which commands are allowed without approval
  • which commands always require a human
  • which MCP servers are available for this task
  • whether the agent can touch auth, billing, infra, migrations, secrets, or production data
  • what evidence the agent must leave before the patch is reviewed
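
The list above can be sketched as a pre-run boundary check. This is a hypothetical shape, not a real Claude Code or MCP configuration schema; every field name here is an illustration of the questions a team should answer before the run starts.

```python
from dataclasses import dataclass


@dataclass
class RunBoundary:
    """Illustrative pre-run boundary; field names are assumptions."""
    write_paths: list[str]        # files and directories in scope for edits
    read_only_paths: list[str]    # read-only context, never edited
    allowed_commands: list[str]   # commands allowed without approval
    approval_commands: list[str]  # commands that always require a human
    mcp_servers: list[str]        # MCP servers available for this task
    forbidden_areas: list[str]    # auth, billing, infra, migrations, ...
    required_evidence: list[str]  # what the run must leave for review


def violations(boundary: RunBoundary, touched: list[str]) -> list[str]:
    """Return any touched paths that fall outside the declared write scope."""
    return [
        path for path in touched
        if not any(path.startswith(scope) for scope in boundary.write_paths)
    ]
```

Checking touched paths against the declared scope after the run is what turns the boundary from a wish into something a reviewer can verify.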

This is not about locking Claude Code down forever. That would just teach people to work around the process.

The point is to make autonomy conditional. Start narrow. Watch what the run produces. Widen only when the evidence is boring in the best possible way: small diff, named commands, useful tests, clear risk note, clean rollback path.

If the evidence is messy, shrink the boundary.

MCP turns permissions into capability design

MCP is where this gets serious.

A coding agent with file access can make a bad patch. A coding agent with broad tool access can make a bad patch and change the world around the patch.

That does not make MCP bad. It makes MCP a capability boundary.

I would avoid a generic “MCP allowed” setting for production work. Name the capability instead:

| Capability | Safer starting point | Needs stronger approval |
| --- | --- | --- |
| issue tracker | read the ticket and comments | create, close, or reassign tickets |
| docs/search | read approved docs | publish docs or edit runbooks |
| CI/CD | read build status | trigger deploys or change pipelines |
| database | query sanitized dev data | touch production data |
| cloud tools | inspect non-prod config | mutate infrastructure |
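
One way to make that table operational is to encode the safer starting points as data and treat everything else as an escalation. A minimal sketch, assuming illustrative capability and action names rather than any real MCP tool identifiers:

```python
# Safer starting points per capability; names are illustrative.
SAFER_DEFAULTS = {
    "issue_tracker": {"read_ticket", "read_comments"},
    "docs_search": {"read_approved_docs"},
    "ci_cd": {"read_build_status"},
    "database": {"query_sanitized_dev_data"},
    "cloud_tools": {"inspect_nonprod_config"},
}


def needs_approval(capability: str, action: str) -> bool:
    """Anything beyond the safer starting point escalates to a human.

    Unknown capabilities default to requiring approval.
    """
    return action not in SAFER_DEFAULTS.get(capability, set())
```

The useful property is the default: an action or capability not on the list requires approval, which matches "MCP tools off unless the task names them" rather than "allowed unless someone objected".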

The question is not “Can Claude Code use this tool?”

The better question is: “Should this task have this capability today?”

That one word, today, matters. A migration task, a docs cleanup, a test repair, and an incident follow-up should not all get the same tool budget.

Use a permission ladder

I like permission ladders because they make access feel earned instead of granted by default.

A simple version looks like this:

Level 0: read-only exploration
Level 1: patch in one narrow area
Level 2: patch plus tests in a known subsystem
Level 3: cross-cutting change with explicit human checkpoints
Level 4: autonomous batch work only after strong evals and rollback discipline
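
The ladder, and the move-up/move-down rule that goes with it, can be written down so the team does not relitigate it per run. This is a sketch under assumed names; a real policy would also track how many consecutive clean runs justify a promotion.

```python
from enum import IntEnum


class Level(IntEnum):
    """The five rungs of the permission ladder."""
    READ_ONLY = 0         # read-only exploration
    NARROW_PATCH = 1      # patch in one narrow area
    PATCH_PLUS_TESTS = 2  # patch plus tests in a known subsystem
    CROSS_CUTTING = 3     # cross-cutting change with human checkpoints
    AUTONOMOUS_BATCH = 4  # batch work after strong evals and rollback


def next_level(current: Level, run_was_clean: bool) -> Level:
    """Move up one rung on boring evidence, down one rung on a messy run."""
    if run_was_clean:
        return Level(min(current + 1, Level.AUTONOMOUS_BATCH))
    return Level(max(current - 1, Level.READ_ONLY))
```

One rung at a time in either direction is the point: a single exciting demo cannot jump the agent from read-only to autonomous batch work.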

Most teams do not need to start at level three. They start there because the demo was exciting, the tool feels productive, and everyone wants to see what happens.

That is a poor reason to widen blast radius.

Move up the ladder when repeated runs leave evidence you would trust during an incident review. Move down when the run drifts, skips obvious checks, changes surprise files, or hides risk behind a confident summary.

The ladder is not bureaucracy. It is a memory aid for the team when the agent feels almost smart enough to trust.

Almost is where a lot of production trouble lives.

The run record is what changes the next boundary

Every useful Claude Code workflow should leave a run record.

For permission decisions, the record should include:

  1. the original task boundary
  2. files touched and why
  3. commands run
  4. checks passed
  5. checks skipped, with reasons
  6. MCP tools called
  7. assumptions and risks
  8. rollback notes
  9. anything the agent wanted to do but was not allowed to do
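
The nine items above map directly onto a record shape, and keeping them structured rather than as prose makes the underrated last item queryable. A hypothetical sketch; the field names mirror the list, not any real Claude Code output format:

```python
from dataclasses import dataclass, field


@dataclass
class RunRecord:
    """Illustrative run record; one per agent run."""
    task_boundary: str
    files_touched: dict[str, str]    # path -> why it was touched
    commands_run: list[str]
    checks_passed: list[str]
    checks_skipped: dict[str, str]   # check -> reason it was skipped
    mcp_tools_called: list[str]
    assumptions_and_risks: list[str]
    rollback_notes: str
    denied_requests: list[str] = field(default_factory=list)  # item 9


def boundary_pressure(records: list[RunRecord]) -> dict[str, int]:
    """Count how often each out-of-boundary request recurs across runs."""
    counts: dict[str, int] = {}
    for record in records:
        for request in record.denied_requests:
            counts[request] = counts.get(request, 0) + 1
    return counts
```

If `boundary_pressure` shows the agent asking for the same tool on every run, that is data for a deliberate widening decision instead of vibes.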

That last item is underrated. If the agent repeatedly asks for a tool or file outside the current boundary, the team can decide whether the boundary is too tight or the task is too broad. Either answer is useful.

Without that record, permission decisions become vibes. Someone remembers that the last run “went fine” and the next run gets more room.

That is how teams accidentally promote a demo habit into an operating model.

For a compact version of the checks I would run before giving Claude Code more room, use the Claude Code production checklist.

A practical policy for team adoption

Here is the policy I would start with for a team using Claude Code on real repositories:

Default: narrow write scope.
Default: MCP tools off unless the task names them.
Default: auth, billing, infra, secrets, migrations, and production data require human approval.
Default: no widening access mid-run without recording why.
Default: no merge without a run record and rollback note.
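
Those five defaults fit in a small policy structure, which makes the merge gate mechanical. A sketch with illustrative keys, not a real tool configuration:

```python
# The starting policy defaults, as checkable data; names are assumptions.
DEFAULT_POLICY = {
    "write_scope": "narrow",
    "mcp_tools": "off_unless_named_by_task",
    "human_approval_required": {
        "auth", "billing", "infra", "secrets", "migrations", "production_data",
    },
    "mid_run_widening": "record_reason_or_deny",
    "merge_requires": {"run_record", "rollback_note"},
}


def can_merge(artifacts: set[str]) -> bool:
    """A patch merges only when every required evidence artifact is present."""
    return DEFAULT_POLICY["merge_requires"] <= artifacts
```

A subset check is deliberately dumb: the agent either left a run record and a rollback note or it did not, and nobody argues about it in review.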

That policy will annoy people a little. Good. The annoyance is small compared with the cost of a broad agent run that nobody can explain two days later.

You can relax the defaults when the workflow earns it. Not because the model improved. Not because a demo went well. Because your own run records show the agent can operate inside a boundary and leave enough evidence for a human to make a good decision.

This is the part of agentic coding that is less fun to tweet about. It is also the part that will separate useful team adoption from expensive theatre.

The real sales pitch is fewer surprises

Claude Code is powerful because it can hold context, edit code, run commands, and work through tasks that used to take a lot of developer attention.

That power is exactly why permissions matter.

A good team workflow should make the boring parts visible: the boundary, the tools, the checks, the risks, and the rollback path. When those are visible, reviewers can spend less time reconstructing the run and more time judging the change.

That is the promise I care about. Not magic autonomy. Fewer surprises.

If your team is trying to move Claude Code from impressive runs into production engineering, I wrote the book for that gap: Claude Code: From Vibe Coding to Production. It covers permission ladders, MCP blast radius, review packets, evals, observability, rollback-first prompts, and safe team adoption.

Build safer Claude Code workflows.
Read the book: Claude Code: From Vibe Coding to Production on Amazon Kindle.
Want the operating checklist first? Start with the free Claude Code production checklist.