If the agent opens a PR, keep a control record

A pull request from Claude Code can look reassuringly normal.

There is a branch. There is a diff. The tests pass. The summary says what changed. A busy reviewer sees familiar ceremony and may treat the work like any other pull request.

That is the trap. An agent-generated PR is both a code review object and a delegation record. The useful question is bigger than “does this diff look right?” The reviewer also needs to ask, “what did we allow the agent to do on the way here?”

That is where Thomas De Vos’s two books meet. Claude Code: Building Production Agents That Actually Scale is about making Claude Code useful inside real engineering work: scoped tasks, tool discipline, review packets, evals, observability, cost control, and rollback. The book page links to the Kindle edition as the main purchase path, with other formats linked for readers who prefer them. Securing Enterprise AI Agents is about the authority around agents: identity, permissions, MCP and RAG boundaries, policy gates, audit evidence, and incident response. The Enterprise AI Agents in Production bundle is for teams that need both sides in one operating model.

The diff is the last page of the story

A human reviewer can usually reconstruct part of a normal pull request. They can see the files touched, infer intent from the commits, ask the author why a choice was made, and push back if the branch crosses scope.

With an agent, that conversation gets weaker unless the team designs for it.

Claude Code might have inspected files that never changed. It might have followed an error into another module, read logs, queried documentation, called MCP tools, generated several failed patches, or skipped an important test because the prompt gave it too much freedom. None of that is visible in the final diff unless the workflow captures it.

A clean PR can still hide a messy route.

The control record does not need to be heavy. It needs to answer the questions a reviewer and a security owner will ask once the agent touches something serious:

Who delegated the task?
What was the task contract?
Which files and data sources were in scope?
Which tools and MCP servers could the agent use?
Which commands did it run?
What did it try that failed?
What evidence supports the patch?
What risk remains?
Who owns the decision to merge?
How do we roll it back?

That list is not bureaucracy. It is the minimum viable receipt for delegated engineering work.

Start the record before the agent runs

The worst time to create the record is after the PR exists. By then the agent has already spent the authority.

Start with a task contract:

Task: fix the flaky invoice export retry test.
Allowed reads: invoice export package, retry helper, related test files, pasted CI failure.
Allowed edits: retry helper and invoice export tests only.
Allowed commands: targeted package tests and lint for changed files.
Blocked: production data, migrations, deploy config, customer examples, write-capable MCP tools.
Evidence required: failing test, passing test output, files changed, assumptions, rollback note.
Stop if: the fix needs schema changes, payment behavior changes, or another service boundary.

That prompt is useful for engineering because it keeps Claude Code focused. It is useful for security because it states the delegated authority before work begins.

The difference matters. A control written after the fact often becomes a story. A control written before the run becomes a boundary.
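One way to make that boundary checkable is to hold the contract as data and compare it with the files the PR actually changed. The sketch below is a minimal illustration, not a feature of Claude Code: the `TaskContract` class, the `out_of_scope` helper, and the file paths are all hypothetical, and it assumes Python 3.9 or later.

```python
from dataclasses import dataclass
from fnmatch import fnmatch


@dataclass(frozen=True)
class TaskContract:
    """Delegated authority for one agent run, written before the run starts."""
    task: str
    allowed_edits: tuple[str, ...]          # glob patterns the agent may modify
    stop_conditions: tuple[str, ...] = ()   # reasons to halt and ask a human


def out_of_scope(contract: TaskContract, changed_files: list[str]) -> list[str]:
    """Return changed files that match none of the allowed edit patterns."""
    return [
        path for path in changed_files
        if not any(fnmatch(path, pattern) for pattern in contract.allowed_edits)
    ]


# Hypothetical paths; substitute the real package layout.
contract = TaskContract(
    task="Fix the flaky invoice export retry test",
    allowed_edits=("billing/retry_helper.py", "billing/invoice_export/tests/*"),
    stop_conditions=(
        "schema change",
        "payment behavior change",
        "another service boundary",
    ),
)

changed = [
    "billing/retry_helper.py",
    "billing/invoice_export/tests/test_retry.py",
    "deploy/config.yaml",
]
violations = out_of_scope(contract, changed)
print(violations)
```

Here the deploy config is the only path reported, because it matches neither allowed edit pattern. A check like this only covers edits; reads, commands, and tool calls still need their own evidence in the record.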

MCP changes the review question

A repo-only agent run can still go wrong, but the reviewer usually sees most of the damage in the patch. MCP changes that.

If the agent can read tickets, logs, traces, internal docs, cloud resources, workflow tools, or policy systems, the PR is no longer the only artifact worth reviewing. The agent may have touched context that never appears in Git.

This is why connector names are not enough. “Read Jira” can include customer escalation detail. “Query logs” can include production-like telemetry. “Search docs” can expose incident notes or architecture decisions. “Open PR” can move the agent into CI and release flow.

For every tool in the run, record the business capability, not only the technical name:

Tool: jira.search
Capability: read bug reports and customer escalation context
Allowed for this run: no, unless the human pastes a redacted ticket excerpt

Tool: logs.query
Capability: inspect production-like telemetry
Allowed for this run: no

Tool: repo.read
Capability: inspect scoped source files
Allowed for this run: yes, invoice export package only

That level of detail keeps the team honest. It also makes approval easier later because the reviewer can see which doors stayed closed.
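The same three entries can be kept as structured data so the run record and the reviewer read from one source. This is a rough sketch under assumptions: `ToolGrant`, `RUN_GRANTS`, `is_allowed`, and `closed_doors` are hypothetical names, and the default-deny rule for unlisted tools is a design choice made here, not something MCP enforces on its own.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToolGrant:
    capability: str      # what the tool lets the agent do, in business terms
    allowed: bool        # decision for this run
    note: str = ""       # condition or scope attached to the decision


# Mirrors the three example entries above.
RUN_GRANTS = {
    "jira.search": ToolGrant(
        "read bug reports and customer escalation context",
        False,
        "human may paste a redacted ticket excerpt instead",
    ),
    "logs.query": ToolGrant("inspect production-like telemetry", False),
    "repo.read": ToolGrant(
        "inspect scoped source files", True, "invoice export package only"
    ),
}


def is_allowed(tool: str, grants: dict[str, ToolGrant] = RUN_GRANTS) -> bool:
    """Default deny: a tool absent from the record is treated as blocked."""
    grant = grants.get(tool)
    return grant is not None and grant.allowed


def closed_doors(grants: dict[str, ToolGrant] = RUN_GRANTS) -> list[str]:
    """List the capabilities that stayed closed, for the reviewer."""
    return [
        f"{name}: {grant.capability}"
        for name, grant in grants.items()
        if not grant.allowed
    ]


for line in closed_doors():
    print(line)
```

Listing the closed doors by capability rather than by tool name is the point: the reviewer reads "customer escalation context" and does not have to know what `jira.search` can return.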

The review packet should travel with the PR

The review packet is the part developers will actually use if it stays short.

A useful default looks like this:

Task
Scope agreed before run
Files inspected
Files changed
Commands run
Tests passed and failed
Tools or MCP servers used
Assumptions
Security-sensitive areas touched
Residual risk
Rollback note
Human approval needed

Do not bury this in a separate governance system that reviewers never open. Put the short version in the PR description. Link the longer run record if the work is high risk.

For low-risk work, a few lines are enough. For anything touching auth, payments, billing, customer data, permissions, migrations, deployment, regulated records, or write-capable MCP tools, require a fuller record and a named owner.
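A team could encode that split as a small pre-review check. The sketch below is deliberately crude and rests on assumptions: `SENSITIVE_KEYWORDS` and `record_tier` are hypothetical, and matching keywords in file paths is only a heuristic. It cannot detect customer data or regulated records, which still need a human judgment.

```python
# Hypothetical path keywords; tune these to the repository's real layout.
SENSITIVE_KEYWORDS = (
    "auth", "payment", "billing", "permission", "migration", "deploy",
)


def record_tier(changed_files: list[str], write_tools_used: list[str]) -> str:
    """Return "full" when the change looks high risk, otherwise "short"."""
    # Any write-capable tool call is enough to require the fuller record.
    if write_tools_used:
        return "full"
    for path in changed_files:
        lowered = path.lower()
        if any(keyword in lowered for keyword in SENSITIVE_KEYWORDS):
            return "full"
    return "short"


print(record_tier(["docs/readme.md"], []))
print(record_tier(["services/auth/token.py"], []))
print(record_tier(["docs/readme.md"], ["tickets.create"]))
```

Substring matching will produce false positives (a path containing "author" matches "auth"). For this purpose, erring toward the fuller record is the cheaper mistake.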

This is also where the Claude Code book and the security book support different readers in the same meeting. Engineering wants a review packet that helps reviewers decide whether the patch is good. Security wants evidence that the agent stayed inside its authority. A single record can serve both if the team agrees the shape up front.

The approval should name the risk being accepted

A normal merge approval often means “the code looks good enough.” For an agent-generated PR, approval should also mean “the route to this code was acceptable.”

That does not require drama. It can be one plain line:

Approved to merge. The agent stayed inside the scoped package, used no external MCP tools, ran the targeted retry tests, and the rollback is to revert this PR.

Or, for a higher-risk change:

Approved by billing service owner. The agent touched retry behavior but not payment calculation, migrations, customer data, or deploy config. Full regression still required before release.

That line gives the team something better than vibes. It names the boundary, the evidence, and the remaining owner decision.

What to buy if this problem feels familiar

If your team is mostly struggling with the engineering side, start with the Claude Code book. It is the better fit when the pain sounds like this:

Our prompts are too vague.
The agent over-edits.
Reviewers do not trust the summary.
We do not have a repeatable rollback habit.
We need better task contracts, evals, hooks, and review packets.

Start with the Claude Code book. Its book page links to the Amazon Kindle edition as the main purchase path.

If your team is mostly struggling with security, risk, or enterprise rollout, start with Securing Enterprise AI Agents. It is the better fit when the questions are about delegated authority, MCP boundaries, RAG governance, audit evidence, policy gates, incident handling, and who is allowed to approve what.

Read Securing Enterprise AI Agents.

If you are building a serious rollout, the bundle is the cleanest answer. A production agent program needs the delivery loop and the authority model together. Otherwise the team gets fast patches that security cannot defend, or careful controls that engineers work around.

Get the Enterprise AI Agents in Production bundle.

A control record template you can copy

Use this for the next agent-generated PR:

Agent PR control record

Delegating human:
Task contract:
Allowed reads:
Allowed edits:
Allowed commands:
Allowed tools and MCP servers:
Blocked tools, data, and systems:
Files inspected:
Files changed:
Commands run:
Tests and checks:
Failed attempts or rejected paths:
Security-sensitive areas touched:
Assumptions:
Residual risk:
Rollback note:
Approval owner:
Merge decision:

The template is deliberately boring. Boring is good here. It gives the reviewer the route to the diff, gives security the evidence of delegated authority, and gives the team a way to improve the next run.
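Because the template is plain "Field: value" text, a few lines of code can flag fields left blank before a reviewer is asked to approve. This is a minimal sketch with hypothetical names (`REQUIRED_FIELDS`, `missing_fields`); it assumes each value sits on the same line as its field name and does not handle multi-line values.

```python
REQUIRED_FIELDS = [
    "Delegating human",
    "Task contract",
    "Allowed reads",
    "Allowed edits",
    "Allowed commands",
    "Allowed tools and MCP servers",
    "Blocked tools, data, and systems",
    "Files inspected",
    "Files changed",
    "Commands run",
    "Tests and checks",
    "Failed attempts or rejected paths",
    "Security-sensitive areas touched",
    "Assumptions",
    "Residual risk",
    "Rollback note",
    "Approval owner",
    "Merge decision",
]


def missing_fields(record_text: str) -> list[str]:
    """Return template fields that are absent or have an empty value."""
    filled = {}
    for line in record_text.splitlines():
        # Split on the first colon only, so values may contain colons.
        name, sep, value = line.partition(":")
        if sep:
            filled[name.strip()] = value.strip()
    return [name for name in REQUIRED_FIELDS if not filled.get(name)]


# Illustrative draft record; the names and values are made up.
draft = """Delegating human: A. Reviewer
Task contract: fix the flaky invoice export retry test
Rollback note: revert this PR
Approval owner:
"""

for name in missing_fields(draft):
    print("missing:", name)
```

A check like this only proves the fields were filled in, not that they are true. It is a prompt for the reviewer, not a substitute for reading the record.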

If the agent can open a PR, it needs more than a good summary. It needs a record of what it was allowed to do, what it actually did, and who accepted the result.