I write about AI after the demo starts looking convincing: production engineering, secure AI agents, Claude Code, LLM observability, evals, governance, financial-services controls, and the uncomfortable gap between a model release and a system people can trust.
The common thread is controlled autonomy. What can the system see? What can it change? Who reviews it? What evidence remains when something breaks?
Not every useful AI article needs to point straight at a book. Some pieces are here because the topic matters: model releases, vendor churn, AI interfaces, adoption habits, organisational risk, and the way the industry keeps mistaking demos for direction.
I will keep a mix: practical production AI, security and governance, Claude Code field notes, and broader AI commentary when the news cycle exposes something worth saying.
Claude Code and other AI coding agents are already useful. The harder question is what happens when they meet real repositories, review habits, permissions, tests, and production risk.
The Claude Code review packet I want before approving agent work
A Claude Code diff is not enough evidence for production review. Ask for the objective, permission boundary, tool trace, tests, failures, cost, and rollback path before approving agent work.
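As a rough illustration of that checklist, a review packet could be modelled as a small record with a check for empty fields. This is only a sketch: the `ReviewPacket` class, its field names, and the `missing_evidence` helper are hypothetical names chosen for this example, not part of Claude Code or any real tool.

```python
from dataclasses import dataclass, field, fields
from typing import List, Optional


@dataclass
class ReviewPacket:
    """Hypothetical evidence bundle a reviewer asks for alongside the diff."""
    objective: str = ""
    permission_boundary: str = ""
    tool_trace: List[str] = field(default_factory=list)
    tests: List[str] = field(default_factory=list)
    failures: List[str] = field(default_factory=list)
    cost_usd: Optional[float] = None
    rollback_path: str = ""


def missing_evidence(packet: ReviewPacket) -> List[str]:
    """Return the names of required fields the agent left empty."""
    # A clean run may have no failures, so that field is allowed to be empty.
    optional = {"failures"}
    missing = []
    for f in fields(packet):
        if f.name in optional:
            continue
        value = getattr(packet, f.name)
        if value is None or value == "" or value == []:
            missing.append(f.name)
    return missing


packet = ReviewPacket(
    objective="Add retry to the billing client",
    tool_trace=["git diff", "pytest tests/billing"],
    tests=["tests/billing/test_retry.py"],
    cost_usd=0.40,
)
print(missing_evidence(packet))
```

In this example the packet would be sent back to the agent, because the permission boundary and the rollback path are still blank.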
If a Claude Code agent can change production-shaped code, the prompt should say how to undo the work. Rollback is not paperwork after the diff. It is part of the task boundary.
Claude Code Agent Cost Loops Start as Workflow Bugs
Claude Code cost problems usually start before the model call: vague tasks, wide-open tools, repeated repo exploration, and no stop rule. Treat spend as a workflow bug, not just a pricing problem.
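One way to picture a stop rule is a small budget tracker that the harness around the agent consults after every step. The sketch below is an assumption-heavy illustration: `StopRule`, `RunBudget`, and the three limits are made-up names and thresholds, and it assumes the harness can see per-step cost and which file was read.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class StopRule:
    """Hypothetical limits for one agent run."""
    max_steps: int
    max_cost_usd: float
    max_repeat_reads: int  # how often the same file may be re-read


class RunBudget:
    def __init__(self, rule: StopRule) -> None:
        self.rule = rule
        self.steps = 0
        self.cost_usd = 0.0
        self.reads: Dict[str, int] = {}

    def record(self, cost_usd: float, read_path: Optional[str] = None) -> Optional[str]:
        """Record one agent step; return a stop reason, or None to continue."""
        self.steps += 1
        self.cost_usd += cost_usd
        if read_path is not None:
            self.reads[read_path] = self.reads.get(read_path, 0) + 1
            # Re-reading the same file is treated as a sign of looping.
            if self.reads[read_path] > self.rule.max_repeat_reads:
                return f"repeated exploration of {read_path}"
        if self.steps >= self.rule.max_steps:
            return "step limit reached"
        if self.cost_usd >= self.rule.max_cost_usd:
            return "cost limit reached"
        return None


budget = RunBudget(StopRule(max_steps=10, max_cost_usd=1.0, max_repeat_reads=2))
for _ in range(3):
    reason = budget.record(0.1, read_path="src/app.py")
print(reason)
```

The third read of the same file trips the rule well before the cost limit does, which is the sense in which runaway spend shows up first as a workflow bug.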
Production Claude Code evals should not begin with abstract benchmarks. Start with the agent runs that scared you, reduce them into replayable cases, and use them to tune permissions, prompts, tools, and review gates.
Claude Code Permissions: The Production Mistake That Bites Later
Claude Code permission modes can look safer than they are. The real production risk lives in tool scope: paths, network access, secrets, deploy files, and what reviewers actually approve.
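To make "tool scope" concrete, here is a minimal sketch of a path check that a wrapper around an agent's write tool might apply. It is not Claude Code's actual permission format: the `is_write_allowed` function, the allowed prefixes, and the denied patterns are all hypothetical and would need to match a real repository's layout.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

# Hypothetical scope: the agent may write only under these prefixes...
ALLOWED_PREFIXES = ("src/", "tests/")
# ...and never to secrets or deploy files, wherever they sit in the tree.
DENIED_PATTERNS = (".env", "*.pem", "deploy/*", ".github/workflows/*")


def is_write_allowed(path: str) -> bool:
    """Return True only if a repo-relative path is inside the write scope."""
    p = PurePosixPath(path)
    # Reject absolute paths and any attempt to climb out of the repository.
    if p.is_absolute() or ".." in p.parts:
        return False
    normalized = p.as_posix()
    for pattern in DENIED_PATTERNS:
        # Match against both the full path and the bare file name.
        if fnmatchcase(normalized, pattern) or fnmatchcase(p.name, pattern):
            return False
    return normalized.startswith(ALLOWED_PREFIXES)


for candidate in ("src/app.py", "src/.env", "deploy/prod.yaml", "src/../deploy/prod.yaml"):
    print(candidate, is_write_allowed(candidate))
```

Deny rules are checked before the allow list on purpose, so a secret file placed under an allowed directory is still refused.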
If a Claude Code agent changes production code, the useful artifact is not the chat transcript. It is a flight recorder: intent, boundaries, commands, diffs, tests, approvals, and rollback notes.
Claude Code Is Not the Product: The Production Loop Is
A Claude Code demo is easy to love.
You describe a feature, the agent edits files, runs commands, fixes its own mistakes, and suddenly the repository has moved. The first time you see it work, it feels like software engineering has skipped a generation.
But the demo is not the hard part.
The hard part is making that same capability safe enough, repeatable enough, and observable enough that you would trust it inside a real engineering workflow. That is where the actual product begins.
...
Vibe Coding: I Am No Longer a Software Engineer. I Am a Product Engineer.
It was a Tuesday, the kind where the London drizzle seems to seep into your bones, even if you are miles away in Basingstoke (my home town). I had been handed what looked like a daunting project, though, as you will find out, it turned out not to be: designing and deploying a real-time sentiment analysis product for a major telecommunications company. And, oh, they needed it yesterday.
Or so it felt.
Now, usually, this would mean weeks of planning, coding, debugging, and endless cups of tea. But this time, something was different. I decided to try something I’d been experimenting with – something I now call “Vibe Coding.”
...