The Hard Part of AI Agents Is Not the Code
Code is becoming cheaper. Taste, boundaries, memory, permissions, and judgement are what make AI agents safe enough to trust.
The strange thing about building with AI agents is that the code is not the hard part anymore.
Not always, anyway.
The hard part is taste.
Boundaries. Memory. Knowing what should happen automatically and what absolutely should not.
I have been wiring more of my work through agents recently: lead capture, scheduled jobs, internal dashboards, LinkedIn drafting, ClickUp bits, browser workflows, and little operational loops that run quietly in the background.
Some of it feels genuinely powerful.
Some of it feels like giving a caffeinated apprentice the keys to the server room.
The model is rarely the whole problem
It is tempting to think the difference between a useful agent and a dangerous one is simply the model. Better reasoning, bigger context, stronger tool use, fewer hallucinations.
Those things help. Of course they do.
But the surrounding system matters more than people want to admit.
Does the agent know the business context? Does it preserve human decisions? Does it keep public actions approval-gated? Does it verify before reporting success? Does it understand that “this lead is for Algarve Ltd” is not a minor detail?
Does it know when a task is actually done, rather than when it has produced an impressive paragraph about being done?
That last one is annoyingly important.
The product category is changing
The future I am interested in is not really “chatbots that answer questions.”
It is small, persistent systems that understand enough of your working life to reduce the drag.
Not replace judgement.
Reduce the admin around judgement.
That feels like a very different product category. Less magic assistant. More operational nervous system.
A good nervous system does not ask you to consciously manage every signal. It notices things, routes things, filters things, and only screams when screaming is useful.
A bad one makes the whole body twitch.
The boring parts matter
Memory matters. Permissions matter. Approval states matter. Logs matter. Retry behaviour matters. The ability to say “stop, not that” matters.
These are not demo features. They are what separate a useful system from a very confident mess.
AI agents will get easier to build. That is obvious now.
Making them safe enough to trust is the real work.