// blog / 2026-02-17

AI Coding Agents Are Fast. They Are Also Confidently Wrong.

by Tommy

#ai_coding
#claude_code
#solo_dev
#debugging

AI coding agents are the reason one person can run a portfolio this size. I do not hand-write most of my code, I never have, and I am not precious about it. But anybody telling you these tools just build the thing for you is selling something. They are fast, and they are confidently, fluently wrong on a regular basis. The whole job now lives in the gap between those two facts.

Let me make that concrete with actual scars from actual projects.

The wrong environment. An agent wrote a database migration and I ran it against the wrong project. Two projects, two databases, and the SQL went into the one it had no business touching. It happened to do no real damage, because the migration only created tables if they did not already exist, so it just left a few orphan tables sitting in a project that never asked for them. The lesson landed hard anyway. The agent has no idea which environment you are looking at. You do. Check the project name at the top of the screen every single time before you run anything, because the tool will not catch this one for you.
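
One way to make that check harder to skip is to put it in front of the migration itself. This is a minimal sketch, not a description of any particular database tool: `assert_target`, `run_migration`, and the `execute` callable are hypothetical names, and it assumes you can get the connected project's name from somewhere you trust.

```python
class WrongTargetError(RuntimeError):
    """Raised when the connected project is not the one the migration is for."""


def assert_target(expected_project: str, connected_project: str) -> None:
    """Refuse to continue unless the connected project is the expected one."""
    if expected_project.strip() != connected_project.strip():
        raise WrongTargetError(
            f"migration is for {expected_project!r} but you are connected to "
            f"{connected_project!r}; aborting"
        )


def run_migration(sql: str, expected_project: str, connected_project: str, execute) -> None:
    """Run `sql` through `execute` only after the target check passes.

    `execute` is a stand-in (assumption) for whatever actually sends SQL
    to the database in your setup.
    """
    assert_target(expected_project, connected_project)
    execute(sql)
```

The point of the design is that the expected project is written down next to the migration, so a mismatch is an exception instead of a thing you were supposed to notice.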

The build-time trap. A static frontend freezes its environment variables in at build time and ignores whatever you change later at runtime. I changed some values, redeployed, and the live site kept using the old ones, because they were baked into a build from before the change. Everything I could see was correct. The live build was just frozen in the past. Calling that a bug would be generous. That is how the thing works, and an agent will cheerfully help you change those values without ever mentioning the change will not take effect the way you expect.
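
The mechanism is easy to see in miniature. The sketch below is a toy stand-in for a static build, not any real bundler: `build`, `PAGE_TEMPLATE`, and the `API_URL` variable are made up for illustration. The value is copied into the output once, so changing it afterward does nothing until you build again.

```python
from string import Template

# Hypothetical page that embeds one config value.
PAGE_TEMPLATE = Template('<script>const API_URL = "$API_URL";</script>')


def build(env: dict) -> str:
    """Simulate a static build: read the value once and bake it into the output."""
    return PAGE_TEMPLATE.substitute(API_URL=env["API_URL"])


env = {"API_URL": "https://old.example.com"}
artifact = build(env)  # the value is copied into the artifact here

env["API_URL"] = "https://new.example.com"  # changed after the build
stale = artifact    # what the live site keeps serving
fresh = build(env)  # only a rebuild picks up the change
```

Here `stale` still contains the old URL and `fresh` contains the new one, which is the whole trap: the setting you are looking at and the artifact being served are two different things.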

The redirect loop. This is the one that actually taught me something structural. An agent added HTTPS and canonical-host redirect logic to a site’s middleware. It wrote that logic as if the app received public traffic directly. It does not. It sits behind a proxy that rewrites the host and the protocol before the app ever sees the request. So the redirect logic compared against the wrong values, decided every single request needed redirecting, and put the site into an infinite loop. The site went down.
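
To show the shape of the bug, here is a hedged sketch of both versions of the decision. It assumes the proxy passes the original scheme and host along in `X-Forwarded-Proto` and `X-Forwarded-Host`, which is a common convention but not something every proxy does. The canonical host, the internal host, and both function names are hypothetical.

```python
from typing import Mapping, Optional

CANONICAL_HOST = "www.example.com"  # hypothetical canonical host


def naive_redirect_target(scheme: str, host: str, path: str) -> Optional[str]:
    """Decide from what the app sees directly. Behind a proxy this is the
    proxy-to-app hop (for example http and an internal host), so the check
    never passes and every request gets redirected."""
    if scheme == "https" and host == CANONICAL_HOST:
        return None
    return f"https://{CANONICAL_HOST}{path}"


def redirect_target(headers: Mapping[str, str], path: str) -> Optional[str]:
    """Decide from what the visitor actually requested.

    Assumes lowercased header names and a trusted proxy that sets
    X-Forwarded-Proto and X-Forwarded-Host.
    """
    proto = headers.get("x-forwarded-proto", "http").split(",")[0].strip().lower()
    host = headers.get("x-forwarded-host") or headers.get("host", "")
    host = host.split(",")[0].strip().lower()
    if proto == "https" and host == CANONICAL_HOST:
        return None  # already canonical, so do not redirect
    return f"https://{CANONICAL_HOST}{path}"
```

For a request that the visitor made to `https://www.example.com/pricing`, the naive version sees `http` and the internal host and redirects again, forever. The second version returns `None` and lets the request through. Forwarded headers are only worth trusting when the app cannot be reached except through the proxy.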

Here is the part that stuck with me. The agent reported that work as verified. Its definition of verified was that the code compiled and the build passed. Neither of those things ever touches a live URL. The site was on fire and the report said success, and both of those were true at the same moment, because the verification step was checking something that had nothing to do with whether the site worked.

That turned into a rule I do not break anymore. A check that never hits the live thing only tells you the code compiled, and I stopped treating a green build as evidence of anything except a green build. Anything that can change an HTTP response (middleware, redirects, headers) gets a mandatory hit against the real domain after it deploys. On a good day that build is the starting line.
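
A post-deploy check along those lines can be small. This is a sketch using only the standard library, under the assumption that a healthy page ends in a 200 after a handful of redirects at most. `check_live` and `fetch_once` are hypothetical names, and the fetch step is injectable so the logic can be exercised without a network.

```python
import http.client
from typing import Callable, Optional, Tuple
from urllib.parse import urljoin, urlsplit

Fetch = Callable[[str], Tuple[int, Optional[str]]]


def fetch_once(url: str) -> Tuple[int, Optional[str]]:
    """Make one GET request without following redirects; return (status, Location)."""
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=10)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=10)
    try:
        target = parts.path or "/"
        if parts.query:
            target += "?" + parts.query
        conn.request("GET", target)
        resp = conn.getresponse()
        return resp.status, resp.getheader("Location")
    finally:
        conn.close()


def check_live(url: str, fetch: Fetch = fetch_once, max_hops: int = 5) -> str:
    """Follow redirects by hand; fail loudly on a loop or a non-200 final status."""
    seen = set()
    for _ in range(max_hops + 1):
        if url in seen:
            raise RuntimeError(f"redirect loop at {url}")
        seen.add(url)
        status, location = fetch(url)
        if status in (301, 302, 303, 307, 308) and location:
            url = urljoin(url, location)  # Location may be relative
            continue
        if status != 200:
            raise RuntimeError(f"{url} returned {status}")
        return url  # final URL that answered 200
    raise RuntimeError(f"more than {max_hops} redirects")
```

Run after a deploy, `check_live("https://www.example.com/")` either returns the final URL or raises. A site that redirects to itself fails on the second hop, which is exactly the case a compile step and a build step cannot see.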

The rest of the failure modes are quicker to list, because they rhyme. Agents assume patterns that are not in your project: a database table that does not exist in your schema, the file layout of a framework you are not even using. They redesign buttons you specifically told them to leave alone. They cite docs from two versions ago. They add features nobody asked for and call it initiative.

The fix is boring, and it is the actual engineering work now. Better prompts and tighter constraints. Role, context, the exact files in play, a hard list of what not to touch, the current stack, real acceptance criteria, and verification steps that check the live result. Hand an agent a vague prompt and you get chaos at high speed. Hand it a precise one and you get real leverage. It is exactly as good as the constraints you give it and not one notch better.
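
One way to keep those pieces from quietly going missing is to treat the prompt as a structure with required fields. This is a sketch only: `AgentBrief` and its field names are hypothetical, lifted from the list above, and do not match the format of any particular agent or tool.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class AgentBrief:
    role: str
    context: str
    stack: str
    files_in_play: List[str]
    do_not_touch: List[str]
    acceptance_criteria: List[str]
    verification_steps: List[str]

    def missing(self) -> List[str]:
        """Names of sections left empty, so a vague brief is caught before it is sent."""
        return [name for name, value in vars(self).items() if not value]

    def render(self) -> str:
        """Render the brief as plain text, one labeled section per field."""
        gaps = self.missing()
        if gaps:
            raise ValueError(f"brief is missing: {', '.join(gaps)}")
        lines = []
        for name, value in vars(self).items():
            lines.append(name.replace("_", " ").upper() + ":")
            items = value if isinstance(value, list) else [value]
            lines.extend(f"  - {item}" for item in items)
        return "\n".join(lines)
```

Calling `render()` on a brief with an empty `do_not_touch` or no `verification_steps` raises instead of producing a prompt, which turns "I forgot to say what not to touch" into an error you see before the agent starts.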

So I hold both ideas at once, because both are true every single day. These tools are a genuine superpower, the only reason a one-person portfolio works at all. They will also look you in the eye and tell you the site is fine while it is down. Lean on the first idea alone and you will get burned. Lean on the second alone and you will move too slow to matter. The skill is keeping a hand on both.