~/ tommybuilt .dev

AI Coding Agents Are Fast. They Are Also Confidently Wrong.

by Tommy
  • #ai_coding
  • #claude_code
  • #solo_dev
  • #debugging

AI coding agents are the reason one person can run a portfolio this size. I do not hand-write most of my code, I never have, and I am not romantic about it. But anyone telling you these tools simply build the thing for you is selling you something. They are fast, and they are confidently, fluently wrong on a regular basis. The entire gap between those two facts is where the real skill now lives.

Let me make that concrete with actual scars from actual projects.

The wrong environment. An agent generated a database migration and I ran it against the wrong project. Two projects, two databases, and the SQL went into the one it did not belong in. It happened to do no real damage, because the migration used safe “create only if it does not already exist” logic, so it just left a few orphan tables sitting in a project that did not want them. But the lesson was sharp: the agent has no idea which environment you are looking at. You do. Check the project name at the top of the screen every single time, before you run anything. The tool will not catch this for you.
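One way to make that habit mechanical is a guard that refuses to run unless the project you are pointed at is the one the migration was written for. This is a minimal sketch, not tied to any particular database tool: `confirm_target`, `run_migration`, and the `execute` callback are all hypothetical names, and how you look up the active project depends on your setup.

```python
def confirm_target(expected_project: str, active_project: str) -> None:
    """Raise unless the active project is the one this migration targets."""
    if active_project.strip().lower() != expected_project.strip().lower():
        raise RuntimeError(
            f"Refusing to run: migration targets {expected_project!r}, "
            f"but the active project is {active_project!r}."
        )


def run_migration(sql: str, expected_project: str, active_project: str, execute) -> None:
    """Run `sql` through `execute` only after the target check passes.

    `execute` is a placeholder for whatever actually sends SQL to the database.
    """
    confirm_target(expected_project, active_project)
    execute(sql)
```

The point of passing `expected_project` explicitly is that a human has to write it down next to the migration, so the mismatch surfaces as an error instead of as orphan tables.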

The build-time trap. A static frontend bakes its environment variables in when it builds, not when it runs. I updated some values, redeployed, and the live site kept using the old ones, because the old ones were baked into a build from before the change. Everything I could see was correct. The live build was just frozen in the past. That is not a bug in any normal sense. It is a property of how the thing works, and an agent will happily help you change the values without ever mentioning that the change will not take effect until the site is rebuilt.
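The mechanism is easy to see in miniature. The sketch below is a toy model, not any real bundler: a hypothetical `build` function copies an environment value into its output once, and later changes to the environment never reach that output. `API_URL` and both URLs are invented for illustration.

```python
def build(env: dict[str, str]) -> str:
    """Toy 'static build': inline the env value into the output at build time."""
    return f'const API_URL = "{env["API_URL"]}";'


env = {"API_URL": "https://old.example.com"}
bundle = build(env)                          # the value is baked in here

env["API_URL"] = "https://new.example.com"   # change the setting afterwards
stale = bundle                               # the already-built bundle is unchanged
fresh = build(env)                           # only a new build sees the new value
```

Here `stale` still contains the old URL and only `fresh` contains the new one, which is the whole trap: the settings screen and the live build can disagree and both be "correct".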

The redirect loop, which is the one that actually taught me something structural. An agent added HTTPS and canonical-host redirect logic to a site’s middleware. It wrote that logic as if the application received public traffic directly. It does not. It sits behind a proxy that rewrites the host and the protocol before the app ever sees the request. So the redirect logic compared against the wrong values, decided every request needed redirecting, and put the site into an infinite loop. The site was down.
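A minimal sketch of that failure, assuming the proxy terminates TLS and passes the visitor's original scheme and host along in `X-Forwarded-Proto` and `X-Forwarded-Host`. That is a common convention, but check what your proxy actually sends. The function names, `www.example.com`, and `app.internal:8080` are hypothetical; this is not the site's real middleware.

```python
CANONICAL_HOST = "www.example.com"  # hypothetical canonical host


def naive_redirect(scheme: str, host: str, path: str, headers: dict[str, str]):
    """Compare against what the app sees directly and ignore `headers`.

    Behind a proxy, `scheme` and `host` are the internal values, so this
    asks for a redirect on every single request.
    """
    if scheme != "https" or host != CANONICAL_HOST:
        return f"https://{CANONICAL_HOST}{path}"
    return None


def proxy_aware_redirect(scheme: str, host: str, path: str, headers: dict[str, str]):
    """Prefer the forwarded values (header names assumed lowercased)."""
    real_scheme = headers.get("x-forwarded-proto", scheme).split(",")[0].strip()
    real_host = headers.get("x-forwarded-host", host).split(",")[0].strip()
    if real_scheme != "https" or real_host != CANONICAL_HOST:
        return f"https://{CANONICAL_HOST}{path}"
    return None


# What the app sees when a visitor requests https://www.example.com/
seen = ("http", "app.internal:8080", "/", {
    "x-forwarded-proto": "https",
    "x-forwarded-host": "www.example.com",
})
```

With these inputs, `naive_redirect(*seen)` sends the visitor to the URL they are already on, and will do so again on the next request. `proxy_aware_redirect(*seen)` returns `None` and lets the request through. Forwarded headers should only be trusted when they come from your own proxy.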

Here is the part that matters. The agent reported that work as verified. Its idea of verification was that the code compiled and the build passed. Neither of those things touches a live URL. The site was broken, and the report said success, and those two facts coexisted comfortably because the verification step was checking the wrong thing entirely.

That is the real lesson, and it is now a rule. A verification step that does not hit the live thing is not verification. It confirms the code built. It does not confirm the site works. Those are different claims. Anything that can change an HTTP response (middleware, redirects, headers) now gets a mandatory live check against the real domain after deploy, not after build. The build passing is not the finish line. It is barely the starting line.
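Here is roughly what such a live check can look like, as a sketch using only the Python standard library. It follows redirects by hand so that a loop or an overlong chain becomes an explicit failure instead of a hung request. `fetch_once` and `check_live` are names I made up, and the hop limit of 5 is an arbitrary assumption.

```python
import http.client
from typing import Optional
from urllib.parse import urljoin, urlsplit

REDIRECT_CODES = (301, 302, 303, 307, 308)


def fetch_once(url: str) -> tuple[int, Optional[str]]:
    """Make one GET without following redirects; return (status, Location)."""
    parts = urlsplit(url)
    conn_cls = (http.client.HTTPSConnection if parts.scheme == "https"
                else http.client.HTTPConnection)
    conn = conn_cls(parts.netloc, timeout=10)
    try:
        target = (parts.path or "/") + (f"?{parts.query}" if parts.query else "")
        conn.request("GET", target)
        resp = conn.getresponse()
        return resp.status, resp.getheader("Location")
    finally:
        conn.close()


def check_live(url: str, fetch=fetch_once, max_hops: int = 5) -> str:
    """Return the final URL if it answers 200; raise on loops, long chains, or errors."""
    visited = set()
    for _ in range(max_hops + 1):
        if url in visited:
            raise RuntimeError(f"redirect loop at {url}")
        visited.add(url)
        status, location = fetch(url)
        if status in REDIRECT_CODES and location:
            url = urljoin(url, location)  # Location may be relative
            continue
        if status != 200:
            raise RuntimeError(f"{url} returned {status}")
        return url
    raise RuntimeError(f"more than {max_hops} redirects")
```

Run it after deploy against the real domain, for example `check_live("https://www.example.com/")` with your own domain in place of the placeholder. The `fetch` parameter exists so the loop logic can be tested without touching the network.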

The rest of the failure modes are quicker to list, because they rhyme. Agents assume patterns that are not in your project: a common database table that does not actually exist in your schema, the file layout of a framework you are not using. They redesign interface elements you explicitly told them to leave alone. They cite outdated documentation. They add features nobody asked for and call it initiative.

The fix for all of it is not waiting for better agents. The fix is better prompts and better constraints. The prompt is the actual engineering work now. Role, context, the exact files in play, an explicit list of what not to touch, the current stack, real acceptance criteria, and verification steps that check the live result. A vague prompt produces chaos at high speed. A precise one produces genuine leverage. The agent is exactly as good as the constraints you hand it, and not one bit better.
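That list of ingredients can be turned into a small template that refuses to render while any of them is missing. This is one possible sketch, not a standard format: the `AgentPrompt` class, its field names, and the section titles are my own invention for illustration.

```python
from dataclasses import dataclass


@dataclass
class AgentPrompt:
    role: str
    context: str
    stack: str
    files_in_play: list[str]
    do_not_touch: list[str]
    acceptance_criteria: list[str]
    verification_steps: list[str]

    def render(self) -> str:
        """Build the prompt text, failing loudly if any section is empty."""
        sections = {
            "ROLE": [self.role],
            "CONTEXT": [self.context],
            "CURRENT STACK": [self.stack],
            "FILES IN PLAY": self.files_in_play,
            "DO NOT TOUCH": self.do_not_touch,
            "ACCEPTANCE CRITERIA": self.acceptance_criteria,
            "VERIFICATION": self.verification_steps,
        }
        missing = [name for name, items in sections.items()
                   if not any(item.strip() for item in items)]
        if missing:
            raise ValueError(f"prompt is missing: {', '.join(missing)}")
        return "\n\n".join(
            f"{name}:\n" + "\n".join(f"- {item}" for item in items)
            for name, items in sections.items()
        )
```

Treating an empty "do not touch" or "verification" section as an error is a deliberate choice here: it forces the vague prompt to fail before the agent ever sees it.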

So I hold both ideas at once, because both are true. These tools are a real superpower, the thing that makes a one-person portfolio possible at all. They are also a confident liar that will tell you the site works while the site is on fire. Treat them as only the first thing and you will get burned. Treat them as only the second and you will move too slow to matter. The skill is never letting go of either.
