I Shipped Code I Don't Understand — And So Have You

I'm going to say something uncomfortable: I've shipped code to production that I don't fully understand. Not because I'm careless. Because I used AI to write it, the tests passed, and the feature worked.
If you've used Copilot, Claude Code, or Cursor in the last year and shipped anything to production, you've probably done the same thing. We just don't talk about it.
The Uncomfortable Reality
Jake Nations from Netflix gave a talk about this — the honest reality of shipping code you didn't write line by line. It resonated because it describes what's actually happening in engineering teams right now, versus the sanitised version where developers claim they "review everything carefully."
Here's what actually happens. Claude generates a function. You read it. It looks right. The tests pass. You have three other tasks waiting. You ship it.
Did you understand every line? Probably not. Did you understand the intent and verify the output? Probably yes. Is that enough? That's the question nobody wants to answer directly.
The Spectrum of Understanding
"Shipping code you don't understand" isn't binary. There's a spectrum.
At one end: you don't understand the algorithm, the data structures, or the API being called. You just saw it compile and pass tests. That's dangerous.
At the other end: you understand the architecture, the intent, and the expected behaviour, but the specific implementation was generated by AI and you verified it through testing rather than reading every line. That's... probably fine? Developers have been doing a version of this with third-party libraries forever. You don't read the source code of React before using it.
The question is where on that spectrum you are for any given piece of code. And whether you'd know the difference if something went wrong.
What I Do About It
I've developed a few habits that I think make AI-generated code manageable rather than reckless.
Architecture is mine. Implementation can be AI's. I design the system structure, define the interfaces, and make the key decisions about data flow and dependencies. Claude handles the implementation within those constraints. If the architecture is right, an implementation bug is fixable. If the architecture is wrong, no amount of correct implementation saves you.
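One way to picture that division of labour is a minimal sketch: the human writes the interface, the AI fills in the body behind it. Everything here is hypothetical — the names (RateLimiter, TokenBucket) and the token-bucket algorithm are illustrative, not taken from any real project.

```python
from dataclasses import dataclass, field
import time
from typing import Protocol


class RateLimiter(Protocol):
    """Human-designed contract: this signature IS the architecture.
    Any implementation the AI produces must fit it."""
    def allow(self, key: str) -> bool: ...


@dataclass
class TokenBucket:
    """Imagine this body was AI-generated, constrained by the interface above."""
    rate: float = 5.0        # tokens refilled per second
    capacity: float = 10.0   # burst ceiling
    _buckets: dict = field(default_factory=dict)

    def allow(self, key: str) -> bool:
        now = time.monotonic()
        tokens, last = self._buckets.get(key, (self.capacity, now))
        # Refill proportionally to elapsed time, capped at capacity.
        tokens = min(self.capacity, tokens + (now - last) * self.rate)
        if tokens >= 1.0:
            self._buckets[key] = (tokens - 1.0, now)
            return True
        self._buckets[key] = (tokens, now)
        return False
```

If the TokenBucket body turns out to have a bug, it can be swapped out without touching anything that depends on RateLimiter — which is exactly why an implementation bug is fixable and an architecture mistake isn't.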
Tests aren't optional. With AI-generated code, tests serve a different purpose than they do with hand-written code. They're not just verifying correctness — they're my safety net for code I might not fully understand. If the tests are comprehensive, a subtle implementation bug will surface. If the tests are thin, I'm flying blind.
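The kind of test that works as a safety net pins down behaviour, not implementation detail. A small illustrative sketch — slugify() is a made-up example standing in for any AI-generated function, not code from a real project:

```python
import re


def slugify(title: str) -> str:
    """Imagine this body came from an AI assistant."""
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")
    return slug or "untitled"


# Human-written tests: they encode intent, including the edge cases
# (degenerate input, idempotence) where subtle AI bugs tend to hide.
def test_slugify_behaviour():
    assert slugify("Hello, World!") == "hello-world"
    assert slugify("  spaces  everywhere  ") == "spaces-everywhere"
    assert slugify("!!!") == "untitled"                 # no usable characters
    assert slugify("already-a-slug") == "already-a-slug"  # idempotent
```

If the generated implementation quietly mishandled one of these cases, the test fails whether or not anyone read the regex line by line — that's the safety-net purpose the tests serve here.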
I ask Claude to explain its own code. When I'm shipping something complex — database migrations, authentication flows, anything touching money — I ask Claude to walk me through the code it generated and explain the tradeoffs it made. This serves two purposes: I learn something, and I catch the moments where its explanation doesn't match what it actually wrote.
I know where I'm exposed. I can tell you exactly which parts of my client projects have code I understand deeply and which parts have AI-generated code I've verified through testing but couldn't reproduce from memory. That awareness matters. When something breaks at 2am, I know which files I can debug confidently and which ones will require more careful investigation.
The Industry Needs Honesty
The current discourse about AI coding is dishonest in both directions. The optimists claim AI writes perfect code that just needs a rubber stamp. The sceptics claim real developers would never ship code they didn't write themselves.
Both are wrong. The reality is messy: AI generates code that's usually good, sometimes subtly wrong, and occasionally brilliant. The developer's job is shifting from "write every line" to "understand the system, verify the output, and know where the risks are."
That's not a lesser job. It's a different job. And pretending otherwise — in either direction — helps nobody.
Where I Land
I'm going to keep shipping AI-generated code. The productivity gain is too significant to ignore, and the quality — with proper testing and review — is consistently good enough for production.
But I'm not going to pretend I understand every line. That's the honest position. The code works. The tests pass. The architecture is sound. And if something breaks, I have the skills and the safety nets to fix it.
What I won't do is pretend this is the same as writing it all myself. It isn't. And the sooner the industry has an honest conversation about what "AI-assisted development" actually looks like in practice, the sooner we can build the tools, processes, and expectations that make it reliable.