Knock Before You Commit

Back when AI coding agents were first getting traction, I shipped a feature I thought was solid. I typed a few sentences into the prompt box. What the feature should do, roughly. I left out the auth flow. I left out what happens to local data on sign-in. I left out the database conventions already in place. The gaps were structural.

The agent did not complain. It did not ask a single question. It filled in the blanks with whatever made sense to its training data. Code compiled. Code ran. But it was wrong. It assumed a different pattern than the one our codebase used, wired into APIs that did not exist, ignored conventions we had spent years building. QA caught it days later. The feature worked on its own terms, but not with our system. I had to restructure the whole thing. Days of rework that could have been avoided if the agent had stopped and said "hey, I need more context here."

This is structural, not a model problem. LLMs are trained to be helpful. Helpful means filling gaps. Helpful means never telling you "I do not know enough to proceed." When you leave holes in your prompt, the model does not flag them. It guesses. And because the output always sounds confident, you do not realize it guessed until you are deep into QA or worse.

I tried the obvious fix. Better prompts, more detail, more context. Results improved a bit, but the pattern never fully went away. Every prompt had blind spots I could not see. Things I assumed the agent would infer correctly, and it did not. Then I tried spec-driven development. You let the agent generate a huge spec from a small prompt, review it, build from it. Pages of user stories, edge cases, data models. Glancing through it, things still slipped past. The gaps from the original prompt had been inherited. They were just harder to spot now.

The real problem was never about writing a better prompt. It was about direction. I was pushing requirements at the agent, hoping I had covered everything. I needed the agent pulling requirements out of me. This clicked after enough painful cycles of write, build, wrong, fix, rebuild, wrong again. The loop never closed because I was checking my own work, and I could not see my own blind spots.

What changed everything was a tool that flips the direction. Instead of you pushing instructions and hoping, the agent stops and interviews you before it writes anything.

Claude Code ships with the "ask user question" tool. When invoked, the agent pauses. It does not guess. It presents structured, multi-option questions about everything it needs to know, some tagged as recommended. You pick one, it asks the next, and it keeps going until nothing is left unclear. Only then does it build.

Cursor has something similar. Other agents focus purely on output rather than pausing to gather missing context. This is not a minor gap. It is the difference between shipping right once and shipping wrong twice.

When you write a prompt, you guess what the agent needs. When the agent interviews you instead, it discovers what you missed. Edge cases that never reached the prompt box. Assumptions about existing code you take for granted. Tradeoffs you have not even considered.

The first time I saw someone use this deliberately, they pointed the agent at an existing spec and said "interview me about everything I might have missed, implementation details, edge cases, tradeoffs, skip the obvious, keep going until complete, then update the spec." The tool went through seven rounds of questions on that single spec. Each surfaced something the document had omitted. Auth flow mismatches. Data migration paths. What happens when a paid user downgrades with more data than the free tier allows. Production incidents waiting to happen, if nobody had asked.

It kept going until every gap was surfaced, then rewrote the spec. The result reflected what needed to be built, not what a model guessed from a vague starting prompt.

You can use this with or without a spec. Starting fresh? Append "use the ask user question tool to find important requirements" to your first prompt, and the agent opens a dialogue. Is this a web app? Which framework? What data do you actually track? What happens when someone downgrades? The questions come fast and structured. You are making decisions instead of guessing at your own gaps.

The extra time is measured in minutes. The savings, in days. QA finding out a feature does the wrong thing costs far more than five minutes of answering questions upfront. Reverting a bad implementation is not free. Building on a foundation wrong from the first commit compounds quickly.

There is a broader point here. Everyone talks about prompt engineering like it is the new essential skill. Better prompts, better results. But the best prompt engineer in your workflow might actually be the AI itself. Not because it writes prompts for you. Because it knows what it does not know, if you give it the tool to ask.

The agents that actually help you ship good software will not be the ones that guess more accurately. They will be the ones that refuse to guess at all. They will stop, ask, and make you part of the decision. That is not slowing you down. That is keeping you from shipping twice.

Next time you open a coding agent, ask yourself whether you are about to let it fill in blanks you did not know you left. If there is any chance you missed something, make it ask you. Use plan mode if your agent has it. Force the question tool if it is available. Do not let it guess. Your future self reviewing the pull request will thank you.

Knock Before You Commit

Continue Reading.

Overdressed and Underperforming

Your Fundamentals Are Showing

Triggered by a Fork