Complete Guide to Spec-Driven Development (SDD) with AI in 2026

This is not just another academic definition of SDD. It’s the guide I wish I had when I started working with AI in product development and realized something fairly uncomfortable: prompting without specs doesn’t scale. Here I explain how to apply Spec-Driven Development with AI in a practical way—without bureaucracy—and with clear criteria you can actually carry into code.
What Is Spec-Driven Development (SDD) and Why It’s Back Now with AI
Spec-Driven Development (SDD) is a simple idea that for years was hard to make practical: define precisely what you want to build before you start building it. The specification (the spec) is not a formality—it’s the center of everything.
For a long time, this sounded good in theory but almost always failed in practice. Last-minute scope changes, the cost of keeping specs updated, and day-to-day urgency meant code ended up being the only truth. With AI, for the first time, this can change in a realistic way.
When you work with language models, the quality of the output depends directly on the clarity of the context you give them. Not on the model, not on some magical prompt, but on the prior context that defines what is correct and what isn’t. And a good spec is basically the best context you can provide. In my case, the “click” moment was realizing that a clear spec reduced more errors than any clever prompt.
Who hasn’t tried at least once—just for fun—sending prompts to an agent like: “Update this project to DDD” or “Migrate to version X”? The result is usually the same: absurd token usage and outputs that are… questionable at best, because the agent has no clear framework to decide against.
SDD isn’t back because of nostalgia or hype. It’s back because AI needs stable instructions, not improvised conversations.
If you already know what SDD is, you can jump straight to how to write effective specs, tools for SDD with AI, or common mistakes when applying Spec-Driven Development. The rest of this article builds that foundation.
The Real Spec-Driven Development Flow with AI (From Idea to Code)
In practice, SDD is not a rigid or bureaucratic process. It doesn’t try to replace other methodologies like DDD, TDD, or BDD. I think of it more as a mental workflow layered on top of how we already work—one that makes even more sense when AI agents enter the equation.
From a Vague Idea to a Well-Formulated Problem
Everything starts with a vague idea. That’s normal. The common mistake is to go from there straight to code or straight to the prompt.
Before writing a single line of spec, I force myself to answer something very simple: what specific problem are we solving, and for whom? If I can’t explain it in a few sentences, I still don’t understand it. And if I don’t understand it, there’s no way an agent will execute it well.
This step, even though it looks trivial, is often where the most value gets created. Forcing yourself to formulate the problem removes ambiguity before it turns into technical debt.
The Spec as a Contract (Not a Pretty Document)
This is where SDD clearly separates from “writing documentation.” The spec is not trying to impress anyone. Its only goal is to prevent misunderstandings.
A good spec makes three things crystal clear:
- What the system does
- What it does not do
- How we know it’s done correctly
When I started working like this, a lot of discussions disappeared. Not because everyone agreed, but because the spec forced decisions to happen before code existed.
From Spec to Actionable Tasks
A well-written spec can almost be broken down into tasks by itself. At this point AI helps a lot: it surfaces edge cases, dependencies, and logical gaps we often miss.
Here’s an important warning: if the spec is ambiguous, AI won’t fix it. It will amplify it.
This process doesn’t have to be fully manual. In my case, I often work the spec inside a ChatGPT conversation (canvas mode), iterating section by section until the text is clear enough that another human—or an agent—can execute it without making assumptions. Interestingly, because of how models work, this process often reveals extreme scenarios or constraints we hadn’t considered at the start.
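If you prefer to script that review step, here’s a minimal sketch, assuming the official `openai` Python client (v1+). The model name and prompt wording are placeholders, not recommendations:

```python
# Minimal sketch: asking a model to surface gaps in a spec before breaking it
# into tasks. Assumes the official `openai` Python client (v1+); the model name
# and prompt wording are placeholders, not recommendations.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

REVIEW_PROMPT = """You are reviewing a product spec before implementation.
List, as bullet points:
1. Edge cases the spec does not cover.
2. Implicit dependencies or assumptions.
3. Any acceptance criterion that is not binary (pass/fail).
Do not propose solutions; only surface gaps."""

def review_spec(spec_text: str) -> str:
    """Return the model's list of gaps for a given spec."""
    response = client.chat.completions.create(
        model="gpt-4o",  # placeholder; use whatever your team has access to
        messages=[
            {"role": "system", "content": REVIEW_PROMPT},
            {"role": "user", "content": spec_text},
        ],
    )
    return response.choices[0].message.content

# Iterate until the gap list is empty, then break the spec into tasks.
```

The point isn’t the script itself; it’s making the “surface gaps, then decompose” loop explicit instead of burying it in an ad-hoc chat.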
AI-Assisted Code (With Clear Limits)
With a solid spec, AI stops “inventing” and starts executing. It generates code, tests, and structures with much more coherence and less noise.
One thing I learned quickly: when I don’t like the output, the problem is almost never the model. The problem is usually the spec. Reviewing and refining it almost always works better than switching tools or prompts.
Validate Against the Spec, Not Opinions
The last step isn’t asking “does it work?”, but “does it meet what we promised?”. The spec becomes the truth criterion.
If something fails, you don’t patch the code first. You review the spec first. More often than it seems, the issue was there.
What Makes a Spec Truly Good (Practical Criteria)
Not every spec works well with AI. Some look complete, well structured, even “professional,” but the moment you try to use them with an agent, they start leaking ambiguity.
Over time I’ve reached a pretty clear conclusion: a good spec isn’t the one that explains more—it’s the one that minimizes what needs to be interpreted.
Context That Actually Constrains (No Fluff)
Context isn’t there to tell the project’s story. It’s there to constrain decisions: who will use this, in what situation, and with what real limitations.
When a spec says “end user” or “general use,” it’s basically saying nothing. When you define a concrete context (say, “back-office agents processing refunds from a desktop app, up to 200 a day”), AI and any teammate stop imagining scenarios that don’t exist.
A Spec Must Be Able to Fail
If what you’re defining can’t be falsified, you haven’t thought enough.
If, after development, you can’t clearly say whether the goal was met or not, then it wasn’t a good goal. This “philosophical-sounding” point is especially important with AI agents: they tend to complete anything not clearly defined, and that’s how they hallucinate extra flows or unexpected behaviors.
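A trick that helps me here: try to turn the objective into an assertion. “The endpoint should be fast” can’t fail; “p95 latency under 300 ms” can. A minimal sketch with illustrative numbers (the threshold, sample size, and the commented-out client call are all made up):

```python
# Sketch: a falsifiable objective encoded as an assertion. "The endpoint should
# be fast" can't fail; "p95 latency under 300 ms across 1,000 calls" can.
# The threshold, sample size, and the commented-out call are illustrative.
import statistics
import time

def p95_latency(call, samples: int = 1000) -> float:
    """Measure the 95th-percentile latency of `call`, in seconds."""
    timings = []
    for _ in range(samples):
        start = time.perf_counter()
        call()
        timings.append(time.perf_counter() - start)
    return statistics.quantiles(timings, n=100)[94]  # 95th percentile

# assert p95_latency(lambda: http_client.get("/users")) < 0.300
```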
The Real Value of Exclusions
This is where many specs fall short. Saying what’s out of scope can feel unnecessary… until you skip it.
In my experience, defining exclusions reduces more errors than adding features. It narrows the context, prevents nasty surprises from scope changes you didn’t expect, and keeps an ambiguous spec from modifying business rules that should not be touched. Remember: the Domain doesn’t adapt (and when someone tries to force it to, “interesting” things tend to happen).
Avoid Redundancy
Fortunately, shared standards are becoming part of daily work when collaborating with agents. If you find yourself repeating the same blocks in every spec, like “generate unit tests in this or that way,” the sensible move is to pull those instructions into files like Agents.md or guidelines.md.
That way, the agent knows how it should behave from the beginning, but you don’t have to keep repeating the same blocks. The spec stays centered on the problem and behavior—which is where it actually adds value.
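For illustration, such a file might look like the sketch below. The sections, rules, and the `domain/` path are examples of the kind of standing instructions worth extracting, not a canonical Agents.md format:

```markdown
# Agents.md (illustrative; extract whatever you keep repeating across specs)

## Testing
- Generate unit tests with pytest, one test per acceptance criterion.
- Name each test after the criterion it verifies.

## Code style
- Follow the project linter configuration; do not reformat unrelated files.

## Boundaries
- Never modify files under `domain/` unless the spec explicitly says so.
```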
How Roles Fit in a Team Working with Spec-Driven Development
One of the most common traps when talking about SDD is to describe it as “more documentation.” In a real team, the change isn’t writing more—it’s moving the point where important decisions are made.
When the spec is the center, many classic frictions disappear—or at least show up earlier, when they’re still cheap to solve.
PM / Product: The Spec as an Alignment Tool (Not a Deliverable)
Product doesn’t write a spec to “dump work” on engineering. The spec is their main tool to align the team around three very concrete things:
- what problem is being solved,
- what’s in scope and what is explicitly out,
- and what exactly it means for something to be done well.
In teams working with AI, this becomes obvious fast. A loosely scoped spec quickly turns into endless prompts, implicit decisions, and mid-stream direction changes. A clear spec reduces noise and makes decisions visible from day one.
Dev: Less Interpretation, More Deliberate Design
The developer’s job stops being “guess what product meant.” Their role becomes to stress-test the spec with uncomfortable but necessary questions:
- what edge case are you assuming without saying it?
- what implicit dependency exists here?
- what technical constraint makes this unviable, fragile, or expensive?
The key is that these conversations happen before there’s code. With AI, this is critical: the later an ambiguity appears, the more expensive it is to fix.
QA: Turn Ambiguity into Verifications
QA stops arriving at the end to “break things.” In SDD, QA joins earlier to transform imprecise language into testable criteria:
- critical scenarios,
- negative cases (what must NOT happen),
- and tests that validate behavior, not implementation.
A practical rule I often use: if a spec can’t be tested, it’s probably not ready to implement.
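To make that rule concrete, here’s a sketch of a negative case (“what must NOT happen”) written as a behavior test. The in-memory service is a stand-in so the test actually runs; what matters is that the assertions check observable behavior, not implementation details:

```python
# Sketch: turning "what must NOT happen" into a behavior test. The in-memory
# service is a stand-in so the test runs; only the assertions matter, and they
# check behavior (error raised, nothing created), not implementation.
import pytest

class DuplicateEmailError(Exception):
    pass

class InMemoryRegistration:
    def __init__(self):
        self.users = {}

    def register(self, email: str, password: str) -> None:
        if email in self.users:
            raise DuplicateEmailError(email)
        self.users[email] = {"status": "pending_verification"}

def test_duplicate_email_does_not_create_a_second_account():
    service = InMemoryRegistration()
    service.register("ana@example.com", "S3cure-enough!")
    before = len(service.users)

    with pytest.raises(DuplicateEmailError):
        service.register("ana@example.com", "Another-p4ss!")

    assert len(service.users) == before  # nothing new was persisted
```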
AI: Executes Fast, Interprets Poorly
I treat AI like a teammate with two very clear characteristics:
- it’s extremely fast at executing, and
- it’s dangerously good at filling in gaps.
The spec exists precisely to reduce that interpretation space. Not to limit AI, but to prevent it from making decisions that were never meant to be its job.
Practical Checklist Before Calling a Spec “Ready”
This is probably the least glamorous part of the process… and also one of the most useful. It’s not a box-ticking exercise; it’s a quick final pass to catch ambiguities before they become extra work.
This is the checklist I use when I want to avoid surprises (human or AI):
- Is the problem understandable without additional context? If someone who wasn’t in the original conversation can’t understand it, an agent won’t either.
- Can the objective fail? Could you clearly say “this doesn’t meet the spec”? If not, there’s ambiguity.
- Is what’s out of scope explicit? If there are no clear exclusions, someone (human or AI) will invent them.
- Are acceptance criteria verifiable and binary? No “should,” “ideally,” or “more or less.” It either passes or it doesn’t (see the sketch after this checklist).
- Are there enough examples for the important flows? At least one per critical flow. Examples reduce interpretation more than any definition.
- Does the spec describe behavior, not implementation? If you’re deciding architecture here, you’re probably mixing layers.
- Could an agent execute this spec without asking questions? This is the final test. If the answer is no, the spec isn’t mature yet.
Practical rule: if any answer is “no” or “not sure,” that’s fine. Going back to the spec now is infinitely cheaper than doing it once you already have code, tests, or prompts in production.
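For the binary-criteria item, here’s a toy sketch of the kind of cheap automated pass I sometimes run before review. A regex obviously can’t judge a spec, but it catches the easy offenders early (the word list is mine; extend it as needed):

```python
# Toy sketch: flag hedging words in acceptance criteria before review. A regex
# can't judge a spec, but it catches the easy non-binary phrasing early.
import re

HEDGES = re.compile(r"\b(should|ideally|more or less|probably|roughly)\b", re.I)

def flag_hedges(criteria: list[str]) -> list[str]:
    """Return the criteria that contain non-binary language."""
    return [c for c in criteria if HEDGES.search(c)]

print(flag_hedges([
    "Given an existing email, then a conflict error is returned",
    "The endpoint should ideally respond quickly",  # gets flagged
]))
```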
Simple Spec Template (The One I Actually Use)
The template matters less than the discipline of using one consistently. This is the one I use because it forces me to think about exclusions and acceptance criteria from the start:
Title
Problem to solve
Usage context
Objective
Scope
- Includes
- Excludes
Expected behavior
- Main flow
- Alternative flows / edge cases
Acceptance criteria
- Given / When / Then
Constraints
- Technical / business / security
Notes
- Open decisions / questions

If you force yourself to fill out Excludes and at least 2–3 verifiable acceptance criteria, you’re already ahead of 90% of the specs I see.
Real Example: Spec for User Registration (Backend)
To avoid ambiguity and stay focused on what matters, this example is deliberately focused on the backend process, not the UI. The idea is to show how a clear spec lets you implement the logic without arguing over frontend details.
Title
User registration (backend)
Problem to solve
Allow a system to create user accounts securely and consistently, ensuring data integrity and preventing duplicate registrations.
Usage context
Backend service that receives registration requests from different clients (web, mobile, external API). The backend is the single source of truth for user creation.
Objective
Given a valid email and password, the system creates a new persisted user and leaves the account in a pending verification state.
Scope
- Includes:
  - User registration endpoint
  - Unique email validation
  - Password validation according to a defined policy
  - User persistence
  - Emission of a "user registered" event
- Excludes:
  - Form rendering
  - Session management or login
  - Sending the verification email
  - Password recovery or change
Expected behavior
- The system receives a registration request with email and password
- Verifies that the email does not already exist
- Validates the password according to the established rules
- Stores the password securely (hash)
- Creates the user with "pending_verification" status
- Publishes a domain event indicating that the user has been registered
Acceptance criteria
- Given a non-registered email and a valid password, when the request is processed, then a new user is created with "pending_verification" status
- Given an existing email, when the request is processed, then no user is created and a conflict error is returned
- Given a password that does not meet policy, when the request is processed, then the user is not created and a validation error is returned
Constraints
- The password hash must comply with the system’s security standards
- The endpoint must be idempotent in the face of retries
- The system must comply with GDPR in email processing
Notes
- Sending the verification email is handled in another service
- The exact event format is defined in a separate spec

This example shows how a backend-focused spec reduces unnecessary discussions and allows humans and agents to implement logic without having to infer responsibilities that don’t belong to them.
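To close the loop, here’s a hedged sketch of how directly a spec like this maps to code. The storage, password policy, hashing parameters, and event mechanism are illustrative stand-ins that the spec deliberately leaves open, and retry idempotency (one of the constraints, typically solved with an idempotency key) is omitted:

```python
# Sketch of the spec above. Storage, the password policy, hashing parameters,
# and the event bus are illustrative stand-ins. Retry idempotency (one of the
# spec's constraints) would typically use an idempotency key and is omitted.
import hashlib
import re
import secrets
from dataclasses import dataclass, field

class EmailAlreadyRegistered(Exception):
    """Maps to the spec's "conflict error" criterion."""

class InvalidPassword(Exception):
    """Maps to the spec's "validation error" criterion."""

@dataclass
class UserRegistrationService:
    users: dict = field(default_factory=dict)   # stand-in for persistence
    events: list = field(default_factory=list)  # stand-in for an event bus

    def register(self, email: str, password: str) -> dict:
        # 1. Unique email validation: existing email -> conflict, nothing created
        if email in self.users:
            raise EmailAlreadyRegistered(email)

        # 2. Password policy (illustrative rule; substitute your real policy)
        if len(password) < 10 or not re.search(r"\d", password):
            raise InvalidPassword("password does not meet policy")

        # 3. Store the password securely (stdlib scrypt as a placeholder for
        #    whatever your security standards mandate)
        salt = secrets.token_bytes(16)
        digest = hashlib.scrypt(password.encode(), salt=salt, n=2**14, r=8, p=1)

        # 4. Create the user in "pending_verification" status
        user = {"email": email, "status": "pending_verification",
                "salt": salt, "password_hash": digest}
        self.users[email] = user

        # 5. Publish the domain event; sending the email itself is out of scope
        self.events.append({"type": "user_registered", "email": email})
        return user
```

Notice that each numbered comment maps one-to-one to a line under “Expected behavior.” That traceability is exactly the property a good spec buys you.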
