
Hands-On: Writing a Production-Quality Technical Spec


The difference between an AI agent that ships working code and one that produces an expensive mess is almost always the quality of the spec it was given — this topic teaches you to write the kind of spec that reliably produces the former.

Why Specs Are the Highest-Leverage Artifact in Agentic Development

When you work alongside an AI agent on a non-trivial feature, you are essentially handing off implementation to a very fast, very literal executor. The agent will interpret your instructions precisely — and if those instructions are ambiguous, incomplete, or missing edge cases, the agent will fill the gaps with plausible-sounding guesses. Some of those guesses will be wrong. The further the agent gets down an incorrect path, the more expensive the correction becomes.

A production-quality technical spec is the antidote. It is not the kind of dense requirements document that used to take a week to write and would be ignored by the third sprint. It is a focused, structured contract between you (the engineer who understands the problem) and the agent (the executor who will implement it). It defines what the feature does, what it does not do, how it behaves at its edges, and what "done" looks like in verifiable terms.

The practical payoff is significant. A well-written spec given to an AI agent typically produces a first implementation that requires minor corrections rather than a rewrite. It also front-loads the hard thinking. Every ambiguity you resolve in the spec is an ambiguity you are not debugging in production. Engineers who adopt spec-driven development with agents consistently report that the quality gate shifts left: problems surface at the spec review stage rather than at PR review or, worse, in a production incident.

AI is also a powerful collaborator during the spec-writing process itself — not just during implementation. You can use it to surface edge cases you missed, pressure-test your acceptance criteria, draft boilerplate sections quickly, and run a structured review pass before you hand the spec to an agent. This topic walks through the full workflow using a realistic, non-trivial feature.

Learning tip: Resist the urge to hand a rough set of bullet points to an agent and call it a spec. Spend fifteen minutes with this workflow first. You will save hours of debugging and back-and-forth later.

The Anatomy of a Spec That Works for Agents

A spec that produces reliable agent output needs to answer six questions, each in its own section. This structure is not arbitrary — each section corresponds to a type of decision an agent must make when generating code.

Context answers: What already exists that this feature must fit into? An agent that knows the existing auth middleware, the database schema, and the email service abstraction will generate code that integrates correctly. An agent without this context will invent integration points that do not exist.

Problem statement answers: What user problem is being solved and why? This prevents the agent from optimizing for the literal request at the expense of the actual goal. It is one to three sentences, written from the user's perspective.

Functional requirements answer: What must the feature do? These are numbered, testable statements in the form "the system must..." or "when X, the system does Y." Avoid prose — one requirement per line.

Non-functional requirements answer: What constraints apply to how the feature does it? Security, performance, backwards compatibility, observability — things the agent would not know to care about unless told.

Edge cases and error handling answer: What are the failure modes and boundary conditions? This is the section agents are most grateful for. Token expiry, already-used links, users who do not exist, concurrent requests — enumerate them explicitly.

Acceptance criteria answer: How will you verify this feature is done? Specific, binary, testable statements. Each criterion maps to either a unit test or an end-to-end scenario.

Learning tip: If you cannot write a binary acceptance criterion for a requirement — one you could evaluate as clearly passing or failing — the requirement is not specific enough yet. Use AI to help you sharpen it.

The Hands-On Feature

The feature is adding email-based magic link authentication to an existing Express.js and PostgreSQL application. The app currently has username/password login. The task is to add a parallel flow where a user enters their email, receives a single-use link, clicks it, and is authenticated. Work through the following steps to produce a complete, production-ready spec using AI as a collaborator throughout.

Step 1: Gather context from the existing codebase

Before writing anything, give the AI a picture of the system the feature must integrate with. Paste in your relevant existing code — your User model, your current auth middleware, your session management, your email service interface — and ask it to summarize the integration surface.

I am about to write a technical spec for adding magic link authentication to an Express.js + PostgreSQL application. Here is the relevant existing code:

[paste your User model schema]
[paste your current auth middleware]
[paste your session/JWT creation logic]
[paste your email service interface]

Summarize the integration surface I need to account for in my spec: what fields already exist on the User model, how sessions are currently created, how the email service is called, and what auth middleware the new feature must be compatible with. List any gaps — things I will need to add or change — as questions I should answer in the spec.

Expected output: A structured summary of your integration surface — something like "the User model has id, email, passwordHash, createdAt; the email service has a sendTransactional(to, subject, body) method; the existing session is a JWT stored in an httpOnly cookie — the new flow needs to produce the same JWT format." You will also get a list of questions to answer, such as whether the magic link should work for users who only have password-based accounts, or only for passwordless-only accounts. Answer those questions before moving on.

Step 2: Draft the problem statement and context section

Use the summary from Step 1 to write a tight problem statement. Do not over-engineer this — two to four sentences.

Write the Context and Problem Statement sections of a technical spec for this feature:

Feature: Email-based magic link authentication
App: Express.js REST API + PostgreSQL, existing password-based auth
Existing integration points: [paste the summary from Step 1]

The problem: Users want a frictionless login option that does not require remembering a password. The system should support both password login and magic link login for the same account.

Write Context as 3–5 sentences describing the system and where this feature fits. Write Problem Statement as 2–3 sentences from the user's perspective. Be specific, not generic.

Expected output: A context section that names your actual tech stack, references the existing auth mechanism, and notes what must remain backward-compatible. A problem statement that would make sense to both a product manager and an engineer.

Step 3: Write the functional requirements

This is where precision pays off. Ask the AI to generate a first draft of numbered, testable functional requirements, then you review and harden each one.

Write the Functional Requirements section for a magic link authentication feature. Requirements must be:
- Numbered (FR-1, FR-2, etc.)
- In the form "The system must..." or "When [condition], the system must..."
- Testable — each one should map to a verifiable behavior

Cover the complete happy path: user submits email, system sends link, user clicks link, system authenticates them and starts a session. Also cover: what the link contains, how long it is valid, whether it is single-use, rate limiting on the request endpoint, and what happens after successful authentication (redirect vs. token response).

App context: Express.js + PostgreSQL, JWT session in httpOnly cookie, existing POST /auth/login endpoint for password auth.

Expected output: A numbered list of 8–12 requirements covering the full happy path. Review each one: is it specific enough to generate a failing test? FR-1 might read "The system must accept a POST request to /auth/magic-link/request with a JSON body containing an email field." That is testable. "The system must handle email requests" is not.

Step 4: Use AI to surface edge cases you missed

This is where AI collaboration pays its biggest dividend. After drafting your functional requirements, explicitly ask the AI to find the holes.

Here are the functional requirements I have written for a magic link authentication feature:

[paste your FR-1 through FR-N]

You are a security-focused senior engineer doing a spec review. Identify:
1. Edge cases not covered by these requirements (race conditions, concurrent requests, token reuse attacks, account enumeration risks, etc.)
2. Security vulnerabilities the current spec would allow
3. Missing error states — scenarios where the system receives unexpected input and the spec does not define what should happen
4. Backwards compatibility issues with the existing password login flow

For each issue, suggest a specific additional requirement to address it.

Expected output: A list of 5–10 gaps. Common ones for this feature include: what happens when someone requests a magic link for an email that does not exist in the system (account enumeration risk — the spec should require returning the same response whether the email exists or not); what happens if a user clicks the link twice; whether the link is invalidated after the password is changed; whether there is a maximum number of outstanding links per user at a time. Each gap becomes a new requirement or a clarification to an existing one.

Step 5: Write error handling and edge cases section

Take the gaps identified in Step 4 and formalize them.

Write an Edge Cases and Error Handling section for the magic link auth spec. Cover each of these scenarios with a specific system behavior:

1. Email submitted for an account that does not exist
2. Magic link that has expired (TTL exceeded)
3. Magic link that has already been used
4. Magic link request submitted more than N times within a time window (rate limit)
5. Magic link request submitted for an account that is suspended or deactivated
6. User has an outstanding magic link and requests another one
7. Magic link token is malformed or tampered with
8. Concurrent requests that arrive within milliseconds of each other for the same token

For each scenario, specify: the HTTP status code, the response body shape, any side effects (e.g., logging, alerts), and whether the error is user-facing or silent.

Expected output: A structured table or numbered list that gives the implementing agent exact instructions for every failure mode. An agent with this section will not guess what to return for a used token — it will know to return 400 with {"error": "TOKEN_USED"} and log the attempt.

Step 6: Write acceptance criteria

Acceptance criteria are the spec's contract with the testing phase. They must be binary — either the system satisfies them or it does not. Ask AI to generate a first draft, then tighten any criteria that are subjective or immeasurable.

Write Acceptance Criteria for the magic link authentication feature based on these functional requirements and edge cases:

[paste FR-1 through FR-N]
[paste Edge Cases section]

Each criterion must:
- Be phrased as "Given [context], when [action], then [outcome]"
- Be binary — clearly pass or fail
- Reference specific HTTP status codes, response shapes, or observable behaviors

Group them into: Happy Path, Error States, Security, and Performance.

Expected output: 15–25 acceptance criteria organized into groups. A good happy path criterion looks like: "Given a valid, unused magic link token, when a POST request is made to /auth/magic-link/verify, then the response is 200 with a Set-Cookie header containing a valid JWT, and the token is marked as used in the database." A weak one would be "the login works." Reject any criterion you cannot turn into an automated test.

Step 7: Do a spec review pass with AI

Before finalizing the spec, run a structured review. Think of this as handing the spec to a senior engineer who will ask hard questions.

You are a principal engineer reviewing the following technical spec before it is handed to an AI agent for implementation. Your job is to find any ambiguity, missing information, or underspecification that would cause the agent to make an incorrect assumption.

[paste the full spec draft]

For each problem you find:
1. Quote the ambiguous or missing section
2. Explain why an agent would make an incorrect assumption from it
3. Suggest the specific clarification or addition needed

Focus especially on: database schema changes needed, API contract details (exact request/response shapes), security implementation details (how the token should be generated and stored), and integration points with existing code.

Expected output: A prioritized list of clarifications. You will typically find 3–6 issues: the spec says "generate a secure token" without specifying the algorithm or length; the spec says "send the magic link" without specifying the exact URL format; the spec mentions rate limiting without specifying the storage mechanism (in-memory, Redis, DB). Address each one by adding specificity to the relevant section.

Step 8: Add the non-functional requirements

After the review pass, fill in the non-functional requirements section. These are easy to forget and costly to retrofit.

Write a Non-Functional Requirements section for the magic link authentication spec. Cover:

- Token security: how the token should be generated (cryptographic randomness, length, format)
- Token storage: where magic link tokens are stored, what columns the table needs, indexing strategy
- Token TTL: the expiry window (suggest 15 minutes) and how expiry is enforced
- Email delivery: what happens if the email fails to send (transactional email, not fire-and-forget)
- Rate limiting: specific thresholds and the mechanism (suggest: 5 requests per email per hour, stored in the existing PostgreSQL DB using a token_requests table)
- Observability: what events must be logged and at what level (request received, token sent, token used, token expired, rate limit hit)
- Performance: acceptable latency budget for the /request endpoint including email dispatch
- Backwards compatibility: existing password login must continue to work without changes

App constraints: PostgreSQL database, no Redis available, transactional email via existing SendGrid abstraction.

Expected output: A concrete NFR section that prevents common shortcuts. The token generation requirement might read: "Tokens must be generated using crypto.randomBytes(32).toString('hex') — 64 hex characters — and stored as a SHA-256 hash in the database." This level of specificity means the agent does not substitute Math.random() or store plaintext tokens.

Step 9: Validate the spec produces high-quality agent output

The final step is to prove the spec works. Hand it to your AI coding agent and ask it to implement the first endpoint — the token request endpoint — then evaluate the output against your acceptance criteria.

Implement the magic link token request endpoint for this Express.js + PostgreSQL application according to the following technical spec.

[paste the full spec]

Implement only: POST /auth/magic-link/request
Include: the database migration for any new tables, the route handler, the service function, the repository function, the Zod validation schema, and the unit tests. Follow the project conventions in CLAUDE.md.

Read the output carefully. Check it against your acceptance criteria one by one. A well-written spec should produce output that passes 80–90% of criteria on the first pass without additional prompting. If you see the agent guessing — using Math.random() for token generation, omitting the rate limiting logic, not hashing the stored token — those are spec gaps. Go back and add the missing specificity, then re-run. Each iteration tightens the loop between spec quality and implementation quality.


Complete Spec Example

Below is the complete spec produced by working through the steps above. Use it as a reference for your own features.


Feature: Magic Link Authentication
App: Express.js 4 + PostgreSQL (via Prisma), existing JWT auth in httpOnly cookie
Author: [Your Name] | Date: [Date] | Status: Approved

Context

The application is a Node.js REST API (Express 4, TypeScript) backed by PostgreSQL via Prisma. Authentication currently uses password-based login at POST /auth/login, issuing a JWT stored in an httpOnly cookie (auth_token, 7-day expiry). The User model has id (UUID), email (unique), passwordHash (nullable), status (active | suspended), createdAt, updatedAt. Email delivery uses an existing EmailService abstraction (sendTransactional(to: string, subject: string, htmlBody: string): Promise<void>). This feature adds a parallel authentication path; the password login flow must not be changed.

Problem Statement

Users want to authenticate without managing a password, particularly for low-frequency use cases where password recall friction causes drop-off. The system must allow a user to receive a single-use, time-limited link via email that, when clicked, produces a valid session identical to one created by password login.

Functional Requirements

  • FR-1: The system must expose POST /auth/magic-link/request accepting { "email": string }.
  • FR-2: The endpoint must respond with 202 Accepted and body { "message": "If that email is registered, a link has been sent." } regardless of whether the email exists in the system.
  • FR-3: When the email matches an active user, the system must generate a cryptographically secure token, store its SHA-256 hash in the magic_link_tokens table, and send the magic link email within the same request lifecycle.
  • FR-4: The magic link URL must follow the format https://{APP_BASE_URL}/auth/magic-link/verify?token={raw_token}.
  • FR-5: The system must expose POST /auth/magic-link/verify accepting { "token": string }.
  • FR-6: On a valid, unexpired, unused token, the verify endpoint must mark the token as used, create a JWT session identical to the password login flow, set the auth_token httpOnly cookie, and respond 200 OK with { "userId": string }.
  • FR-7: The system must enforce a rate limit of 5 requests per email address per 60-minute rolling window, tracked in PostgreSQL.
  • FR-8: Magic link tokens must expire 15 minutes after creation.

Non-Functional Requirements

  • NFR-1 (Token generation): Tokens are generated with crypto.randomBytes(32).toString('hex'). The raw token is sent in the email; only the SHA-256 hash (crypto.createHash('sha256').update(token).digest('hex')) is stored.
  • NFR-2 (Token storage): New table magic_link_tokens: id UUID PK, userId UUID FK → users.id, tokenHash VARCHAR(64) UNIQUE NOT NULL, expiresAt TIMESTAMPTZ NOT NULL, usedAt TIMESTAMPTZ NULL, createdAt TIMESTAMPTZ DEFAULT NOW(). Index on tokenHash. Index on (userId, createdAt) for rate limit queries.
  • NFR-3 (Email failure): If EmailService.sendTransactional throws, the token record must be deleted and the endpoint must return 503 Service Unavailable with { "error": "EMAIL_SEND_FAILED" }.
  • NFR-4 (Observability): Log at INFO level: request received (email domain only, not full address), link sent, token verified. Log at WARN level: rate limit hit, expired token used, already-used token presented. Log at ERROR level: email send failure.
  • NFR-5 (Performance): P99 latency for /request must be under 2 seconds including email dispatch. P99 for /verify must be under 200ms.

Edge Cases and Error Handling

Scenario                                | HTTP Status             | Response Body                      | Side Effect
Email not found                         | 202                     | { "message": "If that email..." }  | None — prevents enumeration
Account suspended                       | 202                     | { "message": "If that email..." }  | None — silent
Rate limit exceeded                     | 429                     | { "error": "RATE_LIMIT_EXCEEDED" } | Log WARN
Token expired                           | 400                     | { "error": "TOKEN_EXPIRED" }       | Log WARN
Token already used                      | 400                     | { "error": "TOKEN_USED" }          | Log WARN
Token malformed/not found               | 400                     | { "error": "TOKEN_INVALID" }       | Log WARN
Email send failure                      | 503                     | { "error": "EMAIL_SEND_FAILED" }   | Delete token, log ERROR
Concurrent verify requests (same token) | First: 200, second: 400 | { "error": "TOKEN_USED" } (second) | DB unique constraint + transaction isolation

Acceptance Criteria

Happy Path
- AC-1: Given a POST to /auth/magic-link/request with a registered active user's email, the response is 202 and one email is dispatched containing a URL matching */auth/magic-link/verify?token=*.
- AC-2: Given the token from AC-1 sent to POST /auth/magic-link/verify within 15 minutes, the response is 200, the Set-Cookie header contains auth_token, the JWT payload contains the correct userId, and the token's usedAt is set in the database.
- AC-3: Given a successful magic link login, the resulting session is accepted by all existing authenticated endpoints.

Error States
- AC-4: A token presented a second time returns 400 with { "error": "TOKEN_USED" }.
- AC-5: A token presented after 15 minutes returns 400 with { "error": "TOKEN_EXPIRED" }.
- AC-6: A request with an email that does not exist in the database returns 202 and no email is dispatched.
- AC-7: A sixth request within 60 minutes for the same email returns 429 with { "error": "RATE_LIMIT_EXCEEDED" }.

Security
- AC-8: The magic_link_tokens table contains the SHA-256 hash, not the raw token.
- AC-9: The response to /request is identical (status code, body, timing within 50ms) whether the email exists or not.

Backwards Compatibility
- AC-10: Existing POST /auth/login with valid credentials continues to return 200 and set auth_token without modification.


Key Takeaways

  • A production-quality spec for an AI agent is a contract with six sections: context, problem statement, functional requirements, non-functional requirements, edge cases/error handling, and acceptance criteria. Each section eliminates a class of agent guessing.
  • Use AI as a collaborator during spec-writing, not only during implementation. Prompts for surfacing edge cases, drafting requirements, and running a structured review pass each produce distinct, high-value output.
  • The edge case surface prompt — asking AI to find security vulnerabilities, missing error states, and race conditions — consistently returns 5–10 issues that would otherwise surface as bugs. Run it on every non-trivial spec.
  • Acceptance criteria are the spec's quality gate. If you cannot write a binary pass/fail criterion for a requirement, the requirement is underspecified. Sharpen it before handing the spec to an agent.
  • Validate the spec by implementing one endpoint first. If the agent output passes 80–90% of acceptance criteria without additional prompting, the spec is ready. Each gap in the output is a gap in the spec — add specificity and re-run.