
Dynamic Context Assembly

The difference between an AI suggestion that's immediately useful and one that misses the mark entirely almost always comes down to whether the model had the right code in its context window.

What RAG Actually Means When Your Codebase Is the Document

Most engineers encounter "RAG" (Retrieval-Augmented Generation) in the context of document search — a chatbot that looks up FAQs or summarizes PDF reports. Code RAG is structurally the same but operationally very different, because your retrieval unit is not a paragraph: it is a function, an interface, a module, or a usage pattern.

When you work in a large repository, the AI model physically cannot see all of it at once. A typical production codebase is millions of tokens, and even the most capable models top out at 200k–1M tokens per request. The solution is to retrieve only the relevant slices and inject them into context before the model generates. This retrieval step is what separates a useful AI coding assistant from one that hallucinates method signatures or ignores your established patterns.

The core insight is this: the quality of retrieval determines the quality of generation. If the model sees the right interfaces, the right examples, and the right constraints, it produces code that fits. If it sees nothing or the wrong things, it invents plausible-but-wrong code that you have to fix by hand.

Code retrieval differs from document retrieval in one critical way: code has structure. A function depends on types defined elsewhere. A module imports utilities from a shared package. A React component uses a design system's primitives. Retrieval strategies that ignore this structural dependency graph will miss crucial context even when they surface the right files.

Learning tip: Think of context assembly as writing a focused design doc for the AI before asking it to write code. The better your prep work, the less correction work you do after.

How AI Coding Tools Index and Search Your Repository

Tools like Cursor, Codeium, GitHub Copilot (workspace mode), and Zed's AI features all maintain some form of a local index over your codebase. Understanding how that index works helps you understand why these tools sometimes miss things — and how to compensate.

The most common approach is vector embeddings. Each chunk of code (usually a function or a logical block) is passed through an embedding model that produces a dense numeric vector capturing semantic meaning. When you type a prompt, your prompt is also embedded, and the tool retrieves the chunks whose vectors are closest to your query vector. This is why semantic search works: "function that validates user input" will surface a sanitizeFormData() function even if your query shares no literal words with the function name.
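
For intuition, here is a minimal sketch of that retrieval step in TypeScript. Everything here is illustrative — the Chunk shape, the scoring, and the top-k cutoff are assumptions, not any particular tool's internals:

```typescript
// A minimal sketch of embedding-based retrieval, assuming all chunks were
// embedded ahead of time and the query is embedded with the same model.
// Real tools differ in chunking granularity, embedding model, and ranking.
type Chunk = { file: string; text: string; vector: number[] };

// Cosine similarity: how closely two embedding vectors point in the same direction.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Return the k chunks whose vectors are closest to the query vector.
function retrieve(queryVector: number[], index: Chunk[], k: number): Chunk[] {
  return [...index]
    .sort(
      (x, y) =>
        cosineSimilarity(queryVector, y.vector) -
        cosineSimilarity(queryVector, x.vector)
    )
    .slice(0, k);
}
```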

Cursor specifically maintains a codebase index that updates as you edit. When you open a new chat with @codebase or reference @docs, Cursor retrieves chunks from that index and injects them into the system prompt before your message. The retrieval is invisible to you unless you inspect the context window. Knowing this, you can improve retrieval quality by being explicit: naming the specific file, function, or concept you want the model to focus on.

The limitation of pure vector retrieval is that it optimizes for semantic similarity but can miss structural dependencies. A file that defines a type used everywhere might rank low in similarity to your specific query, but it is critically important context. This is why the best retrieval strategies combine semantic search with explicit structural awareness.
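
To make "structural awareness" concrete, here is one way a hybrid scorer could work, building on the sketch above. The importGraph is assumed to be precomputed (e.g., by parsing import statements), and the 0.2 boost is an arbitrary illustrative constant:

```typescript
// Sketch: hybrid retrieval score = semantic similarity + structural boost.
// Assumes the Chunk type and cosineSimilarity from the previous sketch, plus
// an importGraph mapping each file to the files it imports (directly or
// transitively). All names here are illustrative, not a real tool's API.
function hybridScore(
  queryVector: number[],
  chunk: Chunk,
  currentFile: string,
  importGraph: Map<string, Set<string>>
): number {
  const semantic = cosineSimilarity(queryVector, chunk.vector);
  // Files the current file depends on define the contracts new code must
  // satisfy, so they get a flat bonus even if the query text never
  // mentions them.
  const structural = importGraph.get(currentFile)?.has(chunk.file) ? 0.2 : 0;
  return semantic + structural;
}
```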

Learning tip: In Cursor, use @filename to force-inject a specific file into context when you know it matters. Vector search alone won't always surface it.

Manual Retrieval Strategies You Control

Automated tooling does most of the retrieval work, but knowing how to do it manually makes you faster and more precise. There are three main strategies: grep-based, ctags-based, and LSP-based.

Grep-based retrieval is the bluntest instrument and often the right one. When you need to find all callers of a function, all implementations of a pattern, or all places a configuration key is used, grep -rn or rg (ripgrep) gives you an exact answer in seconds. The key skill is knowing what to grep for: function names, type names, error strings, or configuration keys — things that are textually unique.
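
When the same gather-and-paste loop comes up repeatedly, it is worth scripting. A minimal sketch, assuming ripgrep is installed and on your PATH; the MAX_RETRY_COUNT key is just a stand-in:

```typescript
import { execFileSync } from "child_process";

// Sketch: wrap ripgrep to produce a paste-ready context block, with matches
// grouped by file (--heading), line numbers, and 3 lines of context.
// Note: rg exits non-zero when there are no matches, so execFileSync throws.
function gatherContext(pattern: string, dir: string): string {
  return execFileSync(
    "rg",
    ["--context", "3", "--heading", "--line-number", pattern, dir],
    { encoding: "utf8" }
  );
}

// Example: collect every use of a (hypothetical) config key before prompting.
console.log(gatherContext("MAX_RETRY_COUNT", "src/"));
```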

ctags-based retrieval generates an index of all symbols in your project (functions, classes, types, interfaces) and lets you jump to definitions or find all references. Tools like Universal Ctags, combined with editor integrations, let you quickly pull the definition of any symbol into view. For context assembly purposes, ctags helps you answer "what does this type look like?" before feeding it to the AI.

LSP-based navigation (Language Server Protocol) is the most precise. Modern editors and language servers can answer "find all references to this symbol," "show me the type definition," and "list all implementations of this interface" with compiler-level accuracy. When you need to understand a data flow or assemble context around a specific abstraction, LSP navigation is faster and more accurate than grep.
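
The same compiler-grade accuracy is scriptable. Here is a sketch using the TypeScript language service (the machinery many editor features sit on top of); the file list and the character offset are illustrative placeholders:

```typescript
import * as fs from "fs";
import ts from "typescript";

// Sketch: compiler-accurate "find all references" via the TypeScript
// language service. File paths and the offset below are placeholders.
const files = ["src/types/user.ts", "src/routes/users.ts"];

const host: ts.LanguageServiceHost = {
  getScriptFileNames: () => files,
  getScriptVersion: () => "1", // static snapshot; bump this on edits
  getScriptSnapshot: (name) =>
    fs.existsSync(name)
      ? ts.ScriptSnapshot.fromString(fs.readFileSync(name, "utf8"))
      : undefined,
  getCurrentDirectory: () => process.cwd(),
  getCompilationSettings: () => ({}),
  getDefaultLibFileName: (opts) => ts.getDefaultLibFilePath(opts),
  readFile: ts.sys.readFile,
  fileExists: ts.sys.fileExists,
};

const service = ts.createLanguageService(host, ts.createDocumentRegistry());

// Offset 120 stands in for the position of the symbol you care about.
const refs = service.getReferencesAtPosition("src/types/user.ts", 120) ?? [];
for (const ref of refs) {
  console.log(`${ref.fileName}:${ref.textSpan.start} (${ref.textSpan.length} chars)`);
}
```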

The practical workflow is to use these tools to gather the right files and symbols, then paste the relevant excerpts directly into your prompt. This manual approach takes more time upfront but gives you precision control over exactly what context the model sees.

Learning tip: Before asking the AI to implement anything in a large codebase, spend 2 minutes using LSP "find references" and "go to definition" to collect the 3–5 most relevant code snippets. Paste them directly into your prompt. The model's first response will usually be production-ready instead of needing two rounds of correction.

Static vs. Dynamic Context: Knowing Which to Use

Static context is context that doesn't change based on your immediate task: your project's README, architecture decision records, a style guide, a list of available packages. You might put this in a CLAUDE.md or a system prompt that's always included. This is useful for setting up conventions and constraints that should always be respected.

Dynamic context is context assembled per-task based on what you're specifically doing right now. If you're implementing a new API endpoint, your dynamic context might include: the existing endpoint that's most similar, the request/response types, the authentication middleware signature, and the database layer interface. None of this would be relevant if you were instead fixing a CSS layout bug.

The mistake most engineers make is treating all context as static — either relying entirely on the tool's automatic retrieval or manually maintaining a fixed context file. The higher-leverage approach is to treat context assembly as part of your implementation workflow. Before you prompt, ask yourself: "What does the model need to see to write code that fits this codebase right now?"

A useful mental model is the concentric circles of context:
1. The immediate task (what you want the AI to do)
2. The local contract (the interfaces and types this code must satisfy)
3. The adjacent patterns (existing implementations the new code should resemble)
4. The global constraints (style guide, error handling convention, logging pattern)

Assembling dynamic context means explicitly providing circles 2 and 3. Most tools give you circle 4 via static config and circle 1 via your prompt. The middle two are where your effort has the highest return.
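
As a template, a dynamic-context prompt that covers all four circles might look like this (the bracketed parts are per-task placeholders):

```prompt
Task (circle 1): Implement [the specific change].

Contract this code must satisfy (circle 2):
[paste the relevant interfaces / type definitions]

Existing pattern to follow (circle 3):
[paste the closest adjacent implementation]

Project constraints (circle 4): [error handling convention, logging pattern, style notes]
```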

Learning tip: When a model generates code that "looks right but doesn't quite fit," the missing piece is almost always circle 2 or 3 — the local contract or the adjacent pattern. Add those to your next prompt and regenerate.

Chunking Strategies for Code Files

When injecting code into a prompt, you rarely want an entire file. Files are too large and contain sections irrelevant to your task. The right unit of retrieval depends on what the model needs to reason about.

Function-level chunking is the default for most tools and the right choice when you need the model to understand a single behavior: "here is how fetchUserProfile works, now implement fetchUserSettings to match this pattern."

Interface/type-level chunking is the right choice when you need the model to satisfy a contract: paste the TypeScript interface, the Go struct, or the Protobuf definition, and tell the model to implement against it without needing the full implementation of anything.

File-header chunking — just the imports and top-level declarations — tells the model what a module exports without consuming tokens on implementation details. This is useful for surveying a utility module before deciding which function to reference fully.

Cross-file signature chunking is the most powerful for complex tasks: collect the function signatures (not bodies) from several related files to give the model a map of the available API surface without bloating the context with implementation code.

The general rule: give the model structure and contracts, not implementations, unless the task is to understand or transform existing logic.

Learning tip: If your prompt feels too long, strip function bodies and leave only signatures, types, and comments. The model almost always needs to know "what exists and what it looks like" more than it needs to know "exactly how it was implemented."
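
That stripping is mechanical enough to script. Here is a sketch using the TypeScript compiler API (the typescript npm package); it handles only top-level functions, imports, interfaces, and type aliases — classes, exported consts, and nested declarations would need more cases:

```typescript
import * as fs from "fs";
import ts from "typescript";

// Sketch: strip function bodies from a file, keeping imports, type
// declarations, and signatures -- a paste-ready "signatures only" chunk.
function extractSignatures(filePath: string): string {
  const source = ts.createSourceFile(
    filePath,
    fs.readFileSync(filePath, "utf8"),
    ts.ScriptTarget.Latest,
    true
  );
  const printer = ts.createPrinter();
  const parts: string[] = [];
  source.forEachChild((node) => {
    if (ts.isFunctionDeclaration(node)) {
      // Re-emit the declaration with its body removed: name, parameters,
      // and return type survive; the implementation does not.
      const bodiless = ts.factory.updateFunctionDeclaration(
        node,
        node.modifiers,
        node.asteriskToken,
        node.name,
        node.typeParameters,
        node.parameters,
        node.type,
        undefined
      );
      parts.push(printer.printNode(ts.EmitHint.Unspecified, bodiless, source));
    } else if (
      ts.isImportDeclaration(node) ||
      ts.isInterfaceDeclaration(node) ||
      ts.isTypeAliasDeclaration(node)
    ) {
      parts.push(printer.printNode(ts.EmitHint.Unspecified, node, source));
    }
  });
  return parts.join("\n\n");
}

console.log(extractSignatures(process.argv[2]));
```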

Hands-On: Building a New API Endpoint with Dynamic Context

In this exercise, you will implement a new REST endpoint in an existing codebase by assembling the right context before prompting, rather than letting the tool guess.

Scenario: You are adding a PATCH /users/:id/preferences endpoint to an Express/TypeScript backend that already has user CRUD endpoints.

  1. Find the existing pattern. Use ripgrep to locate an existing endpoint handler:
    bash rg "router\.(get|post|put|patch)" src/routes/ -l
    Open the most similar file — likely src/routes/users.ts.

  2. Extract the relevant types. Use your editor's LSP ("go to definition") to find the User type and the UserPreferences type. Copy their definitions.
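
    For concreteness, the copied definitions might look something like this (hypothetical shapes; use whatever your codebase actually defines):

```typescript
// Hypothetical type definitions extracted in step 2.
interface UserPreferences {
  theme: "light" | "dark";
  emailNotifications: boolean;
  locale: string;
}

interface User {
  id: string;
  email: string;
  preferences: UserPreferences;
  updatedAt: Date;
}
```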

  3. Find the middleware signature. Grep for the authentication middleware to get its function signature:
    bash rg "export.*middleware\|export.*auth" src/middleware/ -A 3

  4. Assemble your context prompt. Open a new AI chat and paste the following, filling in the extracted code:

```prompt
I'm adding a PATCH /users/:id/preferences endpoint to an Express/TypeScript app.

Here is the existing PUT /users/:id handler (the pattern to follow):

typescript [paste the existing handler function here]

Here are the relevant types:

typescript [paste User and UserPreferences type definitions here]

Here is the auth middleware signature:

typescript [paste middleware signature here]

Implement the PATCH /users/:id/preferences endpoint following the same error handling, validation, and response shape patterns as the existing PUT handler. The handler should:
- Validate that the authenticated user can only update their own preferences
- Accept a partial UserPreferences object in the request body
- Return the updated user object on success
- Use the same error response format as the existing handler
```

  5. Review the generated code. Check that it uses the same error handling pattern, the same response shape, and that the TypeScript types are correct. If the model invented a helper function that doesn't exist, note its name.

  6. Verify invented dependencies. If the model referenced a helper like validatePartialObject(), grep for it:

```bash
rg "validatePartialObject|validatePartial" src/ -l
```

    If it doesn't exist, ask the model: "The function validatePartialObject doesn't exist in this codebase. What existing utility from the types I provided should be used instead, or should I implement it?"

  7. Iterate with a targeted follow-up prompt if the response shape is wrong:

```prompt
The response shape is not quite right. Here is the actual ApiResponse type used in this codebase:

typescript [paste ApiResponse type]

Rewrite the handler's return statements to use this type correctly.
```

  8. Run the TypeScript compiler to catch remaining type errors:

```bash
npx tsc --noEmit
```

    Paste any type errors back to the model with: "Fix these TypeScript errors in the handler you just wrote: [errors]"

Learning tip: Steps 1–3 are the real work. The AI generation in step 4 takes seconds. Invest your time in retrieval, and the generation will be accurate on the first try most of the time.

Hands-On: Exploring an Unfamiliar Large Repository

Use this workflow when you've been dropped into a large codebase and need to understand a specific data flow before implementing anything.

Scenario: You are new to a codebase and need to understand how an order is processed from API request to database write, so you can add a new field to the order.

  1. Get a high-level map of the repository structure:

```bash
find . -name "*.ts" -path "*/routes/*" | head -20
find . -name "*.ts" -path "*/services/*" | head -20
```


  2. Find the entry point for orders:
    bash rg "order" src/routes/ -l

  3. Extract the top-level structure of the orders route file — just imports and exported functions, not bodies — and feed it to the AI:

```prompt
I'm exploring a new codebase. Here is the top of the orders route file (imports and function signatures only):

typescript [paste imports + function signatures without bodies]

Based on these signatures and the imported modules, explain the likely data flow from an incoming POST /orders request through to the database. Name the service and repository functions that are likely called, based on the names I've shown you.
```

  4. Follow the chain. The model will name likely functions. Use LSP "go to definition" or grep to find them and verify.

  5. Assemble the full flow as a prompt context. Once you've traced the path, collect the signatures from each layer:

```prompt
I need to add an `expedited: boolean` field to orders. Here is the data flow I've traced:

Route handler signature:
typescript [paste]

Service method signature:
typescript [paste]

Repository method signature:
typescript [paste]

Order database schema (relevant columns only):
sql [paste]

List every file and every function I need to modify to add the expedited field end-to-end, in the order I should make the changes. Be specific — use the exact function names I've shown you.
```

  6. Execute the plan. Use the model's output as a checklist. For each item, open a focused prompt that includes only the function being modified and its immediate dependencies.

Learning tip: The "list what to change end-to-end" prompt in step 5 is one of the highest-leverage prompts in a large codebase. It turns a vague task into a concrete, ordered checklist you can execute file by file.

Key Takeaways

  • RAG for code is about retrieving the right structural slices — interfaces, type definitions, adjacent patterns — not entire files, and injecting them explicitly into your prompt before generating.
  • AI coding tools use vector embeddings for semantic retrieval, but they miss structural dependencies. Supplement automatic retrieval with manual grep, ctags, and LSP navigation to collect the exact context a task requires.
  • The quality of your retrieval directly determines the quality of generated code. Spending 2–3 minutes assembling the right context saves more time than it costs.
  • Use static context (style guides, conventions, CLAUDE.md) for project-wide constraints, and dynamic context (interfaces, adjacent implementations) for task-specific generation.
  • When injecting code into a prompt, prefer signatures and type definitions over full implementations — the model needs to know what exists, not how it works internally, to generate code that fits.