Providing Codebase Context to AI Agents Effectively

The quality of your AI assistant's output is bounded by how much it understands about your codebase — and you control that boundary.

How AI Coding Tools Index Your Repository

Before writing a single prompt, it helps to understand what the AI tool actually "sees" when you open a project. Claude Code and Cursor do not read your entire repository on every request — that would be prohibitively slow and expensive. Instead, they build lightweight representations of your codebase.

Claude Code works from something like a repo map: a compressed picture of your project's file tree, function signatures, class definitions, and exported symbols. Rather than maintaining a persistent index, it assembles this picture on demand by listing directories, searching for symbols, and reading files as a task requires, so the model gets a navigational skeleton of your project without every line of source code being loaded into its context window. When you ask Claude Code to "add a method to the UserService," it can locate that class by searching the repository and then pull in the relevant file content for the actual task.
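To make the idea of a compressed outline concrete, here is a minimal sketch that reduces one source file to the names it declares. It is only an illustration under simplifying assumptions: the `outlineFile` function, the `FileOutline` type, and the regular expression are hypothetical, and this is not how Claude Code or Cursor work internally.

```typescript
// Hypothetical sketch: compress a source file into the names it declares.
// A regex is a deliberate simplification; real tools use proper parsers or search.
interface FileOutline {
  path: string;
  symbols: string[];
}

const DECLARATION =
  /^\s*(?:export\s+)?(?:default\s+)?(?:async\s+)?(?:function|class|interface|type|const)\s+([A-Za-z_$][\w$]*)/;

function outlineFile(path: string, source: string): FileOutline {
  const symbols: string[] = [];
  for (const line of source.split("\n")) {
    const match = DECLARATION.exec(line);
    if (match && match[1]) symbols.push(match[1]);
  }
  return { path, symbols };
}

const sample = [
  "import { db } from './db';",
  "export class UserService {",
  "  async findById(id: string) { return db.user.find(id); }",
  "}",
  "export function parseConfig(raw: string) { return JSON.parse(raw); }",
].join("\n");

const outline = outlineFile("src/services/user-service.ts", sample);
console.log(outline.symbols.join(", ")); // prints: UserService, parseConfig
```

The outline keeps two names and drops every method body, which is the trade the text describes: enough to navigate, not enough to edit. Note that the method `findById` is missed entirely, a small reminder that any compressed view has blind spots.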

Cursor takes a related approach with a persistent codebase index, loosely a symbol graph: a searchable store of your code (Cursor's documentation describes it as embeddings computed over chunks of your files) that gets built incrementally as you work. With codebase indexing enabled, which is the default, Cursor can answer questions like "where is parseConfig defined?" without you pointing it there. The @Codebase command in Cursor tells the model to run a semantic search against this index before answering.

The important implication: auto-indexing is not omniscient. Both tools apply heuristics to decide what is worth indexing. Files listed in .gitignore, binary assets, large generated files, and deeply nested node_modules are typically excluded. This is usually the right behavior, but it means that some files you care about — a large auto-generated schema file, a legacy config outside the normal tree — may be invisible to the AI by default. You need to know how to fill those gaps manually.

Learning tip: Run /context inside a Claude Code session (or check the codebase indexing status in Cursor's settings) to see what the tool currently has loaded or indexed. This makes the invisible visible and helps you debug "why didn't it know about X?"

Manual File Inclusion: @file and #file Syntax

When the repo map is not enough — or when you know exactly which files are relevant to your task — you can inject file content directly into the context window using explicit file references.

In Claude Code, prefix a file path with @ in your prompt:

@src/services/auth.ts @src/models/user.ts
The login method in AuthService is throwing a 401 even when credentials are valid.
Walk me through the token validation logic and tell me where it could fail.

Claude will load the full content of both files into context before processing your question. This is the most reliable way to ensure the model is looking at the right code — not a cached or summarized version of it.
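If you generate prompts from scripts or templates, it can help to check which files a prompt actually references before sending it. The following is a minimal sketch, assuming a mention is an @ at the start of a word followed by something that looks like a path; `extractFileMentions` is a hypothetical helper, not part of Claude Code.

```typescript
// Hypothetical helper: pull "@path" mentions out of a prompt.
// Assumption: a mention starts a word and contains a "." or "/".
function extractFileMentions(text: string): string[] {
  const pattern = /(?:^|\s)@([^\s@]+)/g;
  const mentions: string[] = [];
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(text)) !== null) {
    // Drop trailing sentence punctuation such as "," or ".".
    const candidate = (match[1] || "").replace(/[.,;:!?)]+$/, "");
    const looksLikePath = /[./]/.test(candidate);
    if (looksLikePath && mentions.indexOf(candidate) === -1) {
      mentions.push(candidate);
    }
  }
  return mentions;
}

const examplePrompt = [
  "@src/services/auth.ts @src/models/user.ts",
  "The login method in AuthService is throwing a 401.",
  "Ask @alice, or mail bob@example.com.",
].join("\n");

console.log(extractFileMentions(examplePrompt));
// prints: [ 'src/services/auth.ts', 'src/models/user.ts' ]
```

The handle `@alice` and the email address are skipped because neither is a word-initial @ followed by a path-like token.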

In Cursor, type @ in the chat panel and select files from the autocomplete picker. You can also drag files directly from the explorer into the chat input.

Strategies for effective file inclusion:

Include the file where the bug lives and the files it imports from. The AI cannot follow an import it has not seen.
For tasks that touch a data flow (e.g., "trace this API call from handler to database"), include the handler, the service layer, and the repository/model in one prompt.
Avoid including files that are clearly unrelated. Each file you add consumes context window space that could hold the model's reasoning or generated output. More is not always better.
For configuration-heavy tasks (e.g., debugging a webpack build), include the specific config file rather than describing it in prose. The AI reads code more reliably than descriptions of code.
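The first strategy above (include the file where the bug lives plus the files it imports from) can be partly automated. Here is a minimal sketch that lists the relative imports of a file so you know which neighbors to add; `relativeImports` is a hypothetical helper, and its regex ignores dynamic `import()` and `require()` calls.

```typescript
// Hypothetical helper: list the relative module specifiers a file imports.
// Regex-based and intentionally simple; it can be fooled by strings and comments.
function relativeImports(source: string): string[] {
  const pattern = /(?:\bfrom|\bimport)\s+["']([^"']+)["']/g;
  const found: string[] = [];
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(source)) !== null) {
    const specifier = match[1] || "";
    const isRelative = specifier.charAt(0) === ".";
    if (isRelative && found.indexOf(specifier) === -1) found.push(specifier);
  }
  return found;
}

const controllerSource = [
  'import express from "express";',
  'import { UserRepo } from "./repositories/user-repo";',
  'import type { User } from "../models/user";',
  'import "./polyfills";',
  'export { helper } from "./helper";',
].join("\n");

console.log(relativeImports(controllerSource));
// prints: [ './repositories/user-repo', '../models/user', './polyfills', './helper' ]
```

Package imports such as `express` are left out on purpose: the model usually knows public libraries already, while your own modules are the part it cannot guess.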

Learning tip: When a task spans more than four or five files, consider breaking it into smaller focused prompts, each with its own tight set of file inclusions. Large unfocused contexts produce large unfocused responses.
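One way to act on this tip is to estimate how much context each file costs and group files into separate prompts. The sketch below assumes roughly four characters per token, which is only a rule of thumb, and packs files greedily in the order given; `planPrompts` and `ContextFile` are hypothetical names.

```typescript
interface ContextFile {
  path: string;
  chars: number; // file size in characters
}

// Assumption: about 4 characters per token. Real tokenizers vary by content.
const CHARS_PER_TOKEN = 4;

function estimateTokens(chars: number): number {
  return Math.ceil(chars / CHARS_PER_TOKEN);
}

// Greedily pack files, in the given order, into groups under the token budget.
// A single file larger than the budget still gets a group of its own.
function planPrompts(files: ContextFile[], maxTokens: number): string[][] {
  const groups: string[][] = [];
  let current: string[] = [];
  let used = 0;
  for (const file of files) {
    const cost = estimateTokens(file.chars);
    if (current.length > 0 && used + cost > maxTokens) {
      groups.push(current);
      current = [];
      used = 0;
    }
    current.push(file.path);
    used += cost;
  }
  if (current.length > 0) groups.push(current);
  return groups;
}

const plan = planPrompts(
  [
    { path: "src/routes/users.ts", chars: 4000 },
    { path: "src/controllers/user.controller.ts", chars: 12000 },
    { path: "src/models/user.model.ts", chars: 2000 },
    { path: "src/middleware/auth.ts", chars: 16000 },
  ],
  5000,
);
console.log(plan.length); // prints: 2
```

Greedy packing by size is the simplest possible policy. In practice you would order the files by relevance first, so that files that belong to the same question end up in the same prompt.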

Writing a High-Quality CLAUDE.md

CLAUDE.md is a Markdown file you place at the root of your repository (and optionally in subdirectories). Claude Code reads it automatically at the start of every session, before you type anything. Think of it as a briefing document you hand to a new contractor before they write their first line of code.

A well-written CLAUDE.md eliminates the repetitive "before you start, know that we use X pattern" preamble that engineers otherwise paste into every prompt. It also reduces hallucinations that stem from the AI's false assumptions about your tech stack, naming conventions, or architectural boundaries.

What to include

Architecture overview (2–4 sentences). Describe the top-level structure: is this a monorepo? A microservices system? A Next.js app with a separate API? State the main entry points.

Key directories and their purpose. A short table or bullet list mapping folder names to responsibilities is worth more than paragraphs of description.

Tech stack and versions. List the frameworks, runtime versions, and major libraries. "We use Next.js 14 with the App Router, not the Pages Router" saves the AI from generating outdated patterns.

Coding conventions. What the linter enforces, naming patterns you follow by team convention (not just the linter), file naming rules, how you structure exports.

Key files to know. Point out files that are load-bearing but might not be obvious: the global error boundary, the API client singleton, the theme provider, the seed script.

Gotchas and known constraints. This is the highest-value section. Examples: "Do not import from @internal packages in feature modules," "All DB queries must go through the repository layer, never call Prisma directly from a controller," "The legacy/ directory is not type-checked — do not use it as a reference."

What not to include: Avoid dumping your entire README.md into CLAUDE.md. Long marketing copy, setup instructions for new developers, and historical context about why a technology was chosen all add noise without helping the AI write better code. Keep CLAUDE.md focused on what the AI needs to be a competent contributor, not what a human developer needs to understand the business.

CLAUDE.md Template


## Architecture Overview
[2–4 sentences describing the overall system structure, main entry points,
and how major pieces fit together.]

## Repository Structure
| Directory        | Purpose                                      |
|------------------|----------------------------------------------|
| `src/app/`       | Next.js App Router pages and layouts         |
| `src/components/`| Shared UI components (co-located with tests) |
| `src/services/`  | Business logic, external API calls           |
| `src/lib/`       | Pure utility functions, no side effects      |
| `prisma/`        | DB schema, migrations, seed scripts          |

## Tech Stack
- **Runtime:** Node.js 20, TypeScript 5.3 (strict mode)
- **Frontend:** Next.js 14 (App Router), Tailwind CSS v3
- **Backend:** tRPC v11, Prisma ORM, PostgreSQL 15
- **Testing:** Vitest (unit), Playwright (e2e)

## Coding Conventions
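- All components are function components with named exports. The only default exports are the ones Next.js requires (`page.tsx`, `layout.tsx`, and similar route files).
- Files are named in kebab-case. Component files use the kebab-case form of their export name (`user-card.tsx` exports `UserCard`).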
- Async server actions live in `src/app/actions/` — never inline in components.
- Use `zod` for all runtime validation; do not write manual type guards.
- CSS is utility-first (Tailwind). Avoid writing custom CSS unless strictly necessary.

## Key Files
- `src/lib/api-client.ts` — singleton Axios instance with auth interceptors
- `src/lib/auth.ts` — NextAuth configuration, session shape
- `src/server/trpc.ts` — tRPC router setup, middleware, context type
- `prisma/schema.prisma` — source of truth for all data models

## Gotchas
- Do NOT call Prisma directly in tRPC procedures. Always go through `src/server/repositories/`.
- The `src/legacy/` directory is JavaScript (no TS). Do not use it as a style reference.
- Environment variables are validated at startup in `src/lib/env.ts`. Add new vars there first.
- Playwright tests run against a seeded test database. Seed data is in `prisma/seed-test.ts`.

Learning tip: Commit your CLAUDE.md to version control and treat it like code. When your team changes a convention, update CLAUDE.md in the same PR. Stale context is worse than no context — it actively misleads the AI.
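Treating CLAUDE.md like code can include linting it. The following is a minimal sketch of two checks you might run in CI: whether the expected sections exist, and whether the paths the file mentions still exist in the repository. Both functions are hypothetical, and the list of existing paths is passed in as plain data so the example stays self-contained.

```typescript
// Hypothetical check: which required "## " sections are missing?
function missingSections(markdown: string, required: string[]): string[] {
  const headings: string[] = [];
  for (const line of markdown.split("\n")) {
    const match = /^##\s+(.+?)\s*$/.exec(line);
    if (match && match[1]) headings.push(match[1].toLowerCase());
  }
  return required.filter((section) => headings.indexOf(section.toLowerCase()) === -1);
}

// Hypothetical check: which backticked paths no longer exist in the repo?
// A reference ending in "/" counts as present if any existing path is inside it.
function stalePaths(markdown: string, existingPaths: string[]): string[] {
  const pattern = /`([^`\s]+\/[^`\s]*)`/g;
  const stale: string[] = [];
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(markdown)) !== null) {
    const ref = match[1] || "";
    const isDir = ref.charAt(ref.length - 1) === "/";
    const present = existingPaths.some((p) => (isDir ? p.indexOf(ref) === 0 : p === ref));
    if (!present && stale.indexOf(ref) === -1) stale.push(ref);
  }
  return stale;
}

const claudeMd = [
  "## Architecture Overview",
  "A Next.js app with a tRPC API.",
  "## Key Files",
  "- `src/lib/api-client.ts`: singleton HTTP client",
  "- `src/lib/env.ts`: environment validation",
  "## Gotchas",
  "- Always go through `src/server/repositories/`.",
].join("\n");

const repoFiles = ["src/lib/api-client.ts", "src/server/repositories/user-repo.ts"];

console.log(missingSections(claudeMd, ["Architecture Overview", "Tech Stack", "Key Files", "Gotchas"]));
// prints: [ 'Tech Stack' ]
console.log(stalePaths(claudeMd, repoFiles));
// prints: [ 'src/lib/env.ts' ]
```

A check like this will not tell you whether a convention is still true, but it does catch the most common form of staleness: a file that was renamed or deleted while CLAUDE.md kept pointing at it.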

Controlling What the AI Ignores: .cursorignore and .gitignore Interaction

Both Claude Code and Cursor respect .gitignore by default when building their repo maps and indexes. This means anything you have already excluded from version control — build artifacts, dist/, coverage/, node_modules/ — is also excluded from AI context automatically.

Cursor's .cursorignore uses the same pattern syntax as .gitignore but applies only to Cursor. You would use it to exclude files that are committed to the repo but should not pollute the AI's context:

# historical code, not a style reference
src/legacy/
# auto-generated GraphQL types: don't read, don't edit
**/*.generated.ts
# large auto-generated docs, not useful for coding tasks
docs/api/
# large test fixture JSON files
fixtures/
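To reason about what patterns like these exclude, the sketch below implements a deliberately reduced subset of gitignore-style matching: directory patterns ending in `/` and patterns of the form `**/*.suffix`. It is an assumption-laden illustration, not Cursor's matcher, and it does not cover negation, character classes, or other wildcards. It also treats `#` as a comment only at the start of a line, which is how gitignore syntax defines comments.

```typescript
// Keep non-empty lines that are not comments.
function parseIgnore(text: string): string[] {
  return text
    .split("\n")
    .map((line) => line.trim())
    .filter((line) => line.length > 0 && line.charAt(0) !== "#");
}

function hasSuffix(value: string, suffix: string): boolean {
  return value.length >= suffix.length && value.slice(value.length - suffix.length) === suffix;
}

// Simplified matcher covering only two pattern shapes (see lead-in).
function isIgnored(path: string, patterns: string[]): boolean {
  return patterns.some((pattern) => {
    if (hasSuffix(pattern, "/")) {
      // "a/b/" is anchored at the root; "b/" matches a directory at any depth.
      const anchored = pattern.slice(0, -1).indexOf("/") !== -1;
      const atRoot = path.indexOf(pattern) === 0;
      return anchored ? atRoot : atRoot || path.indexOf("/" + pattern) !== -1;
    }
    if (pattern.indexOf("**/*") === 0) {
      return hasSuffix(path, pattern.slice(4));
    }
    return path === pattern;
  });
}

const ignoreText = [
  "# historical code, not a style reference",
  "src/legacy/",
  "**/*.generated.ts",
  "docs/api/",
  "fixtures/",
].join("\n");

const ignorePatterns = parseIgnore(ignoreText);
console.log(isIgnored("src/legacy/old-auth.js", ignorePatterns)); // prints: true
console.log(isIgnored("tests/fixtures/users.json", ignorePatterns)); // prints: true
console.log(isIgnored("src/services/auth.ts", ignorePatterns)); // prints: false
```

Running candidate paths through a small matcher like this is a quick way to sanity-check an ignore file, but the tool's own index status remains the source of truth.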

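Claude Code does not have a dedicated ignore file in the style of .cursorignore. You can steer it by being explicit in CLAUDE.md: tell it which directories to treat as read-only references versus active codebases, and which files it should never modify. For files it must not read at all, Claude Code's settings also support permission rules that deny access to specific paths; check the current documentation for the exact syntax.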

What to actively exclude:

Large generated files (GraphQL schemas, OpenAPI clients, i18n bundles) — they consume context space and should not be manually edited anyway.
Test fixtures and snapshot files — the AI does not need to read these to help you write code.
Documentation directories that are purely rendered output.
Compiled or minified assets.

The goal is a clean signal: the AI should be seeing the files a skilled human contributor would actively work in, nothing more.

Learning tip: After setting up your ignore rules, test them by asking the AI to summarize a specific file inside the legacy/ directory. If it reports that it cannot find or read the file, your exclusions are working. Treat a confident summary with suspicion: either the file is still visible, or the model is guessing from the file name.

Hands-On: Setting Up Codebase Context for a New Feature

This exercise walks through the full workflow of giving Claude Code effective context before starting a real feature task.

Scenario: You are adding a user preferences endpoint to an existing Express/TypeScript API.

Step 1: Audit your repo map.
Open your project in Claude Code and ask it what it already knows.

List the top-level directories in this project and describe the purpose of each one based on what you can see. Flag any directories where you are uncertain about the purpose.

Expected output: A structured list of directories with descriptions. Note the ones it marks as uncertain — those are candidates for explicit documentation in CLAUDE.md.

Step 2: Identify the relevant files for your task.
Before writing any code, ask the AI to locate the files it will need.

I need to add a GET /users/:id/preferences endpoint. Based on the repo structure, which existing files should I look at before writing this feature? List them with a one-line explanation of why each is relevant.

Expected output: The AI should identify the router file, an existing user-related controller, the User model/schema, and any middleware files. If it misses obvious files, note the gap and add context manually.

Step 3: Load the relevant files explicitly.
Use the file paths identified in step 2 to load them into context.

@src/routes/users.ts @src/controllers/user.controller.ts @src/models/user.model.ts
Review these files and describe the existing patterns: how routes are registered, how controllers are structured, and what the User model looks like. I'll use this as a baseline for the new preferences endpoint.

Expected output: A concise summary of the patterns — routing style, controller method signatures, model structure. This confirms the AI has the right context before it writes any code.

Step 4: Create or update your CLAUDE.md.
Based on what the AI found (and gaps you noticed), create a CLAUDE.md at the project root. Include the architecture overview, the key directories, the stack, and any gotchas you already know.

After creating the file, verify the AI reads it:

Read CLAUDE.md and summarize the three most important constraints I've listed that you should follow when writing new code in this project.

Expected output: The AI restates your top constraints in its own words. If it misses something important, reword that section in CLAUDE.md to be more explicit.

Step 5: Implement with context in place.
Now write the feature with confidence that the AI has the right foundation.

@src/routes/users.ts @src/controllers/user.controller.ts @src/models/user.model.ts
Following the patterns described in CLAUDE.md and the existing code in these files, implement:
1. A new `GET /users/:id/preferences` route in the users router
2. A `getUserPreferences` method in the user controller
3. A `preferences` field added to the User model (JSON type, nullable, default null)

Follow the same error handling pattern used in the existing controller methods.

Expected output: Complete, working code diff across all three files, consistent with your existing patterns.
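For orientation, the controller logic that Step 5 asks for might look roughly like the sketch below. It is framework-free and synchronous so it can run on its own; a real Express controller would be async and would use `req` and `res`. Every name except `getUserPreferences` and the `preferences` field is hypothetical, including the error-handling shape, which in your project should come from the existing controller methods.

```typescript
// Hypothetical types; the real ones live in your User model.
type Preferences = { [key: string]: unknown };

interface UserRecord {
  id: string;
  preferences: Preferences | null; // nullable, default null
}

interface UserRepository {
  findById(id: string): UserRecord | undefined;
}

interface HandlerResult {
  status: number;
  body: unknown;
}

// Core logic for GET /users/:id/preferences, separated from the web framework.
function getUserPreferences(repo: UserRepository, id: string): HandlerResult {
  try {
    const user = repo.findById(id);
    if (!user) {
      return { status: 404, body: { error: "User not found" } };
    }
    return { status: 200, body: { preferences: user.preferences } };
  } catch (err) {
    // Assumed pattern: hide internal details from the client.
    return { status: 500, body: { error: "Internal server error" } };
  }
}

// In-memory stand-in for the data layer, for demonstration only.
const demoUsers: UserRecord[] = [
  { id: "u1", preferences: { theme: "dark" } },
  { id: "u2", preferences: null },
];
const demoRepo: UserRepository = {
  findById: (id) => demoUsers.filter((u) => u.id === id)[0],
};

console.log(getUserPreferences(demoRepo, "u1").status); // prints: 200
console.log(getUserPreferences(demoRepo, "missing").status); // prints: 404
```

Comparing the AI's diff against a shape like this makes it easier to spot when it has invented a pattern (for example, a different error format) instead of following the one already in your controller.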

Step 6: Validate the output against your conventions.
Ask the AI to self-check its own output.

Review the code you just wrote against the conventions in CLAUDE.md. Are there any violations or inconsistencies with the existing codebase patterns?

Expected output: Either confirmation that the code is consistent, or a list of specific deviations with corrections. This closes the loop and catches drift before it enters your codebase.
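The AI's self-review is useful but not authoritative, so it is worth backing the most mechanical conventions with a deterministic check. Below is a minimal sketch that tests two rules of the kind a CLAUDE.md might state (lowercase kebab-case file names, no default exports). The `checkConventions` function and the rule names are hypothetical, and in a real project a linter would be the better home for these rules.

```typescript
interface Violation {
  rule: string;
  detail: string;
}

// Hypothetical checker for two mechanical conventions.
function checkConventions(path: string, source: string): Violation[] {
  const violations: Violation[] = [];

  // File name without directory and without the .ts/.tsx extension.
  const fileName = path.slice(path.lastIndexOf("/") + 1);
  const baseName = fileName.replace(/\.tsx?$/, "");

  // Lowercase segments joined by "-" or "." (so "user.controller" passes).
  if (!/^[a-z0-9]+(?:[-.][a-z0-9]+)*$/.test(baseName)) {
    violations.push({ rule: "kebab-case-filename", detail: fileName });
  }

  if (/^\s*export\s+default\b/m.test(source)) {
    violations.push({ rule: "no-default-export", detail: path });
  }

  return violations;
}

const bad = checkConventions(
  "src/services/UserService.ts",
  "export default class UserService {}",
);
console.log(bad.map((v) => v.rule).join(", "));
// prints: kebab-case-filename, no-default-export

const good = checkConventions("src/lib/api-client.ts", "export const client = {};");
console.log(good.length); // prints: 0
```

Rules that need judgment, such as "all database access goes through the repository layer", are harder to check this way and are where the AI's own review in Step 6 adds the most value.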

Key Takeaways

AI coding tools navigate your codebase through compressed approximations (outlines, indexes, and on-demand search) rather than by reading everything. These approximations can miss files, especially those outside the normal project structure.
Use @ file references (typed paths in Claude Code, the @ file picker in Cursor) to load specific files directly into context for tasks where precision matters.
CLAUDE.md is the highest-leverage context investment you can make: write it once, benefit on every session, and treat it as living documentation committed alongside your code.
.gitignore exclusions propagate to AI indexing by default; use .cursorignore in Cursor to additionally exclude committed files that create noise (generated code, large fixtures, legacy directories).
More context is not always better — irrelevant files consume space and dilute the model's focus. Curate your context the same way you would curate what you put in a code review.