Drift Is Contextual Tech Debt
Vibe-coded systems often drift because the context guiding future changes falls out of sync with the code.
The early stages of vibe coding can feel unnaturally smooth.
You describe the app. The agent scaffolds it. You ask for a feature. It appears. You ask for polish. The UI improves. You ask for a database, auth, a dashboard, billing, exports, admin tools, notifications, and suddenly you have a real thing moving on screen.
Then the shape starts to wobble.
The next feature takes longer to steer. The agent adds a second version of a pattern that already exists. It rewrites a file that only needed a small change. It fixes one bug by creating another. It follows instructions from an old note and ignores the convention the codebase has since grown into. The app still runs, but each prompt feels less like directing a collaborator and more like negotiating with accumulated confusion.
That complaint shows up often in AI coding communities. People describe AI-generated codebases that slowly turn into chaos, projects where complexity grows too fast to maintain, and workflows where the only way to stay sane is to keep refactoring, documenting, and tightening the brief. The pattern is not hard to recognize: output was cheap, so the system accumulated more of it than its context could support.
I think the useful name for this is drift.
More specifically: drift is contextual tech debt.
What Drift Means
Drift is the gap between the context being used to change a system and the system that actually exists.
That context might be a prompt. It might be a README. It might be a planning document, an architecture note, a test, an example component, a previous chat summary, a file name, a directory structure, or a memory the tool is carrying from earlier work. Agents do not work from code alone. They work from a map of the project.
Drift happens when the map falls out of sync with the terrain.
Sometimes that looks like stale documentation. Sometimes it looks like duplicate instructions. Sometimes it is a generated file that still implies an old architecture. Sometimes it is a test that encodes behavior nobody wants anymore. Sometimes it is a “temporary” workaround that became the pattern because the agent saw it three more times.
The visible symptom might be a bug. The underlying cause is broader. The system is being changed from a distorted context.
That is why vibe-coded systems are especially vulnerable. The workflow rewards local momentum. Ask for a change, get a plausible diff, run the app, patch the next thing, keep moving. Each step can be reasonable in isolation. The debt accumulates in the relationships between the steps.
Eventually the project has too many half-current truths.
Contextual Tech Debt
Traditional tech debt is the future cost of a present shortcut in code, architecture, tests, or operations.
Contextual tech debt is the future cost of a present shortcut in explanation.
It is what happens when the code changes but the surrounding story does not. The plan says one thing. The files imply another. The old prompt says to use a helper that has since been replaced. The README describes the happy path from three releases ago. A “source of truth” file was copied, edited, and left beside the original. A model sees all of it and has to guess what matters.
Humans do this too. A developer joining a messy project can lose days to obsolete docs and contradictory conventions. The difference with agents is speed. They can absorb bad context quickly and act on it confidently.
This makes contextual tech debt feel like model failure.
Sometimes it is. Models make mistakes. But if the repo is full of stale instructions, abandoned experiments, duplicated patterns, and historical notes that still look current, the agent is being asked to reason inside a noisy operating environment. It may not be getting worse. The project may be getting harder to read.
The practical question is not “How do I write the perfect prompt?”
The better question is: what context is the system allowed to believe?
Memory Can Drift Too
This is one reason I usually want Codex-style tools to start from the repo instead of hidden memory.
Memory can be useful. If a stable preference applies across nearly every session, or if a user has a recurring workflow need, remembering it can reduce friction. But memory is also context, and context can drift.
The harder problem is that memory drift can be difficult to inspect. When a stale README is wrong, you can open the file, see the bad claim, and fix it. When an old architecture note is misleading, you can update it in a commit. Hidden memory is different. It may change the agent’s starting point before the repo has had a chance to speak for itself.
That makes failures harder to diagnose. Did the agent follow the current code? Did it infer from an old file? Did the prompt imply the wrong thing? Or did it carry forward some remembered preference that no longer applies?
This is why “memory off by default” is not a purity rule. It is a drift-control rule. Turn memory on when the need is clear enough to justify the hidden state. Otherwise, let the repo be the memory, because repo context is visible, reviewable, and deletable.
Structure Beats More Context
The common failure mode is trying to fix drift by adding more context.
The agent missed the architecture, so you add a longer instruction. It repeated a mistake, so you add a warning. It forgot a requirement, so you add another note. After a while the project has a stack of corrective prose, and much of it was written in reaction to one incident.
More context can help, but only when it is structured.
A healthy AI-assisted repo should make a few things obvious:
- What is current.
- What is historical.
- What is deprecated.
- Which files are authoritative.
- Which notes are exploratory.
- Which decisions are settled.
- Which constraints still matter.
That does not require ceremony for its own sake. It requires places with jobs.
Product intent belongs somewhere durable. Architecture decisions belong somewhere reviewable. Coding conventions belong somewhere agents are told to read. Active TODOs should not be mixed with old brainstorms. Rejected approaches should be marked as rejected, not left around as plausible alternatives.
The goal is to reduce ambiguity. When the next agent opens the repo, it should not have to infer authority from vibes.
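One way to make authority explicit is to have each context file declare its own status. The sketch below assumes a hypothetical convention, not an established standard: a line such as "Status: current" within the first few lines of each note. The label set and the function names are illustrative assumptions.

```python
import re
from typing import Dict, List, Optional

# Hypothetical convention: each context file declares its status near the top,
# e.g. "Status: current". The label set below is an assumption, not a standard.
ALLOWED_STATUSES = {"current", "historical", "deprecated", "exploratory"}
STATUS_LINE = re.compile(r"^status:\s*(\w+)\s*$", re.IGNORECASE)


def read_status(text: str, max_lines: int = 5) -> Optional[str]:
    """Return the declared status from the first few lines, or None."""
    for line in text.splitlines()[:max_lines]:
        match = STATUS_LINE.match(line.strip())
        if match:
            status = match.group(1).lower()
            # An unknown label is treated the same as no label at all.
            return status if status in ALLOWED_STATUSES else None
    return None


def classify_context(files: Dict[str, str]) -> Dict[str, List[str]]:
    """Group file names by declared status; the rest go under 'unlabeled'."""
    groups: Dict[str, List[str]] = {}
    for name, text in sorted(files.items()):
        groups.setdefault(read_status(text) or "unlabeled", []).append(name)
    return groups
```

Anything that lands under "unlabeled" is a file whose authority the next agent would have to guess, which makes that group a reasonable place to start a cleanup pass.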
Append or Merge
My preferred rule is simple: append or merge as much as possible.
Append when chronology matters. A running decision log, an incident note, a release journal, or a research trail can benefit from being additive. New information belongs at the end because the sequence itself is useful.
Merge when the current truth changes. If the architecture moved from one routing pattern to another, the current architecture note should be updated. If a convention changed, the convention file should be rewritten. If a product requirement is no longer true, it should be removed or clearly marked historical.
The dangerous habit is scattering.
Scattering creates a new note for every new thought, a new instruction for every mistake, and a new prompt file for every workflow. It feels organized because the files have names. But the system now has more surfaces that can become stale.
Every new context file should have a reason to exist. If it does not have a durable purpose, merge it into the file that already owns that kind of knowledge. If it contradicts something current, reconcile the contradiction instead of leaving both versions for the next agent to discover.
This is not about making docs pretty. It is about controlling what future work will treat as true.
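The two moves can be sketched as small text operations. This is an illustration under stated assumptions rather than a prescribed tool: it assumes the decision log is a plain list of dated bullet lines, and that the architecture note uses Markdown-style "## " section headings. Both function names are hypothetical.

```python
def append_entry(log_text: str, date: str, entry: str) -> str:
    """Append: add a dated entry at the end, leaving earlier entries untouched."""
    base = log_text.rstrip("\n")
    prefix = base + "\n" if base else ""
    return f"{prefix}- {date}: {entry}\n"


def merge_section(doc_text: str, heading: str, new_body: str) -> str:
    """Merge: replace the body under '## heading' so only the current truth remains.

    Assumes sections are introduced by lines starting with '## '.
    """
    out = []
    skipping = False   # True while we are inside the section being replaced
    replaced = False
    for line in doc_text.split("\n"):
        if line.startswith("## "):
            skipping = line.strip() == f"## {heading}"
            out.append(line)
            if skipping:
                out.append(new_body.strip("\n"))
                out.append("")
                replaced = True
            continue
        if not skipping:
            out.append(line)
    if not replaced:
        raise KeyError(f"section not found: {heading}")
    return "\n".join(out)
```

The asymmetry is deliberate: the append function never touches earlier entries, while the merge function discards the old section body, so the old claim survives only in version control history.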
Review for Context, Not Just Code
Most code reviews ask whether the change works.
Context review asks whether the change left the project easier or harder to understand.
That review can be lightweight. After a burst of AI-assisted work, look for the residue:
- Did the agent create a parallel helper instead of extending the existing one?
- Did a prompt-specific workaround become a general pattern?
- Did the implementation change without updating the relevant note?
- Did a planning file stop matching the code?
- Did a test get rewritten to match accidental behavior?
- Did three files become partial sources of truth for the same decision?
The fix is often boring: merge files, delete obsolete notes, rename vague docs, collapse working notes into stable docs, and mark historical material as historical.
Boring is fine. Most maintenance is boring until the day it saves the project.
Treat context cleanup like dependency updates or test cleanup. It does not need to happen after every prompt, but it does need to happen before drift becomes the default working condition.
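Part of that review can be mechanized. As one rough heuristic for the "planning file stopped matching the code" case, the sketch below scans a note for tokens that look like file paths and reports the ones that no longer exist in the repo. The extension list and the function name are assumptions, and a regex like this will miss some references and flag some false positives, so treat the output as a reading list rather than a verdict.

```python
import re
from typing import Iterable, List

# Heuristic: tokens that look like relative paths with a known extension are
# treated as references to repo files. The extension list is an assumption.
PATH_TOKEN = re.compile(r"\b[\w\-./]+\.(?:py|ts|js|md|json|toml|yaml)\b")


def missing_references(doc_text: str, repo_paths: Iterable[str]) -> List[str]:
    """Return path-like tokens in doc_text that are not in repo_paths.

    Order of first appearance is preserved and duplicates are dropped.
    """
    known = set(repo_paths)
    missing: List[str] = []
    for token in PATH_TOKEN.findall(doc_text):
        if token not in known and token not in missing:
            missing.append(token)
    return missing
```

In practice the list of repo paths could come from git ls-files, and each missing reference is a prompt to either update the note or mark it historical.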
Contextual Garbage Collection
Another way to think about this is contextual garbage collection.
Context naturally accumulates. A living project collects notes, prompts, summaries, plans, comments, tests, TODOs, issue threads, and half-finished explanations. Some of that material remains valuable. Some of it becomes garbage. Not because it was useless when written, but because the project moved on.
The cleanup job is to periodically collect that garbage before future work treats it as current truth.
This should be done carefully. Removing context that is still relevant creates drift in the other direction. If you delete the reason behind a constraint, a future agent may “simplify” the system by removing the constraint. If you erase a rejected approach without preserving the decision, someone may rebuild it. If you collapse nuance into a tidy but incomplete rule, the next change may optimize for the wrong thing.
Good contextual garbage collection has two moves:
- Delete context that is no longer true and no longer useful.
- Preserve context that still explains why the system is shaped the way it is.
Sometimes preservation means keeping the note. Sometimes it means merging the useful part into an architecture decision record, a README, a test, or an AGENTS.md instruction. Sometimes it means marking a file as historical so it stops competing with current guidance.
The point is not a smaller repo for its own sake. The point is a cleaner signal.
Git History Is a Context Tool
Git history is useful here because it gives context a timeline.
When a note looks wrong, the most useful question is often not “Who wrote this?” It is “What was true when this was written?” A line of documentation may have been perfectly accurate for the version of the system it described. The debt appears later, after the implementation changes and the surrounding explanation does not.
That makes Git a useful context audit tool. Running git blame shows which commit last changed a suspicious line and when, and that commit shows what problem the line was trying to solve. The point is not blame in the social sense. The point is to recover the historical reason behind a claim before deciding whether to keep, rewrite, or delete it.
For a suspicious instruction in a doc, prompt, config file, or test, the audit questions are:
- What problem was this solving?
- Does that problem still exist?
- Did the code change later without the context changing?
- Was this instruction copied from an experiment?
- Is this line still guidance, or is it archaeology?
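To decide which lines deserve those questions first, it can help to sort a context file by age. The following is a minimal sketch that parses the output of git blame --line-porcelain and lists lines last changed before a cutoff. The names BlameLine, parse_line_porcelain, lines_older_than, and blame_file are illustrative, and the parser reads only the commit hash, final line number, and author-time fields.

```python
import re
import subprocess
from typing import List, NamedTuple

# A porcelain entry starts with: <40-hex sha> <orig line> <final line> [<count>]
SHA_HEADER = re.compile(r"^([0-9a-f]{40}) (\d+) (\d+)")


class BlameLine(NamedTuple):
    line_no: int       # line number in the current file
    commit: str        # commit that last changed the line
    author_time: int   # Unix timestamp of that change
    text: str          # line content


def parse_line_porcelain(output: str) -> List[BlameLine]:
    """Parse `git blame --line-porcelain` output into per-line records."""
    records: List[BlameLine] = []
    commit, line_no, author_time = "", 0, 0
    for raw in output.split("\n"):
        if raw.startswith("\t"):
            # A tab-prefixed line carries the file content and ends the entry.
            records.append(BlameLine(line_no, commit, author_time, raw[1:]))
            continue
        header = SHA_HEADER.match(raw)
        if header:
            commit, line_no = header.group(1), int(header.group(3))
        elif raw.startswith("author-time "):
            author_time = int(raw.split(" ", 1)[1])
    return records


def lines_older_than(records: List[BlameLine], cutoff_epoch: int) -> List[BlameLine]:
    """Return lines whose last change predates cutoff_epoch (Unix seconds)."""
    return [r for r in records if r.author_time < cutoff_epoch]


def blame_file(path: str) -> List[BlameLine]:
    """Run git blame on `path`; must be called from inside a Git work tree."""
    result = subprocess.run(
        ["git", "blame", "--line-porcelain", "--", path],
        capture_output=True, text=True, check=True,
    )
    return parse_line_porcelain(result.stdout)
```

Age alone does not make a line wrong. An old line next to recently rewritten code is simply a good candidate for the audit questions above.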
With git log -L, Git can also show how a specific line range or function evolved over time. That is useful when a small piece of context keeps surviving edits around it. A stale sentence can sit untouched for months while the implementation beneath it changes completely.
This is where the repo becomes more than storage. It becomes a memory system with timestamps.
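For reference, git log -L accepts either a line range or a function name, each paired with a file. The helpers below are a small sketch that builds those commands as argument lists. The helper names are hypothetical, and the function-name form depends on Git recognizing function boundaries in the file's language.

```python
import subprocess
from typing import List


def line_range_history_cmd(path: str, start: int, end: int) -> List[str]:
    """Build `git log -L<start>,<end>:<file>` for a line range."""
    if start < 1 or end < start:
        raise ValueError("expected 1 <= start <= end")
    return ["git", "log", f"-L{start},{end}:{path}"]


def function_history_cmd(path: str, func_name: str) -> List[str]:
    """Build `git log -L:<funcname>:<file>` for a named function."""
    return ["git", "log", f"-L:{func_name}:{path}"]


def run_git(cmd: List[str]) -> str:
    """Run a Git command and return its stdout; call from inside a work tree."""
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return result.stdout
```

For example, run_git(line_range_history_cmd("README.md", 10, 20)) would return the commits that touched lines 10 through 20 of a README.md in the current repo, each with the diff for that range.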
Churn Points to Where the Map Is Weak
Line-level churn analysis can help too.
High churn is not automatically bad. A file can change often because it is important, actively improved, or central to a feature. But churn is a useful triage signal. It tells you where to read carefully.
If the same module keeps receiving small AI patches, maybe the abstraction is wrong. If the same doc changes with every feature, maybe it is trying to serve too many purposes. If the same tests keep being rewritten, maybe the expected behavior is not actually settled. If the same prompt keeps growing warnings, maybe the workflow needs a better structure instead of more caution tape.
Git gives you simple starting points. Running git log --numstat lists the lines added and deleted per file for each commit, and git log --stat shows a per-commit summary of which files changed and by how much. More advanced tools can summarize hotspots, but the basic idea is enough: find the places with repeated motion, then inspect whether the context around them is still coherent.
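A rough churn ranking takes only a few lines. This sketch assumes the input comes from git log --numstat --format= (the empty format suppresses commit headers), where each data line is tab-separated as added, deleted, path. The function name churn_by_file is illustrative, and renamed files appear under whatever path text Git prints, so they are not unified here.

```python
from collections import Counter


def churn_by_file(numstat_output: str) -> Counter:
    """Sum added plus deleted lines per path from `git log --numstat` output."""
    churn: Counter = Counter()
    for line in numstat_output.splitlines():
        parts = line.split("\t")
        if len(parts) != 3:
            continue  # blank separators or commit header lines
        added, deleted, path = parts
        if not (added.isdigit() and deleted.isdigit()):
            continue  # binary files are reported as "-" and skipped
        churn[path] += int(added) + int(deleted)
    return churn
```

Calling churn_by_file(output).most_common(10) gives a first reading list of the files with the most line-level motion.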
Churn does not prove debt. It points to where debt might be hiding.
The Standard Is a Clean Map
Vibe coding does not fail just because AI wrote the code.
It fails when the operator keeps asking for more output without maintaining the context that makes future output safe to accept.
That is the part easy demos hide. A generated feature is not just a feature. It is a change to the future context of the project. It creates examples the model may copy later. It creates files the model may treat as patterns. It creates tests, names, comments, and conventions that shape the next diff.
If those artifacts are allowed to drift, the codebase becomes harder for both humans and agents to steer.
The answer is not to stop using agents. The answer is to treat context as part of the system.
Structure it. Append when history matters. Merge when the current truth changes. Leave memory off unless there is a clear reason to add hidden state. Delete stale notes, but preserve the context that still explains the system. Use Git history to find old assumptions. Use churn to find unstable areas. Periodically review the project with the explicit goal of making the map match the terrain again.
Drift is what happens when the system changes faster than its context is reconciled.
Context cleanup is how you keep the next change from being built on a lie.