I did ask the Assistant to help me write this. For context, I have been using the Assistant itself to learn more about the Assistant and its capabilities. As I began to learn about System Prompts and custom Assistants, I also used it to help me draft custom System Prompt blocks to address persistent problems like contextual drift and loss of previous instructions.
During extended technical conversations, the ResearchAgent and the ContextAgent (a nickname I've given the context-providing tool) begin oversupplying the assistant with large volumes of historical, off-topic research and poorly summarized context. This degrades the model's ability to distinguish between the user's actual prompt ("reality") and infrastructure-generated noise ("fantasy").
Specific incident:
In a long thread discussing Kagi Assistant architecture, the user asked a focused question about why the ResearchAgent had just supplied unrelated literature in the previous turn.
Instead of a targeted or null response, the ResearchAgent injected a 47-citation block covering general topics (system prompts, RAG, guardrails, confabulation) that were historical thread themes, not relevant to the current query about agent malfunction.
Separately, the ContextAgent compressed away critical provenance information across turns. In a prior turn, the model had correctly processed real ResearchAgent citations in its internal reasoning trace. In the subsequent turn, the context summary omitted this fact, causing the model to falsely confess to confabulating those citations. This was a meta-confabulation caused by context loss, not an actual model error.
Steps to replicate:
- Begin a multi-turn technical conversation requiring multiple ResearchAgent retrievals.
- After several turns of accumulated research output, pose a narrow, meta-level question (e.g., asking why the ResearchAgent supplied specific prior information).
- Observe that the ResearchAgent responds with massive historical citation dumps rather than scoped, current-turn research.
- Observe that the model begins conflating historical agent output with current user input, leading to misattribution and false self-corrections.
- Query the model about its own prior reasoning to observe provenance collapse: the ContextAgent summary fails to preserve whether prior citations were real agent output or model confabulation.

What did you expect to happen?
- ResearchAgent scoping: The ResearchAgent should bound its retrieval to the current turn's user intent, not thread-level historical themes. It should not loop or replay prior research blocks when the user has shifted to meta-discussion or narrow follow-ups.
- ContextAgent provenance fidelity: The context management agent must preserve critical metadata across turns, specifically which content was generated by agents (ResearchAgent, Librarian, etc.) and which was typed by the user. It must not compress away the fact that agent output was injected in prior turns, as this directly causes the model to generate false narratives about its own errors.
- Reasoning trace accessibility or accurate summarization: The model should either (a) retain read-only access to its own prior reasoning traces, or (b) receive summaries from the ContextAgent that accurately reflect what the model previously processed and concluded, preventing meta-confabulations where the model falsely accuses itself of confabulating.
- Signal-to-noise management: Infrastructure should prevent accumulation of irrelevant research detritus across turns. If the ResearchAgent cannot contribute relevant current-turn information, it should return null rather than flooding the context window with historical citations that force the model to filter fantasy from reality.
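
To make the expectations above a bit more concrete, here is a minimal Python sketch of two of them: a context summary that shortens text but never drops who produced it, and a research step that returns nothing when it has nothing relevant for the current turn. This is only an illustration of the behavior I am asking for, not a description of how the Assistant is actually built. Every name in it (`Source`, `Entry`, `summarize`, `citations_came_from_agent`, `scoped_research`) is hypothetical, and the word-overlap check is a toy stand-in for whatever relevance measure the real ResearchAgent uses.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class Source(Enum):
    """Who produced a piece of context (hypothetical labels)."""
    USER = "user"
    RESEARCH_AGENT = "research_agent"
    MODEL = "model"


@dataclass(frozen=True)
class Entry:
    """One item in the conversation history, tagged with its origin."""
    turn: int
    source: Source
    text: str


def summarize(history: List[Entry], max_chars: int = 80) -> List[Entry]:
    """Truncate each entry's text, but always keep its turn and source."""
    return [Entry(e.turn, e.source, e.text[:max_chars]) for e in history]


def citations_came_from_agent(summary: List[Entry], turn: int) -> bool:
    """True if the summary still records ResearchAgent output for that turn."""
    return any(
        e.turn == turn and e.source is Source.RESEARCH_AGENT for e in summary
    )


def scoped_research(
    query: str, candidates: List[str], min_overlap: int = 2
) -> Optional[List[str]]:
    """Keep only candidates that share enough words with the current query.

    Returns None (a null response) when nothing qualifies, instead of
    replaying earlier research. Word overlap is a toy relevance measure
    assumed here purely for illustration.
    """
    query_words = set(query.lower().split())
    hits = [
        c for c in candidates
        if len(query_words & set(c.lower().split())) >= min_overlap
    ]
    return hits or None
```

With a summary built this way, the model could check `citations_came_from_agent(summary, turn)` before concluding that it had confabulated a citation, and a meta-level question with no matching material would get `None` back from `scoped_research` rather than a replay of earlier research.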
