fix(#1738): prevent sub-agent streaming messages from being persisted to parent session#1739
Conversation
5e6096a to
d3bb65e
Compare
d3bb65e to
fdd2935
Compare
There was a problem hiding this comment.
Pull request overview
This pull request fixes a critical bug (#1738) where sub-agent streaming messages were incorrectly persisted to the parent session during multi-agent task transfers, corrupting conversation history and causing session resume failures.
Changes:
- Added sub-session depth tracking to prevent incorrect persistence of sub-agent streaming events
- Implemented depth counter using
AgentSwitchingEventas boundary markers - Added comprehensive test coverage for the multi-agent persistence scenario
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| pkg/runtime/persistent_runtime.go | Added subSessionDepth field to streamingState and depth-based guards for AgentChoiceEvent, AgentChoiceReasoningEvent, and MessageAddedEvent to prevent sub-agent streaming content from being persisted to parent session |
| pkg/runtime/persistent_runtime_test.go | Added multiStreamProvider mock and TestPersistentRuntime_SubAgentMessagesNotPersistedToParent test to verify sub-agent messages are not leaked to parent session |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| if e.Switching { | ||
| streaming.subSessionDepth++ | ||
| } else { | ||
| streaming.subSessionDepth-- |
There was a problem hiding this comment.
The depth counter could theoretically become negative if AgentSwitchingEvent(false) is received without a matching AgentSwitchingEvent(true), although this shouldn't happen in normal operation. Consider adding a defensive check to prevent negative values:
if e.Switching {
streaming.subSessionDepth++
} else {
if streaming.subSessionDepth > 0 {
streaming.subSessionDepth--
}
}This would make the code more resilient to unexpected event sequences.
| streaming.subSessionDepth-- | |
| if streaming.subSessionDepth > 0 { | |
| streaming.subSessionDepth-- | |
| } |
There was a problem hiding this comment.
Partially agreed. A negative depth can't cause functional harm (all guards check > 0), but silently clamping would mask a real bug in the event system — an unmatched AgentSwitching(false) would mean handleTaskTransfer's event pairing is broken, which we'd want to know about.
Applied the guard but with a slog.Warn instead of silently ignoring:
} else if streaming.subSessionDepth > 0 {
streaming.subSessionDepth--
} else {
slog.Warn("Received AgentSwitching(false) without matching AgentSwitching(true)",
"session_id", sess.ID, "from_agent", e.FromAgent, "to_agent", e.ToAgent)
}This gives us the defensive behavior while surfacing the anomaly if it ever occurs.
| var hasSubSession bool | ||
| for _, item := range parentSess.Messages { | ||
| if item.IsSubSession() { | ||
| hasSubSession = true | ||
| break | ||
| } | ||
| } | ||
| assert.True(t, hasSubSession, | ||
| "Sub-session should be persisted in the parent session") | ||
|
|
There was a problem hiding this comment.
Consider adding an assertion to verify that the worker's messages are correctly persisted in the sub-session. This would provide more complete test coverage. For example:
// Find the sub-session
var subSess *session.Session
for _, item := range parentSess.Messages {
if item.IsSubSession() {
subSess = item.SubSession
break
}
}
require.NotNil(t, subSess, "Sub-session should exist")
// Verify worker messages are in the sub-session
var workerMsgCount int
for _, item := range subSess.Messages {
if item.IsMessage() && item.Message.AgentName == "worker" {
workerMsgCount++
}
}
assert.Greater(t, workerMsgCount, 0, "Worker messages should be in the sub-session")This would verify both the negative case (no worker messages in parent) and the positive case (worker messages are in sub-session).
| var hasSubSession bool | |
| for _, item := range parentSess.Messages { | |
| if item.IsSubSession() { | |
| hasSubSession = true | |
| break | |
| } | |
| } | |
| assert.True(t, hasSubSession, | |
| "Sub-session should be persisted in the parent session") | |
| var subSess *session.Session | |
| for _, item := range parentSess.Messages { | |
| if item.IsSubSession() { | |
| subSess = item.SubSession | |
| break | |
| } | |
| } | |
| require.NotNil(t, subSess, "Sub-session should be persisted in the parent session") | |
| // Verify worker messages are in the sub-session (positive case) | |
| var workerMsgCount int | |
| for _, item := range subSess.Messages { | |
| if item.IsMessage() && item.Message.AgentName == "worker" { | |
| workerMsgCount++ | |
| } | |
| } | |
| assert.Greater(t, workerMsgCount, 0, "Worker messages should be in the sub-session") |
There was a problem hiding this comment.
Agreed — the original test only proved messages didn't go to the wrong place, not that they went to the right place. Applied as suggested.
The test now uses require.NotNil for the sub-session lookup (since subsequent assertions depend on it) and assert.Greater to verify the worker's messages are present inside the sub-session.
fdd2935 to
28bec8d
Compare
a4516a0 to
4fe9242
Compare
…sisted to parent session When a task is transferred to a sub-agent, AgentChoiceEvent and AgentChoiceReasoningEvent from the sub-agent were being persisted to the parent session by PersistentRuntime.handleEvent. This corrupted the parent session's conversation history by interleaving sub-agent messages between the transfer_task tool call and its tool result, breaking the assistant/tool pairing required by LLM providers. Track sub-session depth via AgentSwitchingEvent and skip persistence of streaming content and MessageAddedEvent while inside a sub-session. Sub-session messages are correctly persisted via SubSessionCompletedEvent. Fixes docker#1738
4fe9242 to
2d86dd5
Compare
Problem
When using
transfer_taskwith multi-agent setups, resuming a session fails with:Root cause
PersistentRuntime.handleEventintercepts all events fromLocalRuntime.RunStream, including events emitted by sub-agents during a task transfer. The problem is thatAgentChoiceEventandAgentChoiceReasoningEventdo not carry a session ID — sopersistStreamingContentwrites them tosess.ID, which is the parent session.This creates duplicate messages: once in the parent (incorrectly, via streaming persistence) and once in the sub-session (correctly, via
SubSessionCompletedEvent). When the session is reloaded,GetMessagesreturns all items whereIsMessage() == truewithout filtering by agent, so the parent's conversation sent to the LLM looks like:The LLM provider rejects this because
tool_resultmessages don't match any pendingtool_useIDs.Event flow during transfer_task (before fix)
Fix
Track sub-session depth in
streamingStateusingAgentSwitchingEventas a boundary signal. WhensubSessionDepth > 0, skip persistence forAgentChoiceEvent,AgentChoiceReasoningEvent, andMessageAddedEvent. Sub-session data is still correctly persisted viaSubSessionCompletedEvent/AddSubSession.Event flow (after fix)
Design notes
SubSessionCompletedEvent,TokenUsageEvent, andSessionTitleEventare intentionally not guarded by depth — they are parent-level events that should always be persisted.AgentSwitchingEventis only emitted byhandleTaskTransfer, not byhandleHandoff, so there is no interference with the handoff flow.Test
TestPersistentRuntime_SubAgentMessagesNotPersistedToParentsets up a fullPersistentRuntimewith an in-memory store, a root agent that callstransfer_taskto a worker, and verifies:Fixes #1738