
fix(#1738): prevent sub-agent streaming messages from being persisted to parent session#1739

Draft
aheritier wants to merge 1 commit into docker:main from aheritier:fix/1738-subsession-streaming-persistence

@aheritier aheritier commented Feb 15, 2026

Problem

When using transfer_task with multi-agent setups, resuming a session fails with:

all models failed: tool_result messages present but none match pending tool_use ids (beta converter)

Root cause

PersistentRuntime.handleEvent intercepts all events from LocalRuntime.RunStream, including events emitted by sub-agents during a task transfer. The problem is that AgentChoiceEvent and AgentChoiceReasoningEvent do not carry a session ID — so persistStreamingContent writes them to sess.ID, which is the parent session.

This creates duplicate messages: once in the parent (incorrectly, via streaming persistence) and once in the sub-session (correctly, via SubSessionCompletedEvent). When the session is reloaded, GetMessages returns all items where IsMessage() == true without filtering by agent, so the parent's conversation sent to the LLM looks like:

assistant  (root)    → tool_calls: [transfer_task]     ← expects matching tool result next
assistant  (worker)  → "doing work..."                  ← ORPHANED, breaks the pairing
assistant  (worker)  → tool_calls: [shell]              ← ORPHANED, no matching tool result
...
subsession           →                                  ← skipped by GetMessages
tool       (root)    → result for transfer_task         ← LLM can't match this anymore

The LLM provider rejects this because tool_result messages don't match any pending tool_use IDs.

Event flow during transfer_task (before fix)

handleTaskTransfer() {
    evts <- AgentSwitching(true)

    // Every sub-agent event forwarded by this loop hits
    // PersistentRuntime.handleEvent with sess = parent:
    //   AgentChoiceEvent    → persistStreamingContent(parent.ID) ← BUG
    //   MessageAddedEvent   → addMessage(subSession.ID)          ← correct ID, but the streaming
    //                                                               message was already written
    //                                                               to parent
    for event := range r.RunStream(ctx, subSession) {   // LocalRuntime.RunStream
        evts <- event                                    // forwarded to parent's channel
    }

    evts <- SubSessionCompleted(parent.ID, subSession)
    evts <- AgentSwitching(false)
}

Fix

Track sub-session depth in streamingState using AgentSwitchingEvent as a boundary signal. When subSessionDepth > 0, skip persistence for AgentChoiceEvent, AgentChoiceReasoningEvent, and MessageAddedEvent. Sub-session data is still correctly persisted via SubSessionCompletedEvent / AddSubSession.

Event flow (after fix)

Event                          depth   Action
─────────────────────────────  ─────   ──────────────────────────
AgentSwitching(true)           0 → 1
AgentChoiceEvent(worker)       1       SKIP (sub-agent streaming)
MessageAddedEvent(worker)      1       SKIP (sub-agent message)
SubSessionCompletedEvent       1       PERSIST (always, this is the parent's record)
AgentSwitching(false)          1 → 0
MessageAddedEvent(root/tool)   0       PERSIST (tool result for transfer_task)

Design notes

  • Uses a depth counter (not a boolean) to correctly handle nested transfers (sub-agent delegating to its own sub-agent).
  • SubSessionCompletedEvent, TokenUsageEvent, and SessionTitleEvent are intentionally not guarded by depth — they are parent-level events that should always be persisted.
  • AgentSwitchingEvent is only emitted by handleTaskTransfer, not by handleHandoff, so there is no interference with the handoff flow.

Test

TestPersistentRuntime_SubAgentMessagesNotPersistedToParent sets up a full PersistentRuntime with an in-memory store, a root agent that calls transfer_task to a worker, and verifies:

  1. No worker messages in the parent session
  2. The sub-session reference is persisted
  3. The root's assistant message (with the tool call) and tool result are both present

Fixes #1738

@aheritier aheritier force-pushed the fix/1738-subsession-streaming-persistence branch from 5e6096a to d3bb65e Compare February 15, 2026 16:32
@aheritier aheritier marked this pull request as ready for review February 15, 2026 16:45
@aheritier aheritier requested a review from a team as a code owner February 15, 2026 16:45
Copilot AI review requested due to automatic review settings February 15, 2026 16:45
@aheritier aheritier force-pushed the fix/1738-subsession-streaming-persistence branch from d3bb65e to fdd2935 Compare February 15, 2026 16:47
Copilot AI left a comment

Pull request overview

This pull request fixes a critical bug (#1738) where sub-agent streaming messages were incorrectly persisted to the parent session during multi-agent task transfers, corrupting conversation history and causing session resume failures.

Changes:

  • Added sub-session depth tracking to prevent incorrect persistence of sub-agent streaming events
  • Implemented a depth counter that uses AgentSwitchingEvent as the boundary marker
  • Added comprehensive test coverage for the multi-agent persistence scenario

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| pkg/runtime/persistent_runtime.go | Added subSessionDepth field to streamingState and depth-based guards for AgentChoiceEvent, AgentChoiceReasoningEvent, and MessageAddedEvent to prevent sub-agent streaming content from being persisted to parent session |
| pkg/runtime/persistent_runtime_test.go | Added multiStreamProvider mock and TestPersistentRuntime_SubAgentMessagesNotPersistedToParent test to verify sub-agent messages are not leaked to parent session |

if e.Switching {
    streaming.subSessionDepth++
} else {
    streaming.subSessionDepth--
Copilot AI Feb 15, 2026

The depth counter could theoretically become negative if AgentSwitchingEvent(false) is received without a matching AgentSwitchingEvent(true), although this shouldn't happen in normal operation. Consider adding a defensive check to prevent negative values:

if e.Switching {
    streaming.subSessionDepth++
} else {
    if streaming.subSessionDepth > 0 {
        streaming.subSessionDepth--
    }
}

This would make the code more resilient to unexpected event sequences.

Contributor Author

Partially agreed. A negative depth can't cause functional harm (all guards check > 0), but silently clamping would mask a real bug in the event system — an unmatched AgentSwitching(false) would mean handleTaskTransfer's event pairing is broken, which we'd want to know about.

Applied the guard but with a slog.Warn instead of silently ignoring:

} else if streaming.subSessionDepth > 0 {
    streaming.subSessionDepth--
} else {
    slog.Warn("Received AgentSwitching(false) without matching AgentSwitching(true)",
        "session_id", sess.ID, "from_agent", e.FromAgent, "to_agent", e.ToAgent)
}

This gives us the defensive behavior while surfacing the anomaly if it ever occurs.

Comment on lines 111 to 120
var hasSubSession bool
for _, item := range parentSess.Messages {
    if item.IsSubSession() {
        hasSubSession = true
        break
    }
}
assert.True(t, hasSubSession,
    "Sub-session should be persisted in the parent session")

Copilot AI Feb 15, 2026

Consider adding an assertion to verify that the worker's messages are correctly persisted in the sub-session. This would provide more complete test coverage. For example:

// Find the sub-session
var subSess *session.Session
for _, item := range parentSess.Messages {
    if item.IsSubSession() {
        subSess = item.SubSession
        break
    }
}
require.NotNil(t, subSess, "Sub-session should exist")

// Verify worker messages are in the sub-session
var workerMsgCount int
for _, item := range subSess.Messages {
    if item.IsMessage() && item.Message.AgentName == "worker" {
        workerMsgCount++
    }
}
assert.Greater(t, workerMsgCount, 0, "Worker messages should be in the sub-session")

This would verify both the negative case (no worker messages in parent) and the positive case (worker messages are in sub-session).

Contributor Author

Agreed — the original test only proved messages didn't go to the wrong place, not that they went to the right place. Applied as suggested.

The test now uses require.NotNil for the sub-session lookup (since subsequent assertions depend on it) and assert.Greater to verify the worker's messages are present inside the sub-session.

@aheritier aheritier force-pushed the fix/1738-subsession-streaming-persistence branch from fdd2935 to 28bec8d Compare February 15, 2026 16:54
Contributor Author

@aheritier aheritier left a comment

Addressed both comments from Copilot in commit 28bec8d. See individual replies above.

@aheritier aheritier force-pushed the fix/1738-subsession-streaming-persistence branch 2 times, most recently from a4516a0 to 4fe9242 Compare February 16, 2026 20:37
…sisted to parent session

When a task is transferred to a sub-agent, AgentChoiceEvent and
AgentChoiceReasoningEvent from the sub-agent were being persisted to
the parent session by PersistentRuntime.handleEvent. This corrupted
the parent session's conversation history by interleaving sub-agent
messages between the transfer_task tool call and its tool result,
breaking the assistant/tool pairing required by LLM providers.

Track sub-session depth via AgentSwitchingEvent and skip persistence
of streaming content and MessageAddedEvent while inside a sub-session.
Sub-session messages are correctly persisted via
SubSessionCompletedEvent.

Fixes docker#1738
@aheritier aheritier force-pushed the fix/1738-subsession-streaming-persistence branch from 4fe9242 to 2d86dd5 Compare February 16, 2026 20:45
@aheritier aheritier marked this pull request as draft February 16, 2026 22:18
Development

Successfully merging this pull request may close these issues.

PersistentRuntime writes sub-agent streaming messages to parent session, corrupting conversation history
