wip agent activity interruption detector integration #991

lukasIO · 2026-01-22T14:16:32Z

Description

Changes Made

Pre-Review Checklist

Build passes: All builds (lint, typecheck, tests) pass locally
AI-generated code reviewed: Removed unnecessary comments and ensured code quality
Changes explained: All changes are properly documented and justified above
Scope appropriate: All changes relate to the PR title, or explanations provided for why they're included
Video demo: A small video demo showing changes works as expected and did not break any existing functionality using Agent Playground (if applicable)

Testing

Automated tests added/updated (if applicable)
All tests pass
Make sure both restaurant_agent.ts and realtime_agent.ts work properly (for major changes)

Additional Notes

Note to reviewers: Please ensure the pre-review checklist is completed before starting your review.

Summary by CodeRabbit

Release Notes v1.0.38

New Features
- Added noise cancellation frame processors support
- Added aligned transcript support with word-level timing for STT plugins
- Added stream input support to combine multiple input streams
- Introduced turn handling configuration for improved agent setup
Improvements
- Refined timestamp handling in spans and recording alignment
- Enhanced interruption detection capabilities with public API updates
- Fixed Google realtime generation sequencing to prevent timeouts
Breaking Changes
- Agent constructor now accepts options object instead of destructured parameters
- InterruptionStream sentinel methods renamed (speechStarted → agentSpeechStarted, speechEnded → agentSpeechEnded)

_{✏️ Tip: You can customize this high-level summary in your review settings.}

changeset-bot · 2026-01-22T14:16:41Z

🦋 Changeset detected

Latest commit: 157fcf5

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 18 packages

Name	Type
@livekit/agents	Patch
@livekit/agents-plugin-anam	Patch
@livekit/agents-plugin-baseten	Patch
@livekit/agents-plugin-bey	Patch
@livekit/agents-plugin-cartesia	Patch
@livekit/agents-plugin-deepgram	Patch
@livekit/agents-plugin-elevenlabs	Patch
@livekit/agents-plugin-google	Patch
@livekit/agents-plugin-hedra	Patch
@livekit/agents-plugin-inworld	Patch
@livekit/agents-plugin-livekit	Patch
@livekit/agents-plugin-neuphonic	Patch
@livekit/agents-plugin-openai	Patch
@livekit/agents-plugin-resemble	Patch
@livekit/agents-plugin-rime	Patch
@livekit/agents-plugin-silero	Patch
@livekit/agents-plugins-test	Patch
@livekit/agents-plugin-xai	Patch

Not sure what this means? Click here to learn what changesets are.

Click here if you're a maintainer who wants to add another changeset to this PR

coderabbitai · 2026-01-22T14:16:45Z

Important

Review skipped

Auto reviews are disabled on base/target branches other than the default branch.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

📝 Walkthrough

Walkthrough

This PR releases version 1.0.38 by consolidating changeset entries into official changelogs and introducing significant updates: new interruption detection integration with AdaptiveInterruptionDetector, a TurnHandlingConfig path to the Agent constructor, public method renames in InterruptionStream, stream input handling via addStreamInput, Google realtime generation sequencing improvements with debug logging, and comprehensive refactoring of interruption state management across agent activity and audio recognition components.

Changes

Cohort / File(s)	Summary
Version bumps and changelog consolidation `.changeset/large-cars-pull.md`, `.changeset/lazy-spies-worry.md`, `.changeset/tender-glasses-burn.md`, `agents/CHANGELOG.md`, `agents/package.json`, `plugins//CHANGELOG.md`, `plugins//package.json`	Removes four changeset markdown files and consolidates their entries into official CHANGELOG.md files. Bumps all package versions from 1.0.37 to 1.0.38, documenting three patch changes: noiseCancellation support, timestamp refinements, and aligned transcript support with word-level timing.
Interruption stream and type changes `agents/src/inference/interruption/InterruptionStream.ts`, `agents/src/inference/interruption/types.ts`	Renames sentinel methods: `speechStarted()` → `agentSpeechStarted()` and `speechEnded()` → `agentSpeechEnded()`. Makes `overlapSpeechStarted` parameters optional: `speechDurationInS` and `userSpeakingSpan` become `speechDurationInS?` and `userSpeakingSpan?`. Updates `OverlapSpeechStarted` interface to match.
Stream channel API extension `agents/src/stream/stream_channel.ts`	Adds public `addStreamInput(stream: ReadableStream<T>)` method to `StreamChannel` interface and implementation, enabling multiple input streams to be piped into a single channel via asynchronous reader/writer mechanics.
Agent configuration refactoring `agents/src/voice/agent.ts`	Extends `AgentOptions` with new optional `turnHandling?: TurnHandlingConfig` field. Refactors Agent constructor to accept single `options` object instead of destructured parameters, mapping legacy `turnDetection` into new `turnHandling` path. Adds public getters `interruptionDetection()` and `allowInterruptions()`.
Agent activity interruption integration `agents/src/voice/agent_activity.ts`	Introduces `AdaptiveInterruptionDetector` integration with extensive state management (interruption detector, detection flags, audio activity control). Extends lifecycle methods (`onInputSpeechStarted`, `onInterimTranscript`, `onFinalTranscript`, etc.) to coordinate interruption detection, transcript buffering, and audio activity state. Adds `resolveInterruptionDetector()` and `restoreInterruptionByAudioActivity()` helpers. Updates `updateOptions()` to accept `turnDetection` parameter.
Audio recognition interruption support `agents/src/voice/audio_recognition.ts`	Adds `AdaptiveInterruptionDetector` support with new `InterruptionStream` channel for interruption signaling. Extends `RecognitionHooks` with `onInterruption` callback and `AudioRecognitionOptions` with `interruptionDetection` option. Implements transcript buffering/holding logic via `shouldHoldSttEvent()` and `flushHeldTranscripts()`. Adds `onInterruptionEvent()` handler and proper channel cleanup.
Google realtime generation logic `plugins/google/src/beta/realtime/realtime_api.ts`	Introduces `LK_GOOGLE_DEBUG` environment flag for conditional verbose logging. Adds `isNewGeneration()` helper to determine generation start based on tool calls or server content. Refactors generation-start logic and turnComplete handling, improving interrupted serverContent processing with debug-gated conditional logging.
Environment configuration `turbo.json`	Adds `"LK_GOOGLE_DEBUG"` to global environment variables list.
New changeset entry `.changeset/shy-rockets-fry.md`	Adds single patch entry documenting state transition fixes and interim transcript interruption support.

Sequence Diagram

sequenceDiagram
    participant Agent as Agent
    participant AgentActivity as AgentActivity
    participant AudioRec as AudioRecognition
    participant IntDetector as AdaptiveInterruptionDetector
    participant IntStream as InterruptionStream
    participant STT as STT/VAD
    
    Agent->>AgentActivity: init with turnHandling config
    AgentActivity->>IntDetector: resolveInterruptionDetector()
    IntDetector-->>AgentActivity: detector instance
    AgentActivity->>AudioRec: start() + interruptionDetection
    AudioRec->>IntStream: create interruption channel
    
    STT->>AudioRec: onInputSpeechStarted()
    AudioRec->>IntDetector: agent speech event
    IntDetector->>IntStream: emit agentSpeechStarted sentinel
    IntStream-->>AudioRec: interruption signal
    
    STT->>AudioRec: onInterimTranscript()
    AudioRec->>AudioRec: shouldHoldSttEvent? (check ignoreUserTranscript)
    alt Interruption Active
        AudioRec->>AudioRec: buffer transcript
    else Normal Flow
        AudioRec->>AgentActivity: onInterimTranscript()
    end
    
    IntDetector->>IntStream: overlap speech detected
    IntStream-->>AudioRec: OverlapSpeechStarted
    AudioRec->>AgentActivity: onInterruption(event)
    AgentActivity->>AgentActivity: adjust interruption state
    
    STT->>AudioRec: onInputSpeechStopped()
    AudioRec->>IntDetector: agent speech end
    IntDetector->>IntStream: emit agentSpeechEnded sentinel
    AudioRec->>AudioRec: flushHeldTranscripts()
    AudioRec->>AgentActivity: emit buffered transcripts

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

The PR introduces substantial logic across multiple interconnected systems: interruption detection integration in two key files (agent_activity.ts, audio_recognition.ts) with 381+ net new lines, public API changes (method renames, new optional parameters), constructor refactoring in agent.ts, Google realtime generation logic improvements, and version management across 20+ packages. The heterogeneity of concerns—interruption state machine, stream handling, configuration migration, and plugin-specific fixes—combined with moderate logic density in lifecycle event coordination requires careful cross-component reasoning.

Possibly related PRs

fix: Fix Google realtime generation lifecycle to fix "generateReply timed out waiting for generation_created" event (parity with Python plugin) #989: Modifies Google realtime generation sequencing and adds LK_GOOGLE_DEBUG environment variable gating—directly overlaps with plugins/google/src/beta/realtime/realtime_api.ts and turbo.json changes in this PR.
Agent state fixes and interim transcript interruption support #992: Refactors agents/src/voice/agent_activity.ts interruption handling and state transitions—shares core interruption logic and interim transcript flow changes with this PR's AgentActivity modifications.

Suggested reviewers

toubatbrian
theomonnom

Poem

🐰 Whisker-twitching, paws aligned,
Interruptions now refined!
Streams flow swift through channels true,
Turn by turn, the agents brew—
Version hops from 37 to 38's view! 🌟

🚥 Pre-merge checks | ✅ 1 | ❌ 2

❌ Failed checks (2 warnings)

Check name	Status	Explanation	Resolution
Description check	⚠️ Warning	The PR description is entirely empty—only template placeholders are present with no actual content filling the required sections (Description, Changes Made, Pre-Review Checklist results, Testing results, Additional Notes).	Complete the PR description by adding a clear description of changes, listing modifications related to interruption detector integration, checking pre-review items, documenting testing, and providing any relevant context.
Docstring Coverage	⚠️ Warning	Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (1 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title indicates a work-in-progress feature addition for agent activity interruption detector integration, which aligns with the code changes showing interruption detection implementation across multiple files.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

…imed out waiting for generation_created" event (parity with Python plugin) (#989)

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

lukasIO · 2026-01-23T16:21:18Z

@coderabbitai review

coderabbitai · 2026-01-23T16:21:28Z

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

coderabbitai

Actionable comments posted: 10

🤖 Fix all issues with AI agents

In `@agents/src/inference/interruption/InterruptionStream.ts`:
- Around line 50-55: The overlap-speech-started factory may emit undefined
speechDurationInS which later leads to NaN in the
Math.round(chunk.speechDurationInS * this.options.sampleRate) calculation;
update overlapSpeechStarted (and/or the consumer before multiplication) to
ensure speechDurationInS is a finite number (e.g., coalesce undefined to 0 or
validate with Number.isFinite) so the Math.round call always receives a numeric
value; specifically, adjust the overlapSpeechStarted return value or the code
that reads chunk.speechDurationInS so that chunk.speechDurationInS is a safe
numeric value before calling Math.round in the consumer that references
this.options.sampleRate.

In `@agents/src/inference/interruption/types.ts`:
- Around line 67-71: The optional fields on OverlapSpeechStarted
(speechDurationInS and userSpeakingSpan) are used without guards in
InterruptionStream.ts causing NaN in calculations; update the places that
compute Math.round(chunk.speechDurationInS * this.options.sampleRate) and
Date.now() - frame.speechDurationInS * 1000 to first check that
chunk.speechDurationInS / frame.speechDurationInS is not null/undefined (or
coerce to a safe default like 0) and handle the absent-case (e.g., early-return,
skip the branch, or use 0 as a fallback) so arithmetic only runs on defined
numeric values; locate references to OverlapSpeechStarted,
chunk.speechDurationInS, frame.speechDurationInS and this.options.sampleRate to
apply the guard.

In `@agents/src/stream/stream_channel.ts`:
- Around line 28-41: The addStreamInput async IIFE currently swallows errors and
writes without checking channel state; update the loop in addStreamInput (the
async IIFE using newInputStream.getReader() and await writer.write(value)) to
guard writes with a channel-closed check (e.g., if (writer.isClosed || isClosed)
break/skip) and wrap the read/write body in a try/catch so any error from
writer.write(...) or reader.read() is handled: on error release the reader, call
writer.abort(error) or writer.close() as appropriate, and surface/log the error
to avoid unhandled promise rejections; keep the existing finally block that
calls reader.releaseLock() so the reader is always released.

In `@agents/src/voice/agent_activity.ts`:
- Around line 1673-1678: The call to audioRecognition.onEndOfAgentSpeech uses an
object but the method signature expects a number; change the invocation inside
the isInterruptionDetectionEnabled block so that onEndOfAgentSpeech receives a
numeric timestamp (e.g., Date.now()) rather than an object, and keep the
subsequent call to restoreInterruptionByAudioActivity unchanged; update the call
site where audioRecognition.onEndOfAgentSpeech is invoked to pass the number
directly.
- Around line 1374-1378: The call to audioRecognition.OnEndOfAgentSpeech is
using the wrong case and will throw at runtime; change the invocation to
audioRecognition.onEndOfAgentSpeech(Date.now()) where the audioRecognition
object is used (within the block guarded by isInterruptionDetectionEnabled) and
keep the surrounding logic (including restoreInterruptionByAudioActivity) intact
so the only change is the method name casing.

In `@agents/src/voice/agent.ts`:
- Around line 128-130: The code ignores the legacy turnDetection option because
you always read from this.turnHandling; fix by preferring an explicit legacy
option when present: in the constructor where this.turnHandling and
this.turnDetection are set, check the incoming options.turnDetection (or the
destructured turnDetection param) and assign this.turnDetection =
options.turnDetection ?? this.turnHandling?.turnDetection (or if you have a
turnConfig sentinel, only migrate legacy when turnConfig is unset), ensuring the
legacy value is applied when provided; update the assignment site around
this.turnHandling / this.turnDetection to perform this conditional migration.
- Around line 151-154: The code risks a TypeError when turnHandling exists but
interruption is undefined; change references to use optional chaining on
interruption (e.g., read this.turnHandling?.interruption?.mode into
this._interruptionDetection) and update the check/assignment for
this._allowInterruptions to use the same safe chain (e.g., set
this._allowInterruptions = !!this.turnHandling?.interruption?.mode) so you never
access interruption.mode directly; update both occurrences (the assignment to
this._interruptionDetection and the conditional that sets
this._allowInterruptions) to use interruption?.

In `@agents/src/voice/audio_recognition.ts`:
- Around line 279-280: The current condition for building eventsToEmit
incorrectly treats emitFromIndex === 0 as falsy; update the check in
audio_recognition.ts so eventsToEmit is populated when emitFromIndex is a valid
numeric index (including 0) and shouldFlush is true — e.g., replace the truthy
check with an explicit numeric/index validation (like typeof emitFromIndex ===
'number' && emitFromIndex >= 0 && shouldFlush) before calling
this.transcriptBuffer.slice(emitFromIndex).
- Around line 370-371: The condition else if (this.transcriptBuffer) is always
truthy because transcriptBuffer is an array; change the branch to check for
content by using this.transcriptBuffer.length > 0 (or a safe check like
Array.isArray(this.transcriptBuffer) && this.transcriptBuffer.length > 0) before
calling await this.flushHeldTranscripts(); update the condition in the method
that contains transcriptBuffer and flushHeldTranscripts to use this length check
so flushHeldTranscripts() only runs when there are held transcripts.
- Around line 218-227: The guard in onEndOfOverlapSpeech is inverted: change the
early return so the function only bails out when interruption detection is
disabled (i.e. replace the current if (this.isInterruptionEnabled) return; with
if (!this.isInterruptionEnabled) return;), so that
trySendInterruptionSentinel(InterruptionStreamSentinel.overlapSpeechEnded()) is
invoked when interruptions are enabled; keep the existing user speaking span
handling (userSpeakingSpan.isRecording() /
setAttribute(traceTypes.ATTR_IS_INTERRUPTION, 'false')) as-is so it still runs
before sending the sentinel.

🧹 Nitpick comments (1)

agents/src/voice/agent_activity.ts (1)

2453-2496: Complex negated condition in resolveInterruptionDetector.

The condition at lines 2456-2465 is hard to follow due to the outer negation. Consider restructuring for clarity.

♻️ Suggested refactor for readability

+  private isInterruptionDetectionCompatible(): boolean {
+    return (
+      !!this.stt &&
+      this.stt.capabilities.alignedTranscript &&
+      this.stt.capabilities.streaming &&
+      !!this.vad &&
+      this.turnDetection !== 'manual' &&
+      this.turnDetection !== 'realtime_llm' &&
+      !(this.llm instanceof RealtimeModel)
+    );
+  }

   private resolveInterruptionDetector(): AdaptiveInterruptionDetector | undefined {
     const interruptionDetection =
       this.agent.interruptionDetection ?? this.agentSession.interruptionDetection;
-    if (
-      !(
-        this.stt &&
-        this.stt.capabilities.alignedTranscript &&
-        this.stt.capabilities.streaming &&
-        this.vad &&
-        this.turnDetection !== 'manual' &&
-        this.turnDetection !== 'realtime_llm' &&
-        !(this.llm instanceof RealtimeModel)
-      )
-    ) {
+    if (!this.isInterruptionDetectionCompatible()) {

📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between d32c202 and 254cc7f.

📒 Files selected for processing (44)

.changeset/large-cars-pull.md
.changeset/lazy-spies-worry.md
.changeset/shy-rockets-fry.md
.changeset/tender-glasses-burn.md
agents/CHANGELOG.md
agents/package.json
agents/src/inference/interruption/InterruptionStream.ts
agents/src/inference/interruption/types.ts
agents/src/stream/stream_channel.ts
agents/src/voice/agent.ts
agents/src/voice/agent_activity.ts
agents/src/voice/audio_recognition.ts
plugins/anam/CHANGELOG.md
plugins/anam/package.json
plugins/baseten/package.json
plugins/bey/CHANGELOG.md
plugins/bey/package.json
plugins/cartesia/CHANGELOG.md
plugins/cartesia/package.json
plugins/deepgram/CHANGELOG.md
plugins/deepgram/package.json
plugins/elevenlabs/CHANGELOG.md
plugins/elevenlabs/package.json
plugins/google/CHANGELOG.md
plugins/google/package.json
plugins/google/src/beta/realtime/realtime_api.ts
plugins/hedra/package.json
plugins/inworld/package.json
plugins/livekit/CHANGELOG.md
plugins/livekit/package.json
plugins/neuphonic/CHANGELOG.md
plugins/neuphonic/package.json
plugins/openai/CHANGELOG.md
plugins/openai/package.json
plugins/resemble/CHANGELOG.md
plugins/resemble/package.json
plugins/rime/CHANGELOG.md
plugins/rime/package.json
plugins/silero/CHANGELOG.md
plugins/silero/package.json
plugins/test/CHANGELOG.md
plugins/test/package.json
plugins/xai/package.json
turbo.json

💤 Files with no reviewable changes (3)

.changeset/large-cars-pull.md
.changeset/lazy-spies-worry.md
.changeset/tender-glasses-burn.md

🧰 Additional context used

📓 Path-based instructions (3)

**/*.{ts,tsx,js,jsx}

📄 CodeRabbit inference engine (.cursor/rules/agent-core.mdc)

Add SPDX-FileCopyrightText and SPDX-License-Identifier headers to all newly added files with '// SPDX-FileCopyrightText: 2025 LiveKit, Inc.' and '// SPDX-License-Identifier: Apache-2.0'

Files:

agents/src/inference/interruption/types.ts
agents/src/stream/stream_channel.ts
plugins/google/src/beta/realtime/realtime_api.ts
agents/src/voice/audio_recognition.ts
agents/src/inference/interruption/InterruptionStream.ts
agents/src/voice/agent_activity.ts
agents/src/voice/agent.ts

**/*.{ts,tsx}?(test|example|spec)

📄 CodeRabbit inference engine (.cursor/rules/agent-core.mdc)

When testing inference LLM, always use full model names from agents/src/inference/models.ts (e.g., 'openai/gpt-4o-mini' instead of 'gpt-4o-mini')

Files:

agents/src/inference/interruption/types.ts
agents/src/stream/stream_channel.ts
plugins/google/src/beta/realtime/realtime_api.ts
agents/src/voice/audio_recognition.ts
agents/src/inference/interruption/InterruptionStream.ts
agents/src/voice/agent_activity.ts
agents/src/voice/agent.ts

**/*.{ts,tsx}?(test|example)

📄 CodeRabbit inference engine (.cursor/rules/agent-core.mdc)

Initialize logger before using any LLM functionality with initializeLogger({ pretty: true }) from '@livekit/agents'

Files:

agents/src/inference/interruption/types.ts
agents/src/stream/stream_channel.ts
plugins/google/src/beta/realtime/realtime_api.ts
agents/src/voice/audio_recognition.ts
agents/src/inference/interruption/InterruptionStream.ts
agents/src/voice/agent_activity.ts
agents/src/voice/agent.ts

🧬 Code graph analysis (3)

agents/src/voice/audio_recognition.ts (6)

agents/src/inference/interruption/types.ts (2)

InterruptionEvent (17-28)

InterruptionSentinel (84-89)

agents/src/inference/interruption/AdaptiveInterruptionDetector.ts (1)

AdaptiveInterruptionDetector (26-192)

agents/src/stt/stt.ts (1)

SpeechEvent (65-70)

agents/src/stream/stream_channel.ts (2)

StreamChannel (7-14)

createStreamChannel (16-60)

agents/src/inference/interruption/InterruptionStream.ts (2)

InterruptionSentinel (36-36)

InterruptionStreamSentinel (41-64)

agents/src/stt/index.ts (1)

SpeechEventType (11-11)

agents/src/inference/interruption/InterruptionStream.ts (1)

agents/src/inference/interruption/types.ts (2)

AgentSpeechStarted (59-61)

AgentSpeechEnded (63-65)

agents/src/voice/agent.ts (3)

agents/src/voice/agent_session.ts (1)

TurnDetectionMode (90-90)

agents/src/voice/turn_config/turnHandling.ts (1)

TurnHandlingConfig (9-46)

agents/src/voice/turn_config/interruption.ts (1)

InterruptionConfig (4-37)

🔇 Additional comments (47)

plugins/cartesia/CHANGELOG.md (1)

3-9: Changelog entry looks good.
Clear and consistent with prior version sections.

plugins/hedra/package.json (1)

3-3: Version bump looks good.

No concerns with this change.

plugins/rime/CHANGELOG.md (1)

3-8: Changelog entry looks good.

Clear, consistent dependency bump entry for 1.0.38.

plugins/silero/CHANGELOG.md (1)

3-9: LGTM! Changelog entry follows standard format.

The new version 1.0.38 entry is properly formatted and consistent with previous changelog entries. The dependency update to @livekit/agents@1.0.38 aligns with the broader version bump across the monorepo.

plugins/bey/CHANGELOG.md (1)

3-9: The changelog entry is correct and complete. The bey plugin has no code changes—only a dependency update to @livekit/agents@1.0.38. The format matches the historical pattern across all previous versions (1.0.37, 1.0.36, 1.0.35), which similarly document dependency-only updates. The interruption detection features mentioned in the PR title and AI summary are changes in the core @livekit/agents package itself, not the bey plugin, so the changelog correctly reflects only the dependency update.

Likely an incorrect or invalid review comment.

plugins/rime/package.json (1)

3-3: LGTM!

Version bump to 1.0.38 aligns with the coordinated patch release across the monorepo.

plugins/inworld/package.json (1)

3-3: LGTM!

Version bump to 1.0.38 is consistent with the coordinated release.

agents/package.json (1)

3-3: LGTM!

Core agents package version bump to 1.0.38 aligns with the new interruption detection integration and related features in this PR.

plugins/bey/package.json (1)

3-3: LGTM!

Version bump to 1.0.38 is consistent with the coordinated release.

plugins/elevenlabs/package.json (1)

3-3: LGTM!

Version bump to 1.0.38 is consistent with the coordinated release.

plugins/silero/package.json (1)

3-3: LGTM!

Version bump to 1.0.38 is consistent with the coordinated release.

plugins/cartesia/package.json (1)

3-3: LGTM!

Version bump to 1.0.38 aligns with the coordinated release across the monorepo.

plugins/baseten/package.json (1)

3-3: LGTM!

Version bump to 1.0.38 is consistent with the release.

plugins/google/package.json (1)

3-3: LGTM!

Version bump to 1.0.38 aligns with the release and accompanying changelog entry documenting the realtime generation sequencing fix.

plugins/xai/package.json (1)

3-3: LGTM!

Version bump to 1.0.38 is consistent with the coordinated release.

plugins/livekit/CHANGELOG.md (1)

3-8: LGTM!

Changelog entry for 1.0.38 follows the established format and properly documents the dependency update to @livekit/agents@1.0.38 with linked commit references.

plugins/resemble/CHANGELOG.md (1)

3-8: LGTM!

Changelog entry for 1.0.38 is properly formatted and documents the dependency update to @livekit/agents@1.0.38, consistent with other plugin changelog entries in this release.

plugins/openai/package.json (1)

3-3: Version bump looks consistent. Line 3 cleanly updates the package version to 1.0.38.

plugins/google/CHANGELOG.md (1)

3-11: Changelog entry is clear and release-ready. Lines 3-11 document the patch update and dependency bump concisely.

plugins/resemble/package.json (1)

3-3: Version bump looks consistent. Line 3 cleanly updates the package version to 1.0.38.

plugins/livekit/package.json (1)

3-3: Version bump looks consistent. Line 3 cleanly updates the package version to 1.0.38.

plugins/deepgram/package.json (1)

3-3: Version bump looks consistent. Line 3 cleanly updates the package version to 1.0.38.

plugins/neuphonic/package.json (1)

3-3: Version bump looks consistent. Line 3 cleanly updates the package version to 1.0.38.

plugins/anam/package.json (1)

3-3: Version bump looks good.
No concerns with the patch release update.

plugins/neuphonic/CHANGELOG.md (1)

3-9: Changelog entry is consistent.
The patch notes align with the dependency bump.

agents/CHANGELOG.md (1)

3-11: Changelog update looks good.
Patch notes are clear and well-scoped.

plugins/openai/CHANGELOG.md (1)

3-10: Changelog entry is aligned.
No issues with the patch notes or dependency update.

plugins/deepgram/CHANGELOG.md (1)

3-10: Changelog entry looks good.
No concerns with this patch update.

turbo.json (1)

56-56: Global env addition is fine.
No issues with adding LK_GOOGLE_DEBUG to the shared environment list.

plugins/elevenlabs/CHANGELOG.md (1)

3-8: LGTM!

Standard changeset-generated changelog entry for version 1.0.38 with dependency update to @livekit/agents@1.0.38. Format is consistent with previous entries.

plugins/test/CHANGELOG.md (1)

3-8: LGTM!

Changelog entry for version 1.0.38 follows the established pattern with updated dependency references.

plugins/test/package.json (1)

3-3: LGTM!

Version bump to 1.0.38 aligns with the changelog entry and the broader package release.

plugins/anam/CHANGELOG.md (1)

3-8: LGTM!

Changelog entry for version 1.0.38 with dependency update follows the established format.

agents/src/stream/stream_channel.ts (1)

13-13: Interface addition looks good.

The new addStreamInput method enables dynamically piping additional input streams into the channel, which supports the interruption detection integration mentioned in the PR objectives.

agents/src/inference/interruption/InterruptionStream.ts (1)

42-48: LGTM: Method rename improves clarity.

Renaming speechStarted() to agentSpeechStarted() and speechEnded() to agentSpeechEnded() better conveys the intent and aligns with the agent-facing interruption API.

plugins/google/src/beta/realtime/realtime_api.ts (4)

46-47: LGTM: Debug flag for verbose logging.

The LK_GOOGLE_DEBUG environment variable provides a clean way to enable verbose client/server logging without flooding production logs.

961-1000: Generation lifecycle refactoring looks correct.

The new logic properly handles:

Interrupted events without active generation

Clearing empty server content

Starting new generation based on isNewGeneration helper

The separation of concerns with isNewGeneration() improves readability.

1370-1376: Verify interaction between interrupted handling and pendingGenerationFut.

At line 1370, handleInputSpeechStarted() is called when serverContent.interrupted is true and there's no pendingGenerationFut. However, this same condition is also checked earlier at line 969-971 within shouldStartNewGeneration block. Confirm that calling handleInputSpeechStarted() twice in certain edge cases is intentional/idempotent.

1531-1545: LGTM: Clean helper method for generation determination.

The isNewGeneration method properly checks for toolCall or meaningful serverContent fields to determine if a new generation should start.

agents/src/voice/agent.ts (1)

195-201: LGTM: New getters for interruption configuration.

The interruptionDetection and allowInterruptions getters provide clean read-only access to the interruption configuration state.

agents/src/voice/audio_recognition.ts (2)

111-118: LGTM: Interruption detection state initialization.

The new fields properly track interruption detection state including the stream channel, buffer, and agent speaking status.

546-550: LGTM: Interruption event routing.

The onInterruptionEvent method correctly routes interruption events to the recognition hooks.

agents/src/voice/agent_activity.ts (4)

110-113: LGTM: Interruption detection state flags.

The new flags properly track interruption detection enablement and audio activity interruption state.

406-424: LGTM: Extended updateOptions with turnDetection.

The method signature is appropriately extended to accept turnDetection, and the default interruption behavior is updated accordingly.

753-759: LGTM: onInterruption handler properly chains behaviors.

The handler correctly restores audio activity interruption, triggers interruption, and notifies audio recognition.

2171-2175: LGTM: Agent state transition after tool execution.

The state properly transitions to 'thinking' when tool outputs exist, or 'listening' when speaking completes without tool calls.

.changeset/shy-rockets-fry.md (1)

1-5: LGTM: Changeset accurately describes the PR scope.

The patch version bump is appropriate for the state transition fixes and new interruption support features.

_{✏️ Tip: You can disable this entire section by setting review_details to false in your review settings.}

coderabbitai · 2026-01-23T16:26:46Z

agents/src/inference/interruption/InterruptionStream.ts

  static overlapSpeechStarted(
-    speechDurationInS: number,
-    userSpeakingSpan: Span,
+    speechDurationInS?: number,
+    userSpeakingSpan?: Span,
  ): OverlapSpeechStarted {
    return { type: 'overlap-speech-started', speechDurationInS, userSpeakingSpan };
  }


⚠️ Potential issue | 🟠 Major

Potential NaN when speechDurationInS is undefined.

When speechDurationInS is undefined, the calculation at line 244 (Math.round(chunk.speechDurationInS * this.options.sampleRate)) will produce NaN, causing incorrect buffer handling.

🐛 Proposed fix

} else if (chunk.type === 'overlap-speech-started' && agentSpeechStarted) { this.userSpeakingSpan = chunk.userSpeakingSpan; this.logger.debug('overlap speech started, starting interruption inference'); overlapSpeechStarted = true; accumulatedSamples = 0; // Include both speech duration and audio prefix duration for context + const speechDuration = chunk.speechDurationInS ?? 0; const shiftSize = Math.min( startIdx, - Math.round(chunk.speechDurationInS * this.options.sampleRate) + + Math.round(speechDuration * this.options.sampleRate) + Math.round(this.options.audioPrefixDurationInS * this.options.sampleRate), );

🤖 Prompt for AI Agents

In `@agents/src/inference/interruption/InterruptionStream.ts` around lines 50 - 55, The overlap-speech-started factory may emit undefined speechDurationInS which later leads to NaN in the Math.round(chunk.speechDurationInS * this.options.sampleRate) calculation; update overlapSpeechStarted (and/or the consumer before multiplication) to ensure speechDurationInS is a finite number (e.g., coalesce undefined to 0 or validate with Number.isFinite) so the Math.round call always receives a numeric value; specifically, adjust the overlapSpeechStarted return value or the code that reads chunk.speechDurationInS so that chunk.speechDurationInS is a safe numeric value before calling Math.round in the consumer that references this.options.sampleRate.

coderabbitai · 2026-01-23T16:26:46Z

agents/src/inference/interruption/types.ts

 export interface OverlapSpeechStarted {
  type: 'overlap-speech-started';
-  speechDurationInS: number;
-  userSpeakingSpan: Span;
+  speechDurationInS?: number;
+  userSpeakingSpan?: Span;
 }


⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash # Search for usages of OverlapSpeechStarted and check how speechDurationInS and userSpeakingSpan are accessed rg -n -A5 -B2 'OverlapSpeechStarted' --type=ts echo "---" # Check for direct property access that might fail with undefined rg -n '\.speechDurationInS|\.userSpeakingSpan' --type=ts -C3

Repository: livekit/agents-js

Length of output: 11029

🏁 Script executed:

#!/bin/bash # Check context around the problematic lines to confirm they lack null checks sed -n '240,250p' agents/src/inference/interruption/InterruptionStream.ts echo "---" sed -n '343,350p' agents/src/inference/interruption/InterruptionStream.ts

Repository: livekit/agents-js

Length of output: 1146

🏁 Script executed:

#!/bin/bash # Check if there are guards before lines 244 and 346 that ensure speechDurationInS is defined sed -n '236,246p' agents/src/inference/interruption/InterruptionStream.ts echo "---" # Check context for line 346 sed -n '340,350p' agents/src/inference/interruption/InterruptionStream.ts echo "---" # Search for any guards on overlapSpeechStarted or speechDurationInS rg -B10 'chunk.speechDurationInS' agents/src/inference/interruption/InterruptionStream.ts | head -30

Repository: livekit/agents-js

Length of output: 1774

Fix unsafe accesses to optional speechDurationInS field before using in calculations.

Making speechDurationInS and userSpeakingSpan optional in OverlapSpeechStarted introduces runtime bugs in the consumer code. Two places access speechDurationInS without null checks:

Line 244 in InterruptionStream.ts: Math.round(chunk.speechDurationInS * this.options.sampleRate) will return NaN if speechDurationInS is undefined

Line 346 in InterruptionStream.ts: Date.now() - frame.speechDurationInS * 1000 will return NaN if speechDurationInS is undefined

Both branches only check the message type ('overlap-speech-started') but not whether the optional field is defined. Add null checks before arithmetic operations or provide default values.

🤖 Prompt for AI Agents

In `@agents/src/inference/interruption/types.ts` around lines 67 - 71, The optional fields on OverlapSpeechStarted (speechDurationInS and userSpeakingSpan) are used without guards in InterruptionStream.ts causing NaN in calculations; update the places that compute Math.round(chunk.speechDurationInS * this.options.sampleRate) and Date.now() - frame.speechDurationInS * 1000 to first check that chunk.speechDurationInS / frame.speechDurationInS is not null/undefined (or coerce to a safe default like 0) and handle the absent-case (e.g., early-return, skip the branch, or use 0 as a fallback) so arithmetic only runs on defined numeric values; locate references to OverlapSpeechStarted, chunk.speechDurationInS, frame.speechDurationInS and this.options.sampleRate to apply the guard.

coderabbitai · 2026-01-23T16:26:46Z

agents/src/stream/stream_channel.ts

+    addStreamInput: (newInputStream) => {
+      const reader = newInputStream.getReader();
+      (async () => {
+        try {
+          while (true) {
+            const { done, value } = await reader.read();
+            if (done) break;
+            await writer.write(value);
+          }
+        } finally {
+          reader.releaseLock();
+        }
+      })();
+    },


⚠️ Potential issue | 🟠 Major

Add error handling for the async read loop.

The fire-and-forget async IIFE silently swallows errors. If writer.write() fails (e.g., channel aborted or closed), there's no error handling, which could lead to unhandled promise rejections or silent failures.

Additionally, there's no check for isClosed before writing, so attempts to write to a closed channel will throw.

🐛 Proposed fix with error handling

addStreamInput: (newInputStream) => { const reader = newInputStream.getReader(); (async () => { try { while (true) { + if (isClosed) break; const { done, value } = await reader.read(); if (done) break; await writer.write(value); } + } catch (e) { + // Stream was likely closed or aborted; suppress expected errors + if (!(e instanceof Error && e.name === 'TypeError')) { + console.error('Error in addStreamInput:', e); + } } finally { reader.releaseLock(); } })(); },

🤖 Prompt for AI Agents

In `@agents/src/stream/stream_channel.ts` around lines 28 - 41, The addStreamInput async IIFE currently swallows errors and writes without checking channel state; update the loop in addStreamInput (the async IIFE using newInputStream.getReader() and await writer.write(value)) to guard writes with a channel-closed check (e.g., if (writer.isClosed || isClosed) break/skip) and wrap the read/write body in a try/catch so any error from writer.write(...) or reader.read() is handled: on error release the reader, call writer.abort(error) or writer.close() as appropriate, and surface/log the error to avoid unhandled promise rejections; keep the existing finally block that calls reader.releaseLock() so the reader is always released.

coderabbitai · 2026-01-23T16:26:46Z

agents/src/voice/agent_activity.ts

+      if (this.isInterruptionDetectionEnabled && this.audioRecognition) {
+        this.audioRecognition.OnEndOfAgentSpeech(Date.now());
+      }
+      this.restoreInterruptionByAudioActivity();
    }


⚠️ Potential issue | 🔴 Critical

Typo in method call: OnEndOfAgentSpeech should be onEndOfAgentSpeech.

The method is called with a capital 'O' which will cause a runtime error since JavaScript is case-sensitive.

🐛 Proposed fix

this.agentSession._updateAgentState('listening'); if (this.isInterruptionDetectionEnabled && this.audioRecognition) { - this.audioRecognition.OnEndOfAgentSpeech(Date.now()); + this.audioRecognition.onEndOfAgentSpeech(Date.now()); } this.restoreInterruptionByAudioActivity();

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

if (this.isInterruptionDetectionEnabled && this.audioRecognition) {

this.audioRecognition.OnEndOfAgentSpeech(Date.now());

}

this.restoreInterruptionByAudioActivity();

}

if (this.isInterruptionDetectionEnabled && this.audioRecognition) {

this.audioRecognition.onEndOfAgentSpeech(Date.now());

}

this.restoreInterruptionByAudioActivity();

🤖 Prompt for AI Agents

In `@agents/src/voice/agent_activity.ts` around lines 1374 - 1378, The call to audioRecognition.OnEndOfAgentSpeech is using the wrong case and will throw at runtime; change the invocation to audioRecognition.onEndOfAgentSpeech(Date.now()) where the audioRecognition object is used (within the block guarded by isInterruptionDetectionEnabled) and keep the surrounding logic (including restoreInterruptionByAudioActivity) intact so the only change is the method name casing.

coderabbitai · 2026-01-23T16:26:46Z

agents/src/voice/agent_activity.ts

+      if (this.isInterruptionDetectionEnabled && this.audioRecognition) {
+        {
+          this.audioRecognition.onEndOfAgentSpeech({ ignoreUserTranscriptUntil: Date.now() });
+          this.restoreInterruptionByAudioActivity();
+        }
+      }


⚠️ Potential issue | 🔴 Critical

Inconsistent method signature: passes object instead of number.

At line 1675, onEndOfAgentSpeech is called with an object { ignoreUserTranscriptUntil: Date.now() }, but the method signature in audio_recognition.ts (line 181) expects a number: onEndOfAgentSpeech(ignoreUserTranscriptUntil: number).

🐛 Proposed fix

if (this.isInterruptionDetectionEnabled && this.audioRecognition) { - { - this.audioRecognition.onEndOfAgentSpeech({ ignoreUserTranscriptUntil: Date.now() }); - this.restoreInterruptionByAudioActivity(); - } + this.audioRecognition.onEndOfAgentSpeech(Date.now()); + this.restoreInterruptionByAudioActivity(); }

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

if (this.isInterruptionDetectionEnabled && this.audioRecognition) {

{

this.audioRecognition.onEndOfAgentSpeech({ ignoreUserTranscriptUntil: Date.now() });

this.restoreInterruptionByAudioActivity();

}

}

if (this.isInterruptionDetectionEnabled && this.audioRecognition) {

this.audioRecognition.onEndOfAgentSpeech(Date.now());

this.restoreInterruptionByAudioActivity();

}

🤖 Prompt for AI Agents

In `@agents/src/voice/agent_activity.ts` around lines 1673 - 1678, The call to audioRecognition.onEndOfAgentSpeech uses an object but the method signature expects a number; change the invocation inside the isInterruptionDetectionEnabled block so that onEndOfAgentSpeech receives a numeric timestamp (e.g., Date.now()) rather than an object, and keep the subsequent call to restoreInterruptionByAudioActivity unchanged; update the call site where audioRecognition.onEndOfAgentSpeech is invoked to pass the number directly.

coderabbitai · 2026-01-23T16:26:47Z

agents/src/voice/agent.ts

+    this.turnHandling = turnHandling; // TODO migrate legacy options to new turn handling config when turnConfig is unset
+
+    this.turnDetection = this.turnHandling?.turnDetection;


⚠️ Potential issue | 🟠 Major

Legacy turnDetection option is ignored.

The turnDetection parameter is destructured from options but never used. Line 130 only reads from turnHandling?.turnDetection, so the deprecated turnDetection option has no effect.

🐛 Proposed fix to migrate legacy option

this.turnHandling = turnHandling; // TODO migrate legacy options to new turn handling config when turnConfig is unset - this.turnDetection = this.turnHandling?.turnDetection; + this.turnDetection = this.turnHandling?.turnDetection ?? turnDetection;

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

this.turnHandling = turnHandling; // TODO migrate legacy options to new turn handling config when turnConfig is unset

this.turnDetection = this.turnHandling?.turnDetection;

this.turnHandling = turnHandling; // TODO migrate legacy options to new turn handling config when turnConfig is unset

this.turnDetection = this.turnHandling?.turnDetection ?? turnDetection;

🤖 Prompt for AI Agents

In `@agents/src/voice/agent.ts` around lines 128 - 130, The code ignores the legacy turnDetection option because you always read from this.turnHandling; fix by preferring an explicit legacy option when present: in the constructor where this.turnHandling and this.turnDetection are set, check the incoming options.turnDetection (or the destructured turnDetection param) and assign this.turnDetection = options.turnDetection ?? this.turnHandling?.turnDetection (or if you have a turnConfig sentinel, only migrate legacy when turnConfig is unset), ensuring the legacy value is applied when provided; update the assignment site around this.turnHandling / this.turnDetection to perform this conditional migration.

coderabbitai · 2026-01-23T16:26:47Z

agents/src/voice/agent.ts

+    this._interruptionDetection = this.turnHandling?.interruption.mode;
+    if (this.turnHandling?.interruption.mode !== undefined) {
+      this._allowInterruptions = !!this.turnHandling.interruption.mode;
+    }


⚠️ Potential issue | 🔴 Critical

Potential runtime error when turnHandling exists but interruption is undefined.

Accessing turnHandling?.interruption.mode will throw a TypeError if turnHandling is defined but interruption is not. Use optional chaining on interruption as well.

🐛 Proposed fix

- this._interruptionDetection = this.turnHandling?.interruption.mode; - if (this.turnHandling?.interruption.mode !== undefined) { - this._allowInterruptions = !!this.turnHandling.interruption.mode; + this._interruptionDetection = this.turnHandling?.interruption?.mode; + if (this.turnHandling?.interruption?.mode !== undefined) { + this._allowInterruptions = !!this.turnHandling.interruption.mode; }

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

this._interruptionDetection = this.turnHandling?.interruption.mode;

if (this.turnHandling?.interruption.mode !== undefined) {

this._allowInterruptions = !!this.turnHandling.interruption.mode;

}

this._interruptionDetection = this.turnHandling?.interruption?.mode;

if (this.turnHandling?.interruption?.mode !== undefined) {

this._allowInterruptions = !!this.turnHandling.interruption?.mode;

}

🤖 Prompt for AI Agents

In `@agents/src/voice/agent.ts` around lines 151 - 154, The code risks a TypeError when turnHandling exists but interruption is undefined; change references to use optional chaining on interruption (e.g., read this.turnHandling?.interruption?.mode into this._interruptionDetection) and update the check/assignment for this._allowInterruptions to use the same safe chain (e.g., set this._allowInterruptions = !!this.turnHandling?.interruption?.mode) so you never access interruption.mode directly; update both occurrences (the assignment to this._interruptionDetection and the conditional that sets this._allowInterruptions) to use interruption?.

coderabbitai · 2026-01-23T16:26:47Z

agents/src/voice/audio_recognition.ts

+  async onEndOfOverlapSpeech(userSpeakingSpan?: Span) {
+    if (this.isInterruptionEnabled) {
+      return;
+    }
+    if (userSpeakingSpan && userSpeakingSpan.isRecording()) {
+      userSpeakingSpan.setAttribute(traceTypes.ATTR_IS_INTERRUPTION, 'false');
+    }
+
+    return this.trySendInterruptionSentinel(InterruptionStreamSentinel.overlapSpeechEnded());
+  }


⚠️ Potential issue | 🔴 Critical

Logic appears inverted: returns early when interruption IS enabled.

onEndOfOverlapSpeech returns early (line 220) when isInterruptionEnabled is true, meaning the sentinel is only sent when interruption is disabled. This seems backwards—the sentinel should be sent when interruption detection is enabled.

🐛 Proposed fix

async onEndOfOverlapSpeech(userSpeakingSpan?: Span) { - if (this.isInterruptionEnabled) { + if (!this.isInterruptionEnabled) { return; } if (userSpeakingSpan && userSpeakingSpan.isRecording()) { userSpeakingSpan.setAttribute(traceTypes.ATTR_IS_INTERRUPTION, 'false'); } return this.trySendInterruptionSentinel(InterruptionStreamSentinel.overlapSpeechEnded()); }

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

async onEndOfOverlapSpeech(userSpeakingSpan?: Span) {

if (this.isInterruptionEnabled) {

return;

}

if (userSpeakingSpan && userSpeakingSpan.isRecording()) {

userSpeakingSpan.setAttribute(traceTypes.ATTR_IS_INTERRUPTION, 'false');

}

return this.trySendInterruptionSentinel(InterruptionStreamSentinel.overlapSpeechEnded());

}

async onEndOfOverlapSpeech(userSpeakingSpan?: Span) {

if (!this.isInterruptionEnabled) {

return;

}

if (userSpeakingSpan && userSpeakingSpan.isRecording()) {

userSpeakingSpan.setAttribute(traceTypes.ATTR_IS_INTERRUPTION, 'false');

}

return this.trySendInterruptionSentinel(InterruptionStreamSentinel.overlapSpeechEnded());

}

🤖 Prompt for AI Agents

In `@agents/src/voice/audio_recognition.ts` around lines 218 - 227, The guard in onEndOfOverlapSpeech is inverted: change the early return so the function only bails out when interruption detection is disabled (i.e. replace the current if (this.isInterruptionEnabled) return; with if (!this.isInterruptionEnabled) return;), so that trySendInterruptionSentinel(InterruptionStreamSentinel.overlapSpeechEnded()) is invoked when interruptions are enabled; keep the existing user speaking span handling (userSpeakingSpan.isRecording() / setAttribute(traceTypes.ATTR_IS_INTERRUPTION, 'false')) as-is so it still runs before sending the sentinel.

coderabbitai · 2026-01-23T16:26:47Z

agents/src/voice/audio_recognition.ts

+    const eventsToEmit =
+      emitFromIndex && shouldFlush ? this.transcriptBuffer.slice(emitFromIndex) : [];


⚠️ Potential issue | 🟠 Major

Index 0 is falsy: emitFromIndex && shouldFlush fails when emitFromIndex === 0.

When emitFromIndex is 0, the condition emitFromIndex && shouldFlush evaluates to false, incorrectly skipping events that should be emitted.

🐛 Proposed fix

const eventsToEmit = - emitFromIndex && shouldFlush ? this.transcriptBuffer.slice(emitFromIndex) : []; + emitFromIndex !== null && shouldFlush ? this.transcriptBuffer.slice(emitFromIndex) : [];

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

const eventsToEmit =

emitFromIndex && shouldFlush ? this.transcriptBuffer.slice(emitFromIndex) : [];

const eventsToEmit =

emitFromIndex !== null && shouldFlush ? this.transcriptBuffer.slice(emitFromIndex) : [];

🤖 Prompt for AI Agents

In `@agents/src/voice/audio_recognition.ts` around lines 279 - 280, The current condition for building eventsToEmit incorrectly treats emitFromIndex === 0 as falsy; update the check in audio_recognition.ts so eventsToEmit is populated when emitFromIndex is a valid numeric index (including 0) and shouldFlush is true — e.g., replace the truthy check with an explicit numeric/index validation (like typeof emitFromIndex === 'number' && emitFromIndex >= 0 && shouldFlush) before calling this.transcriptBuffer.slice(emitFromIndex).

coderabbitai · 2026-01-23T16:26:47Z

agents/src/voice/audio_recognition.ts

+      } else if (this.transcriptBuffer) {
+        await this.flushHeldTranscripts();


⚠️ Potential issue | 🟡 Minor

Condition always truthy: this.transcriptBuffer is an array.

The check else if (this.transcriptBuffer) is always truthy since transcriptBuffer is initialized as an array. This should check transcriptBuffer.length > 0.

🐛 Proposed fix

- } else if (this.transcriptBuffer) { + } else if (this.transcriptBuffer.length > 0) { await this.flushHeldTranscripts(); // no return here to allow the new event to be processed normally }

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

} else if (this.transcriptBuffer) {

await this.flushHeldTranscripts();

} else if (this.transcriptBuffer.length > 0) {

await this.flushHeldTranscripts();

🤖 Prompt for AI Agents

In `@agents/src/voice/audio_recognition.ts` around lines 370 - 371, The condition else if (this.transcriptBuffer) is always truthy because transcriptBuffer is an array; change the branch to check for content by using this.transcriptBuffer.length > 0 (or a safe check like Array.isArray(this.transcriptBuffer) && this.transcriptBuffer.length > 0) before calling await this.flushHeldTranscripts(); update the condition in the method that contains transcriptBuffer and flushHeldTranscripts to use this length check so flushHeldTranscripts() only runs when there are held transcripts.

wip agent activity interruption detector integration

2931901

lukasIO and others added 9 commits January 22, 2026 16:34

wip audio recognition

ff58073

fix: Fix Google realtime generation lifecycle to fix "generateReply t…

64deb00

…imed out waiting for generation_created" event (parity with Python plugin) (#989)

Version Packages (#987)

3e2fbb2

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

Agent state fixes and interim transcript interruption support (#992)

15dd71b

Merge branch 'main' into lukas/interruption-integration

078fdbd

more audio recognition updates

149e68f

add stream input

6fa0b88

finish audio recognition

6e91313

fix callers

254cc7f

lukasIO marked this pull request as ready for review January 23, 2026 16:21

coderabbitai bot reviewed Jan 23, 2026

View reviewed changes

fix typo

157fcf5

		this.turnHandling = turnHandling; // TODO migrate legacy options to new turn handling config when turnConfig is unset

		this.turnDetection = this.turnHandling?.turnDetection;

		const eventsToEmit =
		emitFromIndex && shouldFlush ? this.transcriptBuffer.slice(emitFromIndex) : [];

		} else if (this.transcriptBuffer) {
		await this.flushHeldTranscripts();

wip agent activity interruption detector integration #991

Are you sure you want to change the base?

wip agent activity interruption detector integration #991

Uh oh!

Conversation

lukasIO commented Jan 22, 2026 • edited by coderabbitai bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Description

Changes Made

Pre-Review Checklist

Testing

Additional Notes

Summary by CodeRabbit

Release Notes v1.0.38

Uh oh!

changeset-bot bot commented Jan 22, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

🦋 Changeset detected

Uh oh!

coderabbitai bot commented Jan 22, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Review skipped

Walkthrough

Changes

Sequence Diagram

Estimated code review effort

Possibly related PRs

Suggested reviewers

Poem

Uh oh!

lukasIO commented Jan 23, 2026

Uh oh!

coderabbitai bot commented Jan 23, 2026

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Jan 23, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Jan 23, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Jan 23, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Jan 23, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Jan 23, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Jan 23, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Jan 23, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Jan 23, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Jan 23, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Jan 23, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

lukasIO commented Jan 22, 2026 •

edited by coderabbitai bot

Loading

changeset-bot bot commented Jan 22, 2026 •

edited

Loading

coderabbitai bot commented Jan 22, 2026 •

edited

Loading