
Feat/barge in #1002

Open
lukasIO wants to merge 36 commits into main from feat/barge-in

@lukasIO
Contributor

@lukasIO lukasIO commented Jan 29, 2026

Description

Adds improved interruption handling capabilities.

  • new TurnHandling config
  • support for barge-in detection

TODO

  • ensure migrated options are exposed consistently and redundant fields are being deprecated

Summary by CodeRabbit

  • New Features

    • Adaptive interruption (real-time barge‑in) with HTTP and WebSocket transports, new interruption stream/streaming events, and session/agent turn-handling configuration.
    • Audio recognition and session APIs expose interruption hooks and emit user interruption/non‑interruption events.
  • Improvements

    • Stream channel supports aborting and adding multiple input streams.
    • Telemetry now includes interruption attributes (probability, durations, delays).
  • Tests

    • New unit tests covering interruption utilities and migration logic; example updated to log interruption events.
  • Chores

    • CI test examples step commented out; a runtime dependency added.

@changeset-bot

changeset-bot bot commented Jan 29, 2026

🦋 Changeset detected

Latest commit: 76bd4e8

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 19 packages
Name Type
@livekit/agents Major
@livekit/agents-plugin-anam Major
@livekit/agents-plugin-baseten Major
@livekit/agents-plugin-bey Major
@livekit/agents-plugin-cartesia Major
@livekit/agents-plugin-deepgram Major
@livekit/agents-plugin-elevenlabs Major
@livekit/agents-plugin-google Major
@livekit/agents-plugin-hedra Major
@livekit/agents-plugin-inworld Major
@livekit/agents-plugin-lemonslice Major
@livekit/agents-plugin-livekit Major
@livekit/agents-plugin-neuphonic Major
@livekit/agents-plugin-openai Major
@livekit/agents-plugin-resemble Major
@livekit/agents-plugin-rime Major
@livekit/agents-plugin-silero Major
@livekit/agents-plugins-test Major
@livekit/agents-plugin-xai Major


@coderabbitai

coderabbitai bot commented Jan 29, 2026

📝 Walkthrough

Walkthrough

Adds a new adaptive interruption-detection subsystem (types, defaults, utils, cache, HTTP/WS transports, detector and stream pipeline), integrates it into agent/session/activity/audio-recognition with turn-handling config and migration utilities, updates CI, and adds a runtime dependency.

Changes

Cohort / File(s) Summary
Build & CI
.changeset/config.json, .github/workflows/test.yml
Reformatted .changeset/config.json fixed field to an inline array; commented out the "Test examples" step in CI (disables pnpm test:examples).
Dependencies
agents/package.json
Added runtime dependency ofetch (^1.5.1).
Interruption Types & Defaults
agents/src/inference/interruption/types.ts, agents/src/inference/interruption/defaults.ts, agents/src/inference/interruption/errors.ts
New interruption types/sentinels, runtime constants/env-driven defaults, exponential-backoff helper, getDefaultInferenceUrl, and typed InterruptionDetectionError.
Cache & Utilities
agents/src/inference/interruption/interruption_cache_entry.ts, agents/src/inference/interruption/utils.ts, agents/src/inference/interruption/utils.test.ts
Added InterruptionCacheEntry, BoundedCache, probability estimation/sliding-window logic, and tests for sliding-window behavior.
Transports
agents/src/inference/interruption/http_transport.ts, agents/src/inference/interruption/ws_transport.ts
Implemented HTTP and WebSocket TransformStream transports: auth, retry/backoff, framing/handshake, response parsing, event construction, caching, reconnect support.
Detection Engine & Stream
agents/src/inference/interruption/interruption_detector.ts, agents/src/inference/interruption/interruption_stream.ts
Added AdaptiveInterruptionDetector (manager/options propagation) and InterruptionStreamBase (audio transform, resampling/downmix, buffering, transport selection, state management, lifecycle methods, event emission).
Stream Channel & Telemetry
agents/src/stream/stream_channel.ts, agents/src/telemetry/trace_types.ts, agents/src/voice/events.ts
Extended StreamChannel API (abort, addStreamInput); added interruption telemetry attributes; added UserInterruptionDetected and UserNonInterruptionDetected session events.
Turn Handling Config & Migration
agents/src/voice/turn_config/*, agents/src/voice/turn_config/utils.test.ts, agents/src/voice/turn_config/utils.ts
Introduced endpointing and interruption config types, TurnHandlingConfig, defaults, and migrateLegacyOptions with tests to map legacy flat VoiceOptions into nested SessionOptions.
Voice Integration
agents/src/voice/agent.ts, agents/src/voice/agent_activity.ts, agents/src/voice/agent_session.ts, agents/src/voice/audio_recognition.ts, agents/src/voice/agent.test.ts
Wired interruption config into Agent/Session/Activity/AudioRecognition: new getters, session migration to SessionOptions, detector lifecycle integration, interruption-aware STT buffering/EOU handling, hooks/events, and test logger init.
Examples & Changeset
examples/src/basic_agent.ts, .changeset/silly-donkeys-shop.md
Updated example to use options.turnHandling with adaptive interruption; added changeset noting turn-handling refactor and barge-in support.

Sequence Diagram(s)

sequenceDiagram
    rect rgba(220, 235, 255, 0.5)
    participant Agent as Agent / AgentActivity
    participant AR as AudioRecognition
    participant AD as AdaptiveInterruptionDetector
    participant Stream as InterruptionStream
    participant Transport as HTTP/WS Transport
    participant Remote as Remote Inference
    end

    Agent->>AR: init(interruptionDetector)
    AR->>AD: createStream()
    AD->>Stream: instantiate(stream, options)

    Agent->>AR: signal overlapSpeechStarted
    AR->>Stream: pushFrame(overlapSpeechStarted)
    loop audio frames
      Agent->>AR: push audioFrame
      AR->>Stream: pushFrame(audioFrame)
    end
    Stream->>Transport: send audio chunk when minFrames reached
    Transport->>Remote: POST/WS with auth + audio
    Remote-->>Transport: prediction (isBargein, probs, createdAt)
    Transport->>Stream: emit InterruptionEvent / update cache

    alt interruption detected
        Stream-->>AR: InterruptionEvent
        AR->>Agent: onInterruption(ev)
    else no interruption
        Stream-->>Stream: continue buffering
    end

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

Suggested reviewers

  • toubatbrian
  • chenghao-mou
  • rektdeckard

Poem

🐰 I hopped through buffers and frames,

I chased the beats and learned their names,
WS and HTTP sent back their clues,
Turns now pause when the detector sues,
Hooray — the rabbit found the cues!

🚥 Pre-merge checks | ❌ 3
❌ Failed checks (2 warnings, 1 inconclusive)
Check name: Description check
  • Status: ⚠️ Warning
  • Explanation: The PR description is incomplete. While it mentions the two main additions (TurnHandling config and barge-in support), it lacks detailed explanation of changes, testing information, and most checklist items are unchecked or incomplete.
  • Resolution: Complete the description with detailed change explanations, testing results, and a filled-out pre-review checklist. Address the TODO item about exposing migrated options consistently and deprecating redundant fields.

Check name: Docstring Coverage
  • Status: ⚠️ Warning
  • Explanation: Docstring coverage is 42.86% which is insufficient. The required threshold is 80.00%.
  • Resolution: Write docstrings for the functions missing them to satisfy the coverage threshold.

Check name: Title check
  • Status: ❓ Inconclusive
  • Explanation: The title 'Feat/barge in' is vague and uses a conventional prefix without clearly describing the main feature. While it references barge-in, it lacks clarity about what capability is being added.
  • Resolution: Improve the title to be more descriptive, e.g., 'Add adaptive interruption detection and barge-in support' or 'Introduce TurnHandling config with interruption capabilities'.


@lukasIO lukasIO requested a review from chenghao-mou January 29, 2026 10:32
@lukasIO lukasIO requested a review from toubatbrian January 30, 2026 14:30
@lukasIO lukasIO marked this pull request as ready for review January 30, 2026 14:30

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 7

Note

Due to the large number of review comments, only Critical and Major severity comments were posted inline.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
agents/src/voice/agent_activity.ts (1)

914-924: ⚠️ Potential issue | 🟡 Minor

Use the new turnHandling.interruption.minWords in logs to avoid undefined values.

The log currently reads the deprecated options.minInterruptionWords. It should reflect the new config path.

🛠️ Suggested fix
         this.logger.info(
           {
             wordCount,
-            minInterruptionWords: this.agentSession.options.minInterruptionWords,
+            minInterruptionWords: this.agentSession.options.turnHandling.interruption?.minWords,
           },
           'skipping user input, word count below minimum interruption threshold',
         );
🤖 Fix all issues with AI agents
In `@agents/src/inference/interruption/interruption_stream.ts`:
- Around line 372-374: The close() implementation on InterruptionStreamBase
currently only closes inputStream and never removes the stream from
AdaptiveInterruptionDetector's Set, causing leaks and stale updateOptions()
calls; modify the close() method (the async close(): Promise<void> in
interruption_stream.ts) to also unregister/remove this stream from the detector
(e.g., call a detector.unregister/removeStream method or delete it from the
detector's internal Set such as detector.streams.delete(this)) after closing the
inputStream so closed streams are no longer tracked or receive updateOptions().
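
The unregistration described above can be sketched roughly as follows. This is a minimal illustration, not the PR's code: `SketchDetector`, `SketchStream`, `removeStream`, and `activeStreamCount` are hypothetical stand-ins for AdaptiveInterruptionDetector and InterruptionStreamBase, and `close()` is synchronous here only to keep the example short.

```typescript
// Hypothetical, simplified stand-ins; only the registration bookkeeping is modelled.
interface DetectorOptionsSketch {
  threshold: number;
}

class SketchDetector {
  private streams = new Set<SketchStream>();
  options: DetectorOptionsSketch = { threshold: 0.5 };

  createStream(): SketchStream {
    const stream = new SketchStream(this, { ...this.options });
    this.streams.add(stream);
    return stream;
  }

  // Called by a stream when it closes so it no longer receives option updates.
  removeStream(stream: SketchStream): void {
    this.streams.delete(stream);
  }

  updateOptions(next: Partial<DetectorOptionsSketch>): void {
    this.options = { ...this.options, ...next };
    for (const stream of this.streams) {
      stream.applyOptions(this.options);
    }
  }

  get activeStreamCount(): number {
    return this.streams.size;
  }
}

class SketchStream {
  private detector: SketchDetector;
  options: DetectorOptionsSketch;
  closed = false;

  constructor(detector: SketchDetector, options: DetectorOptionsSketch) {
    this.detector = detector;
    this.options = options;
  }

  applyOptions(options: DetectorOptionsSketch): void {
    this.options = { ...options };
  }

  close(): void {
    if (this.closed) return; // closing twice is harmless
    this.closed = true;
    // A real implementation would also close its input stream here.
    this.detector.removeStream(this);
  }
}
```

With this shape, a closed stream drops out of the detector's Set and later `updateOptions()` calls only reach streams that are still open.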

In `@agents/src/inference/interruption/ws_transport.ts`:
- Around line 180-191: Cache lookups fail and durations are wrong because
entries are keyed with performance.now() but looked up with message.created_at
and durations mix performance.now() with Date-based timestamps; fix by switching
to a single time base (use millisecond Unix time via
Date.now()/message.created_at) for cache keys and all timing math: when
constructing InterruptionCacheEntry (and when populating state.cache), compute
const eventTsMs = message.created_at ?? Date.now(), use
state.cache.get(eventTsMs) / state.cache.set(eventTsMs, ...) and compute
totalDurationInS as (Date.now() - eventTsMs)/1000 while keeping
detectionDelayInS as (Date.now() - state.overlapSpeechStartedAt)/1000; apply the
same change where similar logic exists (the other InterruptionCacheEntry
creation blocks around the 228-239 and 268-272 regions).
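
A rough sketch of the single-time-base idea, assuming the server echoes `created_at` unchanged. `CacheEntrySketch`, `recordRequest`, and `recordResponse` are hypothetical names, and the current time is passed in as `nowMs` (in real code this would come from `Date.now()`) so the arithmetic is easy to follow.

```typescript
// Hypothetical cache entry; all timestamps are Unix epoch milliseconds.
interface CacheEntrySketch {
  createdAt: number;
  totalDurationInS?: number;
  detectionDelayInS?: number;
}

// Stores an entry keyed by the same value that is sent to the server as created_at.
function recordRequest(cache: Map<number, CacheEntrySketch>, nowMs: number): number {
  cache.set(nowMs, { createdAt: nowMs });
  return nowMs;
}

// Looks the entry up by the echoed created_at and computes durations in one time base.
function recordResponse(
  cache: Map<number, CacheEntrySketch>,
  message: { created_at?: number },
  overlapSpeechStartedAtMs: number,
  nowMs: number,
): CacheEntrySketch {
  const eventTsMs = message.created_at ?? nowMs;
  const entry = cache.get(eventTsMs) ?? { createdAt: eventTsMs };
  entry.totalDurationInS = (nowMs - eventTsMs) / 1000;
  entry.detectionDelayInS = (nowMs - overlapSpeechStartedAtMs) / 1000;
  cache.set(eventTsMs, entry);
  return entry;
}
```

Because the key written in `recordRequest` and the key read in `recordResponse` are the same epoch value, the lookup hits the existing entry instead of creating a second one.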

In `@agents/src/stream/stream_channel.ts`:
- Around line 24-40: The addStreamInput async IIFE must observe and handle
errors and abort() must be idempotent: inside addStreamInput (function
addStreamInput / the async IIFE using reader.read() and writer.write()) wrap the
loop in try/catch and on any error call the existing abort(error) (or
writer.abort(error) via abort()) so rejections are observed and optionally
surface/log the error, keeping reader.releaseLock() in the finally; also ensure
the IIFE returns its promise (or at least attach .catch(...) to it) to avoid
unhandled rejections. For abort() (the abort: (error: E) => { ... } function)
add an isClosed check/guard like close() uses so calling abort() multiple times
is a no-op (return a resolved promise) and only calls writer.abort(error) once
while setting isClosed = true.
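
One way the two changes could look, as a sketch built on the standard Web Streams globals (available in Node 18+ and browsers). `createChannelSketch` is a hypothetical, reduced stand-in for the PR's stream channel; only `addStreamInput` and `abort` are modelled, and the real StreamChannel may differ in shape.

```typescript
// Hypothetical, reduced channel: forwards input streams into one TransformStream.
function createChannelSketch<T>() {
  const transform = new TransformStream<T, T>();
  const writer = transform.writable.getWriter();
  let isClosed = false;

  // Idempotent: only the first call reaches writer.abort(); later calls are no-ops.
  async function abort(error: Error): Promise<void> {
    if (isClosed) return;
    isClosed = true;
    await writer.abort(error);
  }

  // Returns the pump promise so callers can await or attach handlers to it.
  function addStreamInput(input: ReadableStream<T>): Promise<void> {
    const reader = input.getReader();
    return (async () => {
      try {
        while (true) {
          const result = await reader.read();
          if (result.done) break;
          await writer.write(result.value);
        }
      } catch (err) {
        // Observe the failure and propagate it to the channel instead of
        // leaving an unhandled rejection behind.
        await abort(err instanceof Error ? err : new Error(String(err)));
      } finally {
        reader.releaseLock();
      }
    })();
  }

  return { readable: transform.readable, addStreamInput, abort, isClosed: () => isClosed };
}
```

In this sketch a failing input aborts the whole channel; whether that is the right policy for the PR (as opposed to logging and continuing with the remaining inputs) is a design choice the review comment leaves open.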

In `@agents/src/voice/audio_recognition.ts`:
- Around line 254-287: The current gating for building eventsToEmit treats
emitFromIndex === 0 as falsy and also compares shouldFlush to undefined; update
the condition so it explicitly checks emitFromIndex for null and that
shouldFlush is true. Replace the final expression that computes eventsToEmit to
use an explicit null check (e.g., emitFromIndex !== null) and a boolean check of
shouldFlush (e.g., shouldFlush === true or simply shouldFlush) so that slice(0)
is allowed and events from index 0 are not dropped; keep using
this.transcriptBuffer.slice(emitFromIndex) when the checks pass.
- Around line 302-329: The function shouldHoldSttEvent never initializes
this.inputStartedAt, so timestamp-based holding/clearing never works; when you
detect the first STT event with valid timestamps (e.g., ev.alternatives[0]
exists, startTime !== endTime and endTime > 0) set this.inputStartedAt to a
reference epoch that aligns the event's endTime with now (for example:
this.inputStartedAt = Date.now() - alternative.endTime * 1000 if endTime is in
seconds, or use Date.now() / 1000 - alternative.endTime if you keep seconds) so
subsequent comparisons in shouldHoldSttEvent and flushHeldTranscripts use the
same timebase; place this initialization inside shouldHoldSttEvent before any
timestamp comparisons when inputStartedAt is undefined.
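
The two audio_recognition.ts points above can be illustrated with a small sketch. The names `eventsToEmitSketch`, `eventsToEmitTruthy`, and `HoldStateSketch` are hypothetical, STT times are assumed to be in seconds relative to the start of input, and the current time is passed in as `nowMs` rather than read from `Date.now()`.

```typescript
// Explicit null check: index 0 is a valid starting position.
function eventsToEmitSketch<T>(buffer: T[], emitFromIndex: number | null, shouldFlush: boolean): T[] {
  return emitFromIndex !== null && shouldFlush ? buffer.slice(emitFromIndex) : [];
}

// Truthiness check shown for contrast: index 0 is falsy, so nothing is emitted.
function eventsToEmitTruthy<T>(buffer: T[], emitFromIndex: number | null, shouldFlush: boolean): T[] {
  return emitFromIndex && shouldFlush ? buffer.slice(emitFromIndex) : [];
}

interface SttAlternativeSketch {
  startTime: number; // seconds, relative to input start (assumption)
  endTime: number; // seconds, relative to input start (assumption)
}

class HoldStateSketch {
  inputStartedAt: number | undefined; // epoch milliseconds once anchored

  // Anchors the reference epoch on the first event that carries usable timestamps.
  observe(alternative: SttAlternativeSketch | undefined, nowMs: number): void {
    if (this.inputStartedAt !== undefined || alternative === undefined) return;
    if (alternative.startTime !== alternative.endTime && alternative.endTime > 0) {
      this.inputStartedAt = nowMs - alternative.endTime * 1000;
    }
  }

  // Converts a relative STT time to epoch milliseconds, if an anchor exists.
  toEpochMs(relativeS: number): number | undefined {
    return this.inputStartedAt === undefined ? undefined : this.inputStartedAt + relativeS * 1000;
  }
}
```

Once `inputStartedAt` is anchored, later comparisons can convert every STT timestamp through `toEpochMs` and stay in one time base.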

In `@agents/src/voice/events.ts`:
- Around line 28-29: The file is missing the InterruptionEvent type integration:
import or define InterruptionEvent, add factory helpers
createUserInterruptionDetectedEvent and createUserNonInterruptionDetectedEvent
mirroring the pattern of existing event creators, and include the two new event
types (UserInterruptionDetected and UserNonInterruptionDetected) in the
AgentEvent union so the public AgentSession API payloads are properly typed;
reference the existing event creator functions and AgentEvent union to match
shape and typing conventions.

In `@examples/src/basic_agent.ts`:
- Around line 61-69: The turnHandling block is currently passed at the root of
the AgentSession config but must be nested inside the options property; update
the AgentSession constructor call to move the turnHandling object (including
turnDetection: new livekit.turnDetector.MultilingualModel(), interruption, and
preemptiveGeneration) under options so it matches the AgentSessionOptions /
SessionOptions shape, leaving voiceOptions and other root-level keys unchanged
(e.g., ensure AgentSession({... options: { turnHandling: { ... } },
voiceOptions: { ... } })).
🟡 Minor comments (14)
agents/src/voice/turn_config/endpointing.ts-1-3 (1)

1-3: ⚠️ Potential issue | 🟡 Minor

Fix SPDX header year to match repo guidelines.

Line 1 uses 2026; new TS files should use 2025 in the SPDX-FileCopyrightText line.

🔧 Suggested fix
-// SPDX-FileCopyrightText: 2026 LiveKit, Inc.
+// SPDX-FileCopyrightText: 2025 LiveKit, Inc.

As per coding guidelines, "Add SPDX-FileCopyrightText and SPDX-License-Identifier headers to all newly added files with '// SPDX-FileCopyrightText: 2025 LiveKit, Inc.' and '// SPDX-License-Identifier: Apache-2.0'".

.github/workflows/test.yml-49-53 (1)

49-53: ⚠️ Potential issue | 🟡 Minor

Example tests are now skipped in CI.

Lines 49-53 comment out pnpm test:examples, so example regressions won’t be caught on PRs. If this is temporary, please add a tracking issue or move it to a secrets-enabled workflow/scheduled job.

agents/src/inference/interruption/utils.test.ts-1-3 (1)

1-3: ⚠️ Potential issue | 🟡 Minor

SPDX copyright year should be 2025.

Suggested fix
-// SPDX-FileCopyrightText: 2026 LiveKit, Inc.
+// SPDX-FileCopyrightText: 2025 LiveKit, Inc.
 //
 // SPDX-License-Identifier: Apache-2.0

As per coding guidelines: SPDX headers should use 2025.

agents/src/voice/turn_config/interruption.ts-1-3 (1)

1-3: ⚠️ Potential issue | 🟡 Minor

SPDX copyright year should be 2025.

Suggested fix
-// SPDX-FileCopyrightText: 2026 LiveKit, Inc.
+// SPDX-FileCopyrightText: 2025 LiveKit, Inc.
 //
 // SPDX-License-Identifier: Apache-2.0

As per coding guidelines: SPDX headers should use 2025.

agents/src/voice/turn_config/turn_handling.ts-1-3 (1)

1-3: ⚠️ Potential issue | 🟡 Minor

SPDX copyright year should be 2025.

Suggested fix
-// SPDX-FileCopyrightText: 2026 LiveKit, Inc.
+// SPDX-FileCopyrightText: 2025 LiveKit, Inc.
 //
 // SPDX-License-Identifier: Apache-2.0

As per coding guidelines: SPDX headers should use 2025.

agents/src/inference/interruption/types.ts-1-3 (1)

1-3: ⚠️ Potential issue | 🟡 Minor

SPDX copyright year should be 2025.

The SPDX header specifies 2026, but the coding guidelines require 2025.

Suggested fix
-// SPDX-FileCopyrightText: 2026 LiveKit, Inc.
+// SPDX-FileCopyrightText: 2025 LiveKit, Inc.
 //
 // SPDX-License-Identifier: Apache-2.0

As per coding guidelines: "Add SPDX-FileCopyrightText and SPDX-License-Identifier headers to all newly added files with '// SPDX-FileCopyrightText: 2025 LiveKit, Inc.'"

agents/src/inference/interruption/errors.ts-1-3 (1)

1-3: ⚠️ Potential issue | 🟡 Minor

SPDX copyright year should be 2025.

Suggested fix
-// SPDX-FileCopyrightText: 2026 LiveKit, Inc.
+// SPDX-FileCopyrightText: 2025 LiveKit, Inc.
 //
 // SPDX-License-Identifier: Apache-2.0

As per coding guidelines: SPDX headers should use 2025.

agents/src/voice/turn_config/utils.test.ts-1-3 (1)

1-3: ⚠️ Potential issue | 🟡 Minor

SPDX copyright year should be 2025.

Suggested fix
-// SPDX-FileCopyrightText: 2026 LiveKit, Inc.
+// SPDX-FileCopyrightText: 2025 LiveKit, Inc.
 //
 // SPDX-License-Identifier: Apache-2.0

As per coding guidelines: SPDX headers should use 2025.

agents/src/voice/agent.ts-102-104 (1)

102-104: ⚠️ Potential issue | 🟡 Minor

Legacy turnDetection option is not honored in constructor.

The deprecated turnDetection option from AgentOptions is not destructured or used in the constructor. While it's marked deprecated, it should still work for backward compatibility. Currently, only turnHandling.turnDetection is used.

🐛 Proposed fix for backward compatibility
   constructor(options: AgentOptions<UserData>) {
-    const { id, instructions, chatCtx, tools, stt, vad, llm, tts, turnHandling } = options;
+    const { id, instructions, chatCtx, tools, stt, vad, llm, tts, turnHandling, turnDetection: legacyTurnDetection, allowInterruptions: legacyAllowInterruptions } = options;
     // ... existing code ...
 
     this.turnHandling = turnHandling; // TODO migrate legacy options to new turn handling config when turnConfig is unset
 
-    this.turnDetection = this.turnHandling?.turnDetection;
+    this.turnDetection = this.turnHandling?.turnDetection ?? legacyTurnDetection;
     // ... existing code ...
 
     this._interruptionDetection = this.turnHandling?.interruption.mode;
     if (this.turnHandling?.interruption.mode !== undefined) {
       this._allowInterruptions = !!this.turnHandling.interruption.mode;
+    } else if (legacyAllowInterruptions !== undefined) {
+      this._allowInterruptions = legacyAllowInterruptions;
     }

Also applies to: 127-129

agents/src/voice/turn_config/utils.ts-28-45 (1)

28-45: ⚠️ Potential issue | 🟡 Minor

Mutation after as const may cause TypeScript errors.

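The object turnHandling is declared with as const on line 41, which marks the literal as deeply readonly, while line 44 then assigns to turnHandling.interruption.mode. Because the variable is also annotated as TurnHandlingConfig, the assertion is at best redundant and, depending on how TurnHandlingConfig is declared, may produce a TypeScript error. Removing it avoids the conflict.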

🐛 Proposed fix: Remove `as const` or restructure
   const turnHandling: TurnHandlingConfig = {
     turnDetection: turnDetection,
     interruption: {
       discardAudioIfUninterruptible: mergedOptions?.discardAudioIfUninterruptible,
       minDuration: mergedOptions?.minInterruptionDuration,
       minWords: mergedOptions?.minInterruptionWords,
     },
     endpointing: {
       minDelay: mergedOptions?.minEndpointingDelay,
       maxDelay: mergedOptions?.maxEndpointingDelay,
     },

     ...mergedOptions.turnHandling,
-  } as const;
+  };

   if (mergedOptions?.allowInterruptions === false) {
     turnHandling.interruption.mode = false;
   }
agents/src/voice/agent.ts-150-153 (1)

150-153: ⚠️ Potential issue | 🟡 Minor

Potential undefined access on interruption.mode.

turnHandling.interruption is typed as Partial<InterruptionConfig>, which means accessing .mode is safe (returns undefined), but the current logic assumes interruption itself exists. If turnHandling is provided without an interruption field, this.turnHandling?.interruption.mode would throw.

🛡️ Safer access pattern
-    this._interruptionDetection = this.turnHandling?.interruption.mode;
-    if (this.turnHandling?.interruption.mode !== undefined) {
+    this._interruptionDetection = this.turnHandling?.interruption?.mode;
+    if (this.turnHandling?.interruption?.mode !== undefined) {
       this._allowInterruptions = !!this.turnHandling.interruption.mode;
     }
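
To make the difference concrete, here is a small sketch of the two access patterns. `TurnHandlingSketch`, `readModeUnsafe`, and `readModeSafe` are hypothetical names, and the `mode` type is a placeholder rather than the PR's actual InterruptionConfig.

```typescript
// Hypothetical config shape; the real TurnHandlingConfig may differ.
interface TurnHandlingSketch {
  interruption: { mode?: string | false };
}

// `?.` only guards `turnHandling`; if `interruption` is missing at runtime,
// reading `.mode` throws a TypeError.
function readModeUnsafe(turnHandling?: TurnHandlingSketch): string | false | undefined {
  return turnHandling?.interruption.mode;
}

// Chaining `?.` on both levels yields undefined instead of throwing.
function readModeSafe(turnHandling?: Partial<TurnHandlingSketch>): string | false | undefined {
  return turnHandling?.interruption?.mode;
}
```

The unsafe variant type-checks because the type claims `interruption` is always present; the failure only appears when a caller passes an object without that field.
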
agents/src/inference/interruption/interruption_stream.ts-1-3 (1)

1-3: ⚠️ Potential issue | 🟡 Minor

SPDX header year should be 2025 for new files.

Update the year to match the required header format.

🔧 Suggested fix
-// SPDX-FileCopyrightText: 2026 LiveKit, Inc.
+// SPDX-FileCopyrightText: 2025 LiveKit, Inc.

As per coding guidelines: **/*.{ts,tsx,js,jsx}: Add SPDX-FileCopyrightText and SPDX-License-Identifier headers to all newly added files with '// SPDX-FileCopyrightText: 2025 LiveKit, Inc.' and '// SPDX-License-Identifier: Apache-2.0'.

agents/src/inference/interruption/http_transport.ts-1-3 (1)

1-3: ⚠️ Potential issue | 🟡 Minor

SPDX header year should be 2025 for new files.

Update the copyright year to match the required header format.

🔧 Suggested fix
-// SPDX-FileCopyrightText: 2026 LiveKit, Inc.
+// SPDX-FileCopyrightText: 2025 LiveKit, Inc.

As per coding guidelines: **/*.{ts,tsx,js,jsx}: Add SPDX-FileCopyrightText and SPDX-License-Identifier headers to all newly added files with '// SPDX-FileCopyrightText: 2025 LiveKit, Inc.' and '// SPDX-License-Identifier: Apache-2.0'.

agents/src/inference/interruption/http_transport.ts-42-77 (1)

42-77: ⚠️ Potential issue | 🟡 Minor

Clarify timestamp semantics to prevent time-base mismatch in totalDurationInS calculation.

The code sends a local performance.now() timestamp to the server as created_at, then uses the server's returned created_at value to calculate totalDurationInS in the caller (line 143: (performance.now() - createdAt) / 1000). This assumes the server echoes back a value in the same time base, but the /bargein endpoint contract doesn't guarantee this. If the server returns a different timestamp (server time, epoch, etc.), the duration calculation becomes invalid.

Rename the local request timestamp to startedAt to clarify that it measures request duration. Separately, the returned createdAt field should be documented or renamed to clarify it represents the server's received timestamp, not the duration reference point.

🛠️ Suggested fix
 export async function predictHTTP(
   data: Int16Array,
   predictOptions: PredictOptions,
   options: PostOptions,
 ): Promise<PredictResponse> {
-  const createdAt = performance.now();
+  const startedAt = performance.now();
   const url = new URL(`/bargein`, options.baseUrl);
   url.searchParams.append('threshold', predictOptions.threshold.toString());
   url.searchParams.append('min_frames', predictOptions.minFrames.toFixed());
-  url.searchParams.append('created_at', createdAt.toFixed());
+  url.searchParams.append('created_at', startedAt.toFixed());

   let retryCount = 0;
   const response = await ofetch(url.toString(), {
@@
   const { created_at, is_bargein, probabilities } = predictEndpointResponseSchema.parse(response);

   return {
     createdAt: created_at,
     isBargein: is_bargein,
     probabilities,
-    predictionDurationInS: (performance.now() - createdAt) / 1000,
+    predictionDurationInS: (performance.now() - startedAt) / 1000,
   };
 }
🧹 Nitpick comments (9)
examples/src/basic_agent.ts (1)

72-75: Potential duplication: preemptiveGeneration is set in both turnHandling and voiceOptions.

preemptiveGeneration: true is set at line 68 inside turnHandling and again at line 74 inside voiceOptions. Once the type issue is resolved, consider whether both locations are necessary or if one should take precedence.

agents/src/inference/interruption/defaults.ts (1)

39-52: Verify consistency between inferenceTimeout and REMOTE_INFERENCE_TIMEOUT_IN_S.

inferenceTimeout is set to 1_000 (milliseconds) while REMOTE_INFERENCE_TIMEOUT_IN_S is 1.0 (seconds). While they represent the same duration, consider using REMOTE_INFERENCE_TIMEOUT_IN_S * 1000 for clarity and to ensure they stay in sync if the constant changes.

🔧 Suggested improvement for consistency
 export const interruptionOptionDefaults: InterruptionOptions = {
   sampleRate: SAMPLE_RATE,
   threshold: THRESHOLD,
   minFrames: Math.ceil(MIN_INTERRUPTION_DURATION_IN_S * FRAMES_PER_SECOND),
   maxAudioDurationInS: MAX_AUDIO_DURATION_IN_S,
   audioPrefixDurationInS: AUDIO_PREFIX_DURATION_IN_S,
   detectionIntervalInS: DETECTION_INTERVAL_IN_S,
-  inferenceTimeout: 1_000,
+  inferenceTimeout: REMOTE_INFERENCE_TIMEOUT_IN_S * 1000,
   baseUrl: DEFAULT_BASE_URL,
agents/src/voice/turn_config/utils.ts (1)

26-26: Handle case where both voiceOptions and sessionOptions are undefined.

When both voiceOptions and sessionOptions are undefined, spreading them results in { ...undefined, ...undefined } which evaluates to {}. While this works, calling structuredClone on an empty object and then accessing mergedOptions.turnHandling on line 40 could produce unexpected results if turnHandling is expected to be undefined.

🛡️ Defensive handling suggestion
-  const mergedOptions = structuredClone({ ...voiceOptions, ...sessionOptions });
+  const mergedOptions = structuredClone({ ...(voiceOptions ?? {}), ...(sessionOptions ?? {}) });
agents/src/inference/interruption/ws_transport.ts (2)

114-119: Catch block discards the parse error.

A warning is logged, but the catch clause at line 117 has no error binding, so the original error information is lost. Consider including the error in the log for debugging purposes.

🔍 Proposed improvement
       try {
         const message = wsMessageSchema.parse(JSON.parse(data.toString()));
         handleMessage(message);
-      } catch {
-        logger.warn({ data: data.toString() }, 'Failed to parse WebSocket message');
+      } catch (err) {
+        logger.warn({ data: data.toString(), err }, 'Failed to parse WebSocket message');
       }

315-318: reconnect() doesn't actively reconnect.

The reconnect() method only calls close() and relies on a subsequent ensureConnection() call to re-establish the connection. However, there's no guarantee that ensureConnection() will be called after reconnect(). If this is intentional (lazy reconnection on next transform), the comment should clarify this behavior. Otherwise, consider awaiting ensureConnection() within reconnect().

♻️ Option A: Active reconnection
   async function reconnect(): Promise<void> {
     close();
-    // Connection will be re-established on next ensureConnection call
+    await ensureConnection();
   }
♻️ Option B: Clarify lazy reconnection in comment
   async function reconnect(): Promise<void> {
     close();
-    // Connection will be re-established on next ensureConnection call
+    // Connection will be lazily re-established on next transform() call via ensureConnection()
+    // This is intentional to avoid reconnecting if the stream is being closed
   }
agents/src/inference/interruption/utils.ts (2)

43-60: Type safety concern in setOrUpdate with Record<string, unknown> cast.

The cast to Record<string, unknown> on line 56 bypasses TypeScript's type checking. While this is acceptable for the current use case, it could allow setting invalid field names. Consider adding runtime validation or documenting the expected usage.


127-139: Performance: O(n×w) complexity for sliding window calculation.

The slidingWindowMinMax function creates a new array slice for each window position (line 135), resulting in O(n×w) time complexity. For large probability arrays, consider using a monotonic deque for O(n) complexity.

For current use cases with small window sizes (likely 2-3 frames based on MIN_INTERRUPTION_DURATION_IN_S), this is acceptable but may become a bottleneck with longer audio durations.
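
For reference, a monotonic-deque version might look like the sketch below. It assumes that slidingWindowMinMax returns the maximum, over all windows, of each window's minimum; that reading is inferred from the function name and is not confirmed by the diff shown here. `slidingWindowMinMaxSketch` is a hypothetical name, and returning `undefined` for inputs shorter than the window is a choice made for this example only.

```typescript
// Assumed semantics: max over all length-w windows of the window minimum.
function slidingWindowMinMaxSketch(values: number[], windowSize: number): number | undefined {
  if (windowSize <= 0 || values.length < windowSize) return undefined;

  // Indices of candidate minima; their values increase from head to tail.
  const deque: number[] = [];
  let head = 0; // logical front of the deque, so removing the front is O(1)
  let best = -Infinity;

  for (let i = 0; i < values.length; i++) {
    // Drop candidates that can never be a window minimum again.
    while (deque.length > head && values[deque[deque.length - 1]] >= values[i]) {
      deque.pop();
    }
    deque.push(i);

    // Drop the front candidate once it falls out of the current window.
    if (deque[head] <= i - windowSize) head++;

    // The front of the deque is the minimum of the window ending at i.
    if (i >= windowSize - 1) {
      best = Math.max(best, values[deque[head]]);
    }
  }
  return best;
}
```

Each index is pushed and popped at most once, so the loop runs in O(n) regardless of the window size.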

agents/src/inference/interruption/interruption_detector.ts (2)

25-26: Public options property exposes mutable internal state.

The options property is public, allowing external code to directly mutate the detector's configuration without triggering updateOptions(). Consider making it readonly or private with a getter.

♻️ Encapsulate options
 export class AdaptiveInterruptionDetector extends (EventEmitter as new () => TypedEventEmitter<InterruptionCallbacks>) {
-  options: InterruptionOptions;
+  private _options: InterruptionOptions;
   private readonly _label: string;
   // ... later in constructor ...
-    this.options = {
+    this._options = {
   // ... add getter ...
+  get options(): Readonly<InterruptionOptions> {
+    return this._options;
+  }

81-94: Duplicate minFrames calculation logic.

The formula Math.ceil(minInterruptionDurationInS * FRAMES_PER_SECOND) appears here and in defaults.ts (line 42) and updateOptions (line 179). Consider extracting this to a shared helper function to ensure consistency.

♻️ Extract helper function

Add to defaults.ts:

export function computeMinFrames(minInterruptionDurationInS: number): number {
  return Math.ceil(minInterruptionDurationInS * FRAMES_PER_SECOND);
}

Then use it consistently across files.

📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 0bb2745 and 96d6b57.

⛔ Files ignored due to path filters (1)
  • pnpm-lock.yaml is excluded by !**/pnpm-lock.yaml
📒 Files selected for processing (27)
  • .changeset/config.json
  • .github/workflows/test.yml
  • agents/package.json
  • agents/src/inference/interruption/defaults.ts
  • agents/src/inference/interruption/errors.ts
  • agents/src/inference/interruption/http_transport.ts
  • agents/src/inference/interruption/interruption_cache_entry.ts
  • agents/src/inference/interruption/interruption_detector.ts
  • agents/src/inference/interruption/interruption_stream.ts
  • agents/src/inference/interruption/types.ts
  • agents/src/inference/interruption/utils.test.ts
  • agents/src/inference/interruption/utils.ts
  • agents/src/inference/interruption/ws_transport.ts
  • agents/src/stream/stream_channel.ts
  • agents/src/telemetry/trace_types.ts
  • agents/src/voice/agent.ts
  • agents/src/voice/agent_activity.ts
  • agents/src/voice/agent_session.ts
  • agents/src/voice/audio_recognition.ts
  • agents/src/voice/events.ts
  • agents/src/voice/turn_config/endpointing.ts
  • agents/src/voice/turn_config/interruption.ts
  • agents/src/voice/turn_config/turn_handling.ts
  • agents/src/voice/turn_config/utils.test.ts
  • agents/src/voice/turn_config/utils.ts
  • examples/package.json
  • examples/src/basic_agent.ts
💤 Files with no reviewable changes (1)
  • examples/package.json
🧰 Additional context used
📓 Path-based instructions (3)
**/*.{ts,tsx,js,jsx}

📄 CodeRabbit inference engine (.cursor/rules/agent-core.mdc)

Add SPDX-FileCopyrightText and SPDX-License-Identifier headers to all newly added files with '// SPDX-FileCopyrightText: 2025 LiveKit, Inc.' and '// SPDX-License-Identifier: Apache-2.0'

Files:

  • agents/src/inference/interruption/utils.test.ts
  • agents/src/voice/turn_config/utils.test.ts
  • agents/src/voice/turn_config/turn_handling.ts
  • agents/src/voice/turn_config/endpointing.ts
  • agents/src/voice/turn_config/interruption.ts
  • examples/src/basic_agent.ts
  • agents/src/inference/interruption/ws_transport.ts
  • agents/src/inference/interruption/interruption_cache_entry.ts
  • agents/src/inference/interruption/errors.ts
  • agents/src/inference/interruption/types.ts
  • agents/src/inference/interruption/utils.ts
  • agents/src/stream/stream_channel.ts
  • agents/src/voice/turn_config/utils.ts
  • agents/src/inference/interruption/defaults.ts
  • agents/src/voice/events.ts
  • agents/src/inference/interruption/interruption_detector.ts
  • agents/src/voice/audio_recognition.ts
  • agents/src/voice/agent.ts
  • agents/src/inference/interruption/http_transport.ts
  • agents/src/inference/interruption/interruption_stream.ts
  • agents/src/telemetry/trace_types.ts
  • agents/src/voice/agent_session.ts
  • agents/src/voice/agent_activity.ts
**/*.{ts,tsx}?(test|example|spec)

📄 CodeRabbit inference engine (.cursor/rules/agent-core.mdc)

When testing inference LLM, always use full model names from agents/src/inference/models.ts (e.g., 'openai/gpt-4o-mini' instead of 'gpt-4o-mini')

Files:

  • agents/src/inference/interruption/utils.test.ts
  • agents/src/voice/turn_config/utils.test.ts
  • agents/src/voice/turn_config/turn_handling.ts
  • agents/src/voice/turn_config/endpointing.ts
  • agents/src/voice/turn_config/interruption.ts
  • examples/src/basic_agent.ts
  • agents/src/inference/interruption/ws_transport.ts
  • agents/src/inference/interruption/interruption_cache_entry.ts
  • agents/src/inference/interruption/errors.ts
  • agents/src/inference/interruption/types.ts
  • agents/src/inference/interruption/utils.ts
  • agents/src/stream/stream_channel.ts
  • agents/src/voice/turn_config/utils.ts
  • agents/src/inference/interruption/defaults.ts
  • agents/src/voice/events.ts
  • agents/src/inference/interruption/interruption_detector.ts
  • agents/src/voice/audio_recognition.ts
  • agents/src/voice/agent.ts
  • agents/src/inference/interruption/http_transport.ts
  • agents/src/inference/interruption/interruption_stream.ts
  • agents/src/telemetry/trace_types.ts
  • agents/src/voice/agent_session.ts
  • agents/src/voice/agent_activity.ts
**/*.{ts,tsx}?(test|example)

📄 CodeRabbit inference engine (.cursor/rules/agent-core.mdc)

Initialize logger before using any LLM functionality with initializeLogger({ pretty: true }) from '@livekit/agents'

Files:

  • agents/src/inference/interruption/utils.test.ts
  • agents/src/voice/turn_config/utils.test.ts
  • agents/src/voice/turn_config/turn_handling.ts
  • agents/src/voice/turn_config/endpointing.ts
  • agents/src/voice/turn_config/interruption.ts
  • examples/src/basic_agent.ts
  • agents/src/inference/interruption/ws_transport.ts
  • agents/src/inference/interruption/interruption_cache_entry.ts
  • agents/src/inference/interruption/errors.ts
  • agents/src/inference/interruption/types.ts
  • agents/src/inference/interruption/utils.ts
  • agents/src/stream/stream_channel.ts
  • agents/src/voice/turn_config/utils.ts
  • agents/src/inference/interruption/defaults.ts
  • agents/src/voice/events.ts
  • agents/src/inference/interruption/interruption_detector.ts
  • agents/src/voice/audio_recognition.ts
  • agents/src/voice/agent.ts
  • agents/src/inference/interruption/http_transport.ts
  • agents/src/inference/interruption/interruption_stream.ts
  • agents/src/telemetry/trace_types.ts
  • agents/src/voice/agent_session.ts
  • agents/src/voice/agent_activity.ts
🧠 Learnings (5)
📚 Learning: 2026-01-16T14:33:39.551Z
Learnt from: CR
Repo: livekit/agents-js PR: 0
File: .cursor/rules/agent-core.mdc:0-0
Timestamp: 2026-01-16T14:33:39.551Z
Learning: Applies to examples/src/test_*.ts : For plugin component debugging (STT, TTS, LLM), create test example files prefixed with `test_` under the examples directory and run with `pnpm build && node ./examples/src/test_my_plugin.ts`

Applied to files:

  • .github/workflows/test.yml
  • agents/src/voice/turn_config/utils.test.ts
📚 Learning: 2026-01-16T14:33:39.551Z
Learnt from: CR
Repo: livekit/agents-js PR: 0
File: .cursor/rules/agent-core.mdc:0-0
Timestamp: 2026-01-16T14:33:39.551Z
Learning: Use `pnpm build && pnpm dlx tsx ./examples/src/my_agent.ts dev|download-files --log-level=debug|info(default)` to run example agents from the examples directory

Applied to files:

  • .github/workflows/test.yml
📚 Learning: 2026-01-16T14:33:39.551Z
Learnt from: CR
Repo: livekit/agents-js PR: 0
File: .cursor/rules/agent-core.mdc:0-0
Timestamp: 2026-01-16T14:33:39.551Z
Learning: Applies to **/*.{ts,tsx}?(test|example|spec) : When testing inference LLM, always use full model names from `agents/src/inference/models.ts` (e.g., 'openai/gpt-4o-mini' instead of 'gpt-4o-mini')

Applied to files:

  • agents/src/inference/interruption/utils.test.ts
  • agents/src/voice/agent_session.ts
📚 Learning: 2026-01-16T14:33:39.551Z
Learnt from: CR
Repo: livekit/agents-js PR: 0
File: .cursor/rules/agent-core.mdc:0-0
Timestamp: 2026-01-16T14:33:39.551Z
Learning: Applies to **/{examples,test}/**/*.test.ts : Include both basic streaming and tool calling tests to verify full LLM functionality

Applied to files:

  • agents/src/inference/interruption/utils.test.ts
  • agents/src/voice/turn_config/utils.test.ts
📚 Learning: 2026-01-16T14:33:39.551Z
Learnt from: CR
Repo: livekit/agents-js PR: 0
File: .cursor/rules/agent-core.mdc:0-0
Timestamp: 2026-01-16T14:33:39.551Z
Learning: Applies to **/*.{ts,tsx}?(test|example) : Initialize logger before using any LLM functionality with `initializeLogger({ pretty: true })` from '@livekit/agents'

Applied to files:

  • examples/src/basic_agent.ts
🧬 Code graph analysis (12)
agents/src/inference/interruption/utils.test.ts (1)
agents/src/inference/interruption/utils.ts (1)
  • slidingWindowMinMax (127-140)
agents/src/voice/turn_config/utils.test.ts (5)
agents/src/log.ts (1)
  • initializeLogger (34-40)
agents/src/voice/turn_config/utils.ts (1)
  • migrateLegacyOptions (14-63)
agents/src/voice/turn_config/turn_handling.ts (1)
  • defaultTurnHandlingConfig (41-45)
agents/src/voice/turn_config/endpointing.ts (1)
  • defaultEndpointingConfig (23-26)
agents/src/voice/turn_config/interruption.ts (1)
  • defaultInterruptionConfig (42-49)
agents/src/voice/turn_config/turn_handling.ts (3)
agents/src/voice/agent_session.ts (1)
  • TurnDetectionMode (123-123)
agents/src/voice/turn_config/endpointing.ts (2)
  • EndpointingConfig (7-21)
  • defaultEndpointingConfig (23-26)
agents/src/voice/turn_config/interruption.ts (2)
  • InterruptionConfig (7-40)
  • defaultInterruptionConfig (42-49)
agents/src/inference/interruption/interruption_cache_entry.ts (1)
agents/src/inference/interruption/utils.ts (1)
  • estimateProbability (115-125)
agents/src/stream/stream_channel.ts (1)
agents/src/stream/index.ts (3)
  • StreamChannel (7-7)
  • createStreamChannel (7-7)
  • IdentityTransform (5-5)
agents/src/voice/turn_config/utils.ts (5)
agents/src/voice/agent_session.ts (2)
  • AgentSessionOptions (139-152)
  • InternalSessionOptions (109-111)
agents/src/voice/turn_config/turn_handling.ts (2)
  • TurnHandlingConfig (12-34)
  • defaultTurnHandlingConfig (41-45)
agents/src/voice/agent_activity.ts (1)
  • turnDetection (379-382)
agents/src/voice/turn_config/endpointing.ts (1)
  • defaultEndpointingConfig (23-26)
agents/src/voice/turn_config/interruption.ts (1)
  • defaultInterruptionConfig (42-49)
agents/src/inference/interruption/interruption_detector.ts (2)
agents/src/inference/interruption/types.ts (1)
  • InterruptionOptions (33-46)
agents/src/inference/interruption/interruption_stream.ts (2)
  • InterruptionStreamBase (77-375)
  • stream (338-340)
agents/src/voice/agent.ts (3)
agents/src/voice/agent_session.ts (1)
  • TurnDetectionMode (123-123)
agents/src/voice/turn_config/turn_handling.ts (1)
  • TurnHandlingConfig (12-34)
agents/src/voice/turn_config/interruption.ts (1)
  • InterruptionConfig (7-40)
agents/src/inference/interruption/http_transport.ts (4)
agents/src/inference/interruption/defaults.ts (1)
  • intervalForRetry (28-37)
agents/src/inference/interruption/interruption_cache_entry.ts (1)
  • InterruptionCacheEntry (10-47)
agents/src/inference/interruption/types.ts (1)
  • InterruptionEvent (17-28)
agents/src/inference/utils.ts (1)
  • createAccessToken (10-19)
agents/src/inference/interruption/interruption_stream.ts (4)
agents/src/inference/interruption/types.ts (9)
  • AgentSpeechStarted (59-61)
  • AgentSpeechEnded (63-65)
  • OverlapSpeechStarted (67-71)
  • OverlapSpeechEnded (73-75)
  • Flush (77-79)
  • InterruptionSentinel (84-89)
  • InterruptionEvent (17-28)
  • InterruptionOptions (33-46)
  • ApiConnectOptions (51-55)
agents/src/inference/interruption/interruption_cache_entry.ts (1)
  • InterruptionCacheEntry (10-47)
agents/src/inference/interruption/defaults.ts (2)
  • apiConnectDefaults (18-22)
  • FRAMES_PER_SECOND (14-14)
agents/src/inference/interruption/utils.ts (1)
  • BoundedCache (10-109)
agents/src/voice/agent_session.ts (5)
agents/src/voice/turn_config/turn_handling.ts (2)
  • TurnHandlingConfig (12-34)
  • InternalTurnHandlingConfig (36-39)
agents/src/inference/interruption/types.ts (1)
  • InterruptionEvent (17-28)
agents/src/voice/run_context.ts (1)
  • UnknownUserData (8-8)
agents/src/voice/turn_config/interruption.ts (1)
  • InterruptionConfig (7-40)
agents/src/voice/turn_config/utils.ts (1)
  • migrateLegacyOptions (14-63)
agents/src/voice/agent_activity.ts (4)
agents/src/voice/audio_recognition.ts (1)
  • RecognitionHooks (42-53)
agents/src/voice/agent_session.ts (1)
  • TurnDetectionMode (123-123)
agents/src/inference/interruption/interruption_detector.ts (1)
  • AdaptiveInterruptionDetector (25-189)
agents/src/inference/interruption/types.ts (1)
  • InterruptionEvent (17-28)
🪛 GitHub Actions: Test
examples/src/basic_agent.ts

[error] 61-61: TS2353: Object literal may only specify known properties, and 'turnHandling' does not exist in type 'AgentSessionOptions'.

🔇 Additional comments (34)
.changeset/config.json (1)

11-11: Formatting-only change looks good.

Line 11 keeps the fixed package groups intact; no behavioral change expected.

agents/src/telemetry/trace_types.ts (1)

54-59: Telemetry attribute additions look consistent.

Lines 54-59 follow the existing lk.* naming and section placement.

agents/package.json (1)

72-72: No action needed — ofetch 1.5.1 provides CommonJS compatibility.

ofetch publishes both ESM and CJS builds with proper conditional exports. The CJS entry point (./dist/node.cjs) is configured in package exports and the legacy main field for broader resolver compatibility. No require() consumers will break.

agents/src/inference/interruption/types.ts (1)

6-89: Well-structured type definitions.

The enum, interfaces, and union types are clearly documented and provide a clean contract for the interruption detection system. The sentinel types using discriminated unions (type property) enable type-safe stream control signal handling.
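As a rough illustration of the discriminated-union pattern praised here, the sketch below shows how a `type` property lets the compiler narrow each sentinel. The `type` string values, the `startedAtMs` field, and every `*Sketch` name are assumptions for illustration and are not taken from types.ts.

```typescript
// Hypothetical sentinel shapes; the `type` string values and the
// `startedAtMs` field are assumptions made for illustration only.
type AgentSpeechStartedSketch = { type: 'agent-speech-started' };
type AgentSpeechEndedSketch = { type: 'agent-speech-ended' };
type OverlapSpeechStartedSketch = { type: 'overlap-speech-started'; startedAtMs: number };
type OverlapSpeechEndedSketch = { type: 'overlap-speech-ended' };
type FlushSketch = { type: 'flush' };

type InterruptionSentinelSketch =
  | AgentSpeechStartedSketch
  | AgentSpeechEndedSketch
  | OverlapSpeechStartedSketch
  | OverlapSpeechEndedSketch
  | FlushSketch;

function describeSentinel(sentinel: InterruptionSentinelSketch): string {
  switch (sentinel.type) {
    case 'agent-speech-started':
      return 'agent started speaking';
    case 'agent-speech-ended':
      return 'agent stopped speaking';
    case 'overlap-speech-started':
      // Narrowed to OverlapSpeechStartedSketch, so startedAtMs is available.
      return `overlap started at ${sentinel.startedAtMs} ms`;
    case 'overlap-speech-ended':
      return 'overlap ended';
    case 'flush':
      return 'flush requested';
    default: {
      // Exhaustiveness check: this fails to compile if a variant is unhandled.
      const unreachable: never = sentinel;
      return unreachable;
    }
  }
}
```

The `never` assignment in the default branch is what makes adding a new sentinel variant a compile-time error until every consumer handles it.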

agents/src/inference/interruption/utils.test.ts (1)

7-30: Good test coverage for slidingWindowMinMax.

Tests cover all important edge cases: empty arrays, arrays shorter than window, window size of 1, and boundary conditions where array length equals window size. The inline comments clearly explain the expected calculations.

agents/src/voice/turn_config/interruption.ts (1)

7-49: Well-designed configuration interface.

The InterruptionConfig interface is thoroughly documented with clear JSDoc comments explaining each field's purpose and default value. The as const satisfies InterruptionConfig pattern ensures type safety while preserving literal types in the default object.
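A minimal sketch of the `as const satisfies` pattern mentioned above, assuming TypeScript 4.9 or later. The interface fields, the mode values, and the default numbers are hypothetical and do not reflect the actual InterruptionConfig.

```typescript
// Hypothetical config shape; field names and values are assumptions.
interface InterruptionConfigSketch {
  mode: 'adaptive' | 'vad' | false;
  minDurationMs: number;
}

// `satisfies` checks the object against the interface, while `as const`
// keeps the literal types ('adaptive', 500) instead of widening them.
const defaultInterruptionConfigSketch = {
  mode: 'adaptive',
  minDurationMs: 500,
} as const satisfies InterruptionConfigSketch;

// The default is still usable wherever the wider interface is expected.
function isInterruptionEnabled(config: InterruptionConfigSketch): boolean {
  return config.mode !== false;
}

const enabledByDefault = isInterruptionEnabled(defaultInterruptionConfigSketch);
```

With a plain type annotation the `mode` property would widen to the full union; with `satisfies` a misspelled field or an out-of-range mode is still rejected at compile time.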

agents/src/voice/turn_config/utils.test.ts (2)

11-13: Good practice: Logger initialization in beforeAll.

Properly initializes the logger before tests run, which aligns with the coding guidelines for test files.


15-100: Comprehensive migration test coverage.

The tests thoroughly verify:

  • Default configuration when no options provided
  • Legacy flat field migration to nested turnHandling structure
  • allowInterruptions: false correctly sets interruption.mode to false
  • Precedence rules between options and voiceOptions
  • Explicit turnHandling overriding legacy fields
  • Preservation of turnDetection at both levels
agents/src/voice/turn_config/turn_handling.ts (1)

8-45: Clean type hierarchy for turn handling configuration.

Good design separating TurnHandlingConfig (public API with Partial<> for optional overrides) from InternalTurnHandlingConfig (internal use with fully concrete types). The JSDoc clearly documents the turn detection strategies and fallback behavior.

agents/src/inference/interruption/errors.ts (1)

7-25: Well-designed custom error class.

The InterruptionDetectionError provides useful metadata (timestamp, label, recoverable) for error handling and debugging. The recoverable flag allows callers to determine whether to retry or fail permanently.
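To make the retry-or-fail decision concrete, here is a hedged sketch of an error class carrying the three metadata fields named above. The class name, constructor signature, and the `shouldRetry` helper are assumptions and may differ from errors.ts.

```typescript
// Hypothetical error class; the constructor signature is an assumption.
class InterruptionDetectionErrorSketch extends Error {
  readonly timestamp: number;
  readonly label: string;
  readonly recoverable: boolean;

  constructor(message: string, timestamp: number, label: string, recoverable: boolean) {
    super(message);
    this.name = 'InterruptionDetectionErrorSketch';
    this.timestamp = timestamp;
    this.label = label;
    this.recoverable = recoverable;
  }
}

// Callers can branch on `recoverable` to decide between retrying and giving up.
function shouldRetry(err: unknown): boolean {
  return err instanceof InterruptionDetectionErrorSketch && err.recoverable;
}
```

Unknown errors deliberately return `false` here, so only failures the detector has explicitly marked as recoverable are retried.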

agents/src/inference/interruption/defaults.ts (2)

1-6: LGTM on SPDX headers and imports.

The file correctly includes the SPDX copyright and license headers. Consider verifying that ApiConnectOptions is intentionally imported from interruption_stream.js rather than types.ts, which also defines this interface per the relevant code snippets.


28-37: Exponential backoff implementation looks correct.

The intervalForRetry function correctly implements exponential backoff with jitter (0-25% randomization). The formula baseInterval * 2^attempt + jitter aligns with standard retry patterns.
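The formula described above can be sketched as follows. This is not the PR's implementation: the function name, the 100 ms default, and the injectable `random` parameter (added so the result is testable) are all assumptions.

```typescript
// Hypothetical sketch of exponential backoff with 0-25% jitter.
function intervalForRetrySketch(
  attempt: number,
  baseIntervalMs = 100,
  random: () => number = Math.random,
): number {
  // Exponential growth: base * 2^attempt.
  const exponential = baseIntervalMs * 2 ** attempt;
  // Jitter adds between 0% and 25% of the exponential delay.
  const jitter = exponential * 0.25 * random();
  return exponential + jitter;
}
```

The jitter spreads retries from many clients over time, which helps avoid synchronized retry bursts against the inference endpoint.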

agents/src/voice/turn_config/utils.ts (3)

1-3: LGTM on SPDX headers.


65-68: LGTM on stripUndefined implementation.

Clean and effective utility for preventing undefined values from shadowing defaults during object spread.


70-76: LGTM on mergeWithDefaults implementation.

Correctly applies defaults while using stripUndefined to preserve default values when config fields are undefined.
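The reason stripping matters is that spreading `{ a: undefined }` over defaults overwrites `a` with `undefined`. A minimal sketch of both helpers, assuming shallow (non-nested) configs; the names and signatures are illustrative and are not copied from utils.ts.

```typescript
// Hypothetical helper: drop keys whose value is undefined.
function stripUndefinedSketch<T extends object>(obj: T): Partial<T> {
  const kept = Object.entries(obj).filter(([, value]) => value !== undefined);
  return Object.fromEntries(kept) as Partial<T>;
}

// Hypothetical helper: shallow merge where undefined never shadows a default.
function mergeWithDefaultsSketch<T extends object>(defaults: T, config: Partial<T> = {}): T {
  return { ...defaults, ...stripUndefinedSketch(config) };
}

// Without stripping, `minDelayMs: undefined` would replace the default of 500.
const mergedSketch = mergeWithDefaultsSketch(
  { minDelayMs: 500, maxDelayMs: 3000 },
  { minDelayMs: undefined, maxDelayMs: 6000 },
);
```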

agents/src/inference/interruption/ws_transport.ts (2)

320-350: LGTM on TransformStream implementation.

The transform logic correctly handles both Int16Array audio chunks and passthrough of InterruptionEvent objects. Error handling and cleanup via flush() are properly implemented.


1-3: ⚠️ Potential issue | 🟡 Minor

Update copyright year to 2026.

The SPDX copyright year is 2025 but should be 2026 per coding guidelines.

📝 Proposed fix
-// SPDX-FileCopyrightText: 2025 LiveKit, Inc.
+// SPDX-FileCopyrightText: 2026 LiveKit, Inc.
 //
 // SPDX-License-Identifier: Apache-2.0
⛔ Skipped due to learnings
Learnt from: CR
Repo: livekit/agents-js PR: 0
File: .cursor/rules/agent-core.mdc:0-0
Timestamp: 2026-01-16T14:33:39.551Z
Learning: Applies to **/*.{ts,tsx,js,jsx} : Add SPDX-FileCopyrightText and SPDX-License-Identifier headers to all newly added files with '// SPDX-FileCopyrightText: 2025 LiveKit, Inc.' and '// SPDX-License-Identifier: Apache-2.0'
agents/src/inference/interruption/interruption_cache_entry.ts (3)

1-5: LGTM on SPDX headers and imports.


10-35: LGTM on class structure and constructor.

The mutable design is intentionally documented to support the setOrUpdate pattern from the cache. Constructor defaults are sensible with numeric durations defaulting to 0.


37-46: LGTM on probability getter and default factory.

The getter correctly delegates to estimateProbability which handles edge cases (returns 0 for insufficient data per the relevant code snippet showing if (probabilities.length < minWindow) { return 0; }).

agents/src/inference/interruption/utils.ts (3)

1-4: LGTM on SPDX headers and imports.


10-25: LGTM on BoundedCache set/get implementation.

The FIFO eviction strategy is correctly implemented by removing the first inserted key when size exceeds the limit.
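FIFO eviction of this kind can be sketched on top of `Map`, which iterates keys in insertion order. The class below is an illustration under that assumption, not the BoundedCache from utils.ts; it omits methods such as `setOrUpdate`.

```typescript
// Hypothetical FIFO-bounded cache built on Map's insertion-order iteration.
class BoundedCacheSketch<K, V> {
  private readonly entries = new Map<K, V>();
  private readonly maxSize: number;

  constructor(maxSize: number) {
    this.maxSize = maxSize;
  }

  set(key: K, value: V): void {
    this.entries.set(key, value);
    if (this.entries.size > this.maxSize) {
      // The first key returned by the iterator is the oldest inserted one.
      const oldest = this.entries.keys().next().value as K;
      this.entries.delete(oldest);
    }
  }

  get(key: K): V | undefined {
    return this.entries.get(key);
  }

  get size(): number {
    return this.entries.size;
  }
}
```

Note that re-setting an existing key updates its value without changing its position, so eviction order follows first insertion, not last write.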


115-125: LGTM on estimateProbability.

Correctly handles edge cases by returning 0 when there's insufficient data for the minimum window size.
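The sketch below illustrates the insufficient-data guard described above. Only the early return of 0 is confirmed by the review; the aggregation itself (the maximum over sliding-window minimums) is an assumption chosen for illustration and may not match estimateProbability or slidingWindowMinMax in utils.ts.

```typescript
// Hypothetical estimator. The early return mirrors the reviewed guard;
// the max-of-window-minimums aggregation is an assumption.
function estimateProbabilitySketch(probabilities: number[], minWindow: number): number {
  if (minWindow <= 0 || probabilities.length < minWindow) {
    return 0; // not enough frames to fill a single window
  }
  let best = 0;
  for (let start = 0; start + minWindow <= probabilities.length; start++) {
    // A window only scores as high as its weakest frame.
    const windowMin = Math.min(...probabilities.slice(start, start + minWindow));
    best = Math.max(best, windowMin);
  }
  return best;
}
```

Under this assumption a single high-probability frame does not trigger an interruption; the probability has to stay high for `minWindow` consecutive frames.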

agents/src/voice/agent.ts (2)

32-33: LGTM on new type imports and AgentOptions additions.

The deprecation annotation on turnDetection and addition of turnHandling properly signals the migration path to users.

Also applies to: 68-69, 76-77


194-200: LGTM on new getters.

The interruptionDetection and allowInterruptions getters correctly expose the internal state derived from the turn handling configuration.

agents/src/inference/interruption/interruption_detector.ts (4)

1-15: LGTM on SPDX headers and imports.


47-49: Validation message is slightly misleading.

The condition maxAudioDurationInS > 3.0 rejects values strictly greater than 3.0, and the error message says "must be less than or equal to 3.0". The two agree, so the behavior is correct, but the message could state the limit more explicitly.


170-188: LGTM on updateOptions implementation.

The method correctly propagates option updates to all active streams and awaits all promises with Promise.all(). This ensures consistent configuration across all streams.


153-157: Usage is correct and intentional.

The createStream() method passing an empty object {} as the second argument to InterruptionStreamBase is valid and by design. The constructor accepts apiOptions: Partial<ApiConnectOptions>, which makes all properties optional. The implementation merges this with apiConnectDefaults, allowing callers to pass an empty object to use all defaults. This is an idiomatic TypeScript pattern and requires no changes.

Likely an incorrect or invalid review comment.

agents/src/inference/interruption/http_transport.ts (1)

105-174: LGTM: transport keeps stream flowing even on inference failures.

Graceful error handling and event emission look solid.

agents/src/voice/audio_recognition.ts (1)

851-914: LGTM: interruption task wiring is clean and abort-safe.

The split between forwardTask and eventReader with abort handling looks solid.

agents/src/inference/interruption/interruption_stream.ts (1)

158-319: LGTM: two-stage pipeline cleanly separates framing and transport.

The audioTransformer -> transport -> eventEmitter composition is easy to reason about.

agents/src/voice/agent_session.ts (1)

73-265: LGTM: legacy option migration keeps API compatibility while moving to SessionOptions.

The constructor flow and defaults look consistent with the new turn-handling config.

agents/src/voice/agent_activity.ts (1)

307-318: Nice: interruption detector is cleanly threaded into AudioRecognition setup.

The wiring keeps the recognition pipeline cohesive.


Comment on lines +302 to +329
private shouldHoldSttEvent(ev: SpeechEvent): boolean {
  if (!this.isInterruptionEnabled) {
    return false;
  }
  if (this.isAgentSpeaking) {
    return true;
  }

  if (this.ignoreUserTranscriptUntil === undefined) {
    return false;
  }
  // sentinel events are always held until we have something concrete to release them
  if (!ev.alternatives || ev.alternatives.length === 0) {
    return true;
  }

  const alternative = ev.alternatives[0];

  if (
    this.inputStartedAt &&
    alternative.startTime !== alternative.endTime &&
    alternative.endTime > 0 &&
    alternative.endTime + this.inputStartedAt < this.ignoreUserTranscriptUntil
  ) {
    return true;
  }
  return false;
}

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# First, let's check if the file exists and explore its structure
git ls-files | grep -i audio_recognition

Repository: livekit/agents-js

Length of output: 98


🏁 Script executed:

# Look at the full audio_recognition.ts file to understand the class structure
wc -l agents/src/voice/audio_recognition.ts

Repository: livekit/agents-js

Length of output: 103


🏁 Script executed:

# Get the file to see where inputStartedAt is declared and used
rg "inputStartedAt" agents/src/voice/audio_recognition.ts -B 2 -A 2

Repository: livekit/agents-js

Length of output: 822


🏁 Script executed:

# Check the flushHeldTranscripts method mentioned in the comment
rg "flushHeldTranscripts" agents/src/voice/audio_recognition.ts -B 5 -A 15

Repository: livekit/agents-js

Length of output: 2120


🏁 Script executed:

# Look at the onSTTEvent method to see the current implementation
rg "private async onSTTEvent|private onSTTEvent" agents/src/voice/audio_recognition.ts -A 30

Repository: livekit/agents-js

Length of output: 1190


🏁 Script executed:

# Search for all assignments/initializations of inputStartedAt
rg "inputStartedAt\s*=" agents/src/voice/audio_recognition.ts -B 2 -A 2

Repository: livekit/agents-js

Length of output: 43


🏁 Script executed:

# Check the constructor and initialization section
rg "constructor|private inputStartedAt" agents/src/voice/audio_recognition.ts -A 20 | head -50

Repository: livekit/agents-js

Length of output: 1386


🏁 Script executed:

# Look for any place where inputStartedAt might be set in event handlers or methods
rg "this\.inputStartedAt" agents/src/voice/audio_recognition.ts -B 1 -A 1

Repository: livekit/agents-js

Length of output: 486


🏁 Script executed:

# Check if there's a reset or clear method that might initialize it
rg "reset|clear|initialize" agents/src/voice/audio_recognition.ts -i | grep -i "inputstarted\|method"

Repository: livekit/agents-js

Length of output: 43


🏁 Script executed:

# Check what class this is and if there's inheritance
rg "class AudioRecognition|export class|extends" agents/src/voice/audio_recognition.ts -A 5 | head -20

Repository: livekit/agents-js

Length of output: 262


🏁 Script executed:

# Look for where ignoreUserTranscriptUntil is set to understand the pattern
rg "ignoreUserTranscriptUntil\s*=" agents/src/voice/audio_recognition.ts -B 3 -A 3

Repository: livekit/agents-js

Length of output: 1352


🏁 Script executed:

# Check if there's any method that might set inputStartedAt indirectly
rg "Date\.now|endTime|startTime" agents/src/voice/audio_recognition.ts | head -20

Repository: livekit/agents-js

Length of output: 1072


🏁 Script executed:

# Look for the complete flushHeldTranscripts method to see the full logic
sed -n '600,700p' agents/src/voice/audio_recognition.ts

Repository: livekit/agents-js

Length of output: 4142


🏁 Script executed:

# Let's view the method around line 302-329 to see shouldHoldSttEvent in full context
sed -n '300,350p' agents/src/voice/audio_recognition.ts

Repository: livekit/agents-js

Length of output: 1431


🏁 Script executed:

# Get the complete onSTTEvent method
sed -n '220,320p' agents/src/voice/audio_recognition.ts

Repository: livekit/agents-js

Length of output: 2823


🏁 Script executed:

# Check if there are any other places that might be responsible for initializing inputStartedAt
rg "reset|clear|interrupt" agents/src/voice/audio_recognition.ts -i -B 2 -A 5 | head -60

Repository: livekit/agents-js

Length of output: 2378


🏁 Script executed:

# Look at the method around line 220 to understand when onSTTEvent is called
sed -n '180,230p' agents/src/voice/audio_recognition.ts

Repository: livekit/agents-js

Length of output: 1663


🏁 Script executed:

# Check if inputStartedAt is ever reset/cleared or reassigned
rg "inputStartedAt\s*(=|:)" agents/src/voice/audio_recognition.ts

Repository: livekit/agents-js

Length of output: 43


🏁 Script executed:

# Let's see the full onSTTEvent method to understand the flow better
sed -n '330,450p' agents/src/voice/audio_recognition.ts

Repository: livekit/agents-js

Length of output: 4448


🏁 Script executed:

# Check if there's any event listener or callback that might be setting timestamps
rg "startTime|endTime" agents/src/voice/audio_recognition.ts -B 1 -A 1 | head -40

Repository: livekit/agents-js

Length of output: 803


🏁 Script executed:

# Verify the complete picture - check what SpeechEvent.alternatives[].endTime represents (its units/type)
rg "interface SpeechEvent|type SpeechEvent" agents/src -A 20 | head -50

Repository: livekit/agents-js

Length of output: 3458


🏁 Script executed:

# Check STT implementations to understand the endTime contract
fd -e ts -e tsx | xargs rg "endTime" | grep -i "speech\|stt\|transcript" | head -20

Repository: livekit/agents-js

Length of output: 1354


🏁 Script executed:

# Search for SpeechEvent definition
rg "class SpeechEvent|interface.*SpeechEvent|type.*SpeechEvent" agents/src -A 15

Repository: livekit/agents-js

Length of output: 16679


Initialize inputStartedAt on the first STT event with valid timestamps; otherwise held transcripts get cleared and timestamp-based filtering never works.

inputStartedAt is never set, so flushHeldTranscripts() clears the buffer and shouldHoldSttEvent() cannot compare transcript timestamps. Set it on the first STT event with valid endTime to establish the reference point for timestamp calculations.

🛠️ Suggested fix (adjust units as needed)
   private async onSTTEvent(ev: SpeechEvent) {
+    const firstAlt = ev.alternatives?.[0];
+    if (this.inputStartedAt === undefined && firstAlt?.endTime && firstAlt.endTime > 0) {
+      // If endTime is already in ms, remove the * 1000.
+      this.inputStartedAt = Date.now() - firstAlt.endTime * 1000;
+    }
     if (
       this.turnDetectionMode === 'manual' &&
🤖 Prompt for AI Agents
In `@agents/src/voice/audio_recognition.ts` around lines 302 - 329, The function
shouldHoldSttEvent never initializes this.inputStartedAt, so timestamp-based
holding/clearing never works; when you detect the first STT event with valid
timestamps (e.g., ev.alternatives[0] exists, startTime !== endTime and endTime >
0) set this.inputStartedAt to a reference epoch that aligns the event's endTime
with now (for example: this.inputStartedAt = Date.now() - alternative.endTime *
1000 if endTime is in seconds, or use Date.now() / 1000 - alternative.endTime if
you keep seconds) so subsequent comparisons in shouldHoldSttEvent and
flushHeldTranscripts use the same timebase; place this initialization inside
shouldHoldSttEvent before any timestamp comparisons when inputStartedAt is
undefined.

Comment on lines +28 to +29
UserInterruptionDetected = 'user_interruption_detected',
UserNonInterruptionDetected = 'user_non_interruption_detected',

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

rg -n -C3 "user_interruption_detected|UserInterruptionDetected" agents/src

Repository: livekit/agents-js

Length of output: 3493


🏁 Script executed:

rg -n -C3 "user_non_interruption_detected|UserNonInterruptionDetected" agents/src

Repository: livekit/agents-js

Length of output: 3209


🏁 Script executed:

# Also check the events.ts file to understand the event structure
cat -n agents/src/voice/events.ts

Repository: livekit/agents-js

Length of output: 8320


Add InterruptionEvent type definition and payload typings to events.ts.

The new UserInterruptionDetected and UserNonInterruptionDetected events (lines 28-29) are emitted through the public AgentSession API, but their payload type (InterruptionEvent) is not integrated into events.ts. Following the established pattern for other events in this file:

  • Import or define the InterruptionEvent type
  • Add createUserInterruptionDetectedEvent and createUserNonInterruptionDetectedEvent helper functions
  • Add both event types to the AgentEvent union (currently lines 254-263)
🤖 Prompt for AI Agents
In `@agents/src/voice/events.ts` around lines 28 - 29, The file is missing the
InterruptionEvent type integration: import or define InterruptionEvent, add
factory helpers createUserInterruptionDetectedEvent and
createUserNonInterruptionDetectedEvent mirroring the pattern of existing event
creators, and include the two new event types (UserInterruptionDetected and
UserNonInterruptionDetected) in the AgentEvent union so the public AgentSession
API payloads are properly typed; reference the existing event creator functions
and AgentEvent union to match shape and typing conventions.


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
agents/src/voice/audio_recognition.ts (1)

981-989: ⚠️ Potential issue | 🟠 Major

close() does not cancel interruptionTask, causing inconsistent cleanup.

sttTask, vadTask, and bounceEOUTask are cancelled, but interruptionTask is only cancelled in stop(). If close() is called directly, the interruption task may continue running with a closed channel, leading to resource leaks or errors.

🛠️ Suggested fix
   async close() {
     this.detachInputAudioStream();
     this.silenceAudioWriter.releaseLock();
     await this.commitUserTurnTask?.cancelAndWait();
     await this.sttTask?.cancelAndWait();
     await this.vadTask?.cancelAndWait();
     await this.bounceEOUTask?.cancelAndWait();
+    await this.interruptionTask?.cancelAndWait();
     await this.interruptionStreamChannel.close();
   }
examples/src/basic_agent.ts (1)

4-43: ⚠️ Potential issue | 🟡 Minor

Initialize the logger before any LLM usage.

llm.tool is used during agent construction (line 33), but initializeLogger is never called and log() is instantiated after the agent is created (line 43). Call initializeLogger({ pretty: true }) before creating the agent.

Suggested fix
 import {
   type JobContext,
   type JobProcess,
   WorkerOptions,
   cli,
   defineAgent,
   llm,
   log,
+  initializeLogger,
   metrics,
   voice,
 } from '@livekit/agents';
@@
   entry: async (ctx: JobContext) => {
+    initializeLogger({ pretty: true });
+    const logger = log();
+
     const agent = new voice.Agent({
       instructions:
         "You are a helpful assistant, you can hear the user's message and respond to it.",
@@
-    const logger = log();
🤖 Fix all issues with AI agents
In `@agents/src/voice/audio_recognition.ts`:
- Around line 897-903: The two messages logged with this.logger.warn in the
interruption event loop are routine and should be lowered to debug/trace level;
locate the block where Promise.race([eventReader.read(), abortPromise]) is
awaited (the code that calls eventReader.read(), checks done/value and then
calls this.onInterruptionEvent(ev)) and replace this.logger.warn('waiting for
interruption event') and this.logger.warn('got interruption event') with
this.logger.debug (or this.logger.trace) so normal operation doesn't produce
warnings in production logs.
🧹 Nitpick comments (1)
agents/src/voice/audio_recognition.ts (1)

851-914: Interruption task structure is sound but cleanup ordering could be clearer.

The task correctly forwards frames to the detector stream and consumes output events. The cleanup function references eventReader defined later (line 887), which works due to closure semantics but is mildly confusing. Consider moving the declaration earlier or inlining cleanup.

📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 96d6b57 and 2ee2748.

📒 Files selected for processing (3)
  • agents/src/voice/agent_session.ts
  • agents/src/voice/audio_recognition.ts
  • examples/src/basic_agent.ts
🧰 Additional context used
📓 Path-based instructions (3)
**/*.{ts,tsx,js,jsx}

📄 CodeRabbit inference engine (.cursor/rules/agent-core.mdc)

Add SPDX-FileCopyrightText and SPDX-License-Identifier headers to all newly added files with '// SPDX-FileCopyrightText: 2025 LiveKit, Inc.' and '// SPDX-License-Identifier: Apache-2.0'

Files:

  • examples/src/basic_agent.ts
  • agents/src/voice/audio_recognition.ts
  • agents/src/voice/agent_session.ts
**/*.{ts,tsx}?(test|example|spec)

📄 CodeRabbit inference engine (.cursor/rules/agent-core.mdc)

When testing inference LLM, always use full model names from agents/src/inference/models.ts (e.g., 'openai/gpt-4o-mini' instead of 'gpt-4o-mini')

Files:

  • examples/src/basic_agent.ts
  • agents/src/voice/audio_recognition.ts
  • agents/src/voice/agent_session.ts
**/*.{ts,tsx}?(test|example)

📄 CodeRabbit inference engine (.cursor/rules/agent-core.mdc)

Initialize logger before using any LLM functionality with initializeLogger({ pretty: true }) from '@livekit/agents'

Files:

  • examples/src/basic_agent.ts
  • agents/src/voice/audio_recognition.ts
  • agents/src/voice/agent_session.ts
🧠 Learnings (2)
📚 Learning: 2026-01-16T14:33:39.551Z
Learnt from: CR
Repo: livekit/agents-js PR: 0
File: .cursor/rules/agent-core.mdc:0-0
Timestamp: 2026-01-16T14:33:39.551Z
Learning: Applies to **/*.{ts,tsx}?(test|example) : Initialize logger before using any LLM functionality with `initializeLogger({ pretty: true })` from '@livekit/agents'

Applied to files:

  • examples/src/basic_agent.ts
📚 Learning: 2026-01-16T14:33:39.551Z
Learnt from: CR
Repo: livekit/agents-js PR: 0
File: .cursor/rules/agent-core.mdc:0-0
Timestamp: 2026-01-16T14:33:39.551Z
Learning: Applies to **/*.{ts,tsx}?(test|example|spec) : When testing inference LLM, always use full model names from `agents/src/inference/models.ts` (e.g., 'openai/gpt-4o-mini' instead of 'gpt-4o-mini')

Applied to files:

  • examples/src/basic_agent.ts
  • agents/src/voice/agent_session.ts
🧬 Code graph analysis (3)
examples/src/basic_agent.ts (3)
agents/src/voice/agent.ts (1)
  • session (190-192)
plugins/openai/src/realtime/realtime_model.ts (1)
  • session (305-307)
agents/src/index.ts (1)
  • voice (39-39)
agents/src/voice/audio_recognition.ts (6)
agents/src/inference/interruption/types.ts (1)
  • InterruptionEvent (17-28)
agents/src/inference/interruption/interruption_detector.ts (1)
  • AdaptiveInterruptionDetector (25-189)
agents/src/utils.ts (3)
  • waitForAbort (853-862)
  • done (141-143)
  • done (525-527)
agents/src/stt/stt.ts (1)
  • SpeechEvent (65-70)
agents/src/stream/stream_channel.ts (2)
  • StreamChannel (7-14)
  • createStreamChannel (16-60)
agents/src/inference/interruption/interruption_stream.ts (2)
  • InterruptionStreamSentinel (41-64)
  • stream (338-340)
agents/src/voice/agent_session.ts (5)
agents/src/voice/turn_config/turn_handling.ts (2)
  • TurnHandlingConfig (12-34)
  • InternalTurnHandlingConfig (36-39)
agents/src/inference/interruption/types.ts (1)
  • InterruptionEvent (17-28)
agents/src/voice/run_context.ts (1)
  • UnknownUserData (8-8)
agents/src/voice/turn_config/interruption.ts (1)
  • InterruptionConfig (7-40)
agents/src/voice/turn_config/utils.ts (1)
  • migrateLegacyOptions (14-63)
🔇 Additional comments (6)
agents/src/voice/agent_session.ts (3)

73-118: Well-structured SessionOptions with proper deprecation path.

The new SessionOptions interface cleanly separates turn handling configuration under turnHandling, while maintaining backward compatibility through deprecated fields with clear JSDoc migration hints. The InternalSessionOptions extension ensures runtime access to fully resolved config.


217-268: Clean migration and initialization flow.

The constructor properly leverages migrateLegacyOptions to normalize input, then destructures sessionOptions to populate both turnDetection and _interruptionDetection. The assignment this.options = opts.options at line 264 correctly references the migrated InternalSessionOptions.


135-136: New interruption events properly typed.

The UserInterruptionDetected and UserNonInterruptionDetected callbacks correctly use InterruptionEvent from inference/interruption/types.ts, matching the event structure that includes type, timestamp, isInterruption, and probability fields.

agents/src/voice/audio_recognition.ts (3)

133-144: Input stream tee setup correctly supports three consumers.

The triple tee (vadInputStream, inputStream for interruption, sttInputStream) properly distributes the deferred input to VAD, interruption detection, and STT pipelines. The interruption channel is initialized and fed via addStreamInput.


182-233: Lifecycle hooks for agent speech events are well-structured.

onStartOfAgentSpeech, onEndOfAgentSpeech, onStartOfOverlapSpeech, and onEndOfOverlapSpeech properly coordinate interruption state and sentinel forwarding. The logic in onEndOfAgentSpeech correctly handles the case where no overlap occurred.


285-286: Transcript flush gating now correctly handles index 0.

The condition emitFromIndex !== null && shouldFlush properly allows slicing from index 0, fixing the previous issue where emitFromIndex && shouldFlush !== undefined treated 0 as falsy.
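The index-0 pitfall described above can be reproduced in isolation. The following is a minimal sketch, assuming a hypothetical helper that slices a word list; only the `emitFromIndex !== null && shouldFlush` condition comes from the review comment, and the function names are illustrative.

```typescript
// Hypothetical helper: returns the words that would be flushed.
function transcriptToFlush(
  words: string[],
  emitFromIndex: number | null,
  shouldFlush: boolean,
): string[] {
  // Explicit null check: index 0 is a valid starting position.
  if (emitFromIndex !== null && shouldFlush) {
    return words.slice(emitFromIndex);
  }
  return [];
}

// Contrast: a truthiness check treats index 0 the same as "no index".
function transcriptToFlushTruthy(
  words: string[],
  emitFromIndex: number | null,
  shouldFlush: boolean,
): string[] {
  if (emitFromIndex && shouldFlush) {
    return words.slice(emitFromIndex);
  }
  return [];
}
```

With `emitFromIndex` set to 0, the first variant returns the whole list while the truthiness variant returns nothing.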

✏️ Tip: You can disable this entire section by setting review_details to false in your review settings.


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 7

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
agents/src/voice/agent_activity.ts (1)

710-765: ⚠️ Potential issue | 🟠 Major

Inference interruption events shouldn’t be blocked by minWords gating.

onInterruption() routes through interruptByAudioActivity(), which applies the STT word-count filter. That can suppress a confirmed inference interruption (e.g., a one-word “stop”). Consider a bypass flag for inference-driven interrupts.

🔧 Suggested fix
-  private interruptByAudioActivity(): void {
+  private interruptByAudioActivity(options: { skipMinWords?: boolean } = {}): void {
...
-    if (
-      this.stt &&
-      this.agentSession.options.turnHandling.interruption?.minWords > 0 &&
-      this.audioRecognition
-    ) {
+    if (
+      !options.skipMinWords &&
+      this.stt &&
+      this.agentSession.options.turnHandling.interruption?.minWords > 0 &&
+      this.audioRecognition
+    ) {
...
   onInterruption(ev: InterruptionEvent) {
     this.restoreInterruptionByAudioActivity();
-    this.interruptByAudioActivity();
+    this.interruptByAudioActivity({ skipMinWords: true });
     if (this.audioRecognition) {
       this.audioRecognition.onEndOfAgentSpeech(ev.overlapSpeechStartedAt || ev.timestamp);
     }
   }
agents/src/voice/audio_recognition.ts (1)

112-145: ⚠️ Potential issue | 🟠 Major

Conditionally create interruption stream channel only when detection is enabled.

Lines 133–145 unconditionally create and feed the interruption stream channel with a tee branch from the audio input, even when interruptionDetection is undefined. If the channel has no reader (which occurs when createInterruptionTask returns early at line 855), the channel's queue can fill and cause backpressure on the tee, stalling VAD/STT audio processing.

Make interruptionStreamChannel optional and only create/feed it when opts.interruptionDetection is truthy. Add existence checks at lines 334 and 858 before accessing the channel.

This applies to all three locations: lines 133–145 (tee/channel setup), lines 331–345 (trySendInterruptionSentinel), and lines 851–859 (createInterruptionTask).
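One way to picture the suggestion is a channel that only exists when detection is configured. This is a rough sketch with hypothetical stand-ins (`FrameChannelSketch`, `RecognitionSketch`, numeric frames); it does not reproduce the actual `StreamChannel` or `AudioRecognition` API.

```typescript
// Hypothetical stand-in for a stream channel: it just queues frames.
class FrameChannelSketch {
  readonly queue: number[] = [];
  write(frame: number): void {
    this.queue.push(frame);
  }
}

class RecognitionSketch {
  // Optional: stays undefined unless interruption detection is configured.
  private interruptionChannel?: FrameChannelSketch;

  constructor(opts: { interruptionDetection?: object }) {
    if (opts.interruptionDetection) {
      this.interruptionChannel = new FrameChannelSketch();
    }
  }

  pushFrame(frame: number): void {
    // Forwarding to VAD/STT is omitted here.
    // Existence check: without a channel nothing is queued, so nothing piles up unread.
    this.interruptionChannel?.write(frame);
  }

  get pendingInterruptionFrames(): number {
    return this.interruptionChannel?.queue.length ?? 0;
  }
}
```

The optional chaining in `pushFrame` plays the role of the existence checks the comment asks for in `trySendInterruptionSentinel` and `createInterruptionTask`.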

🤖 Fix all issues with AI agents
In `@agents/src/inference/interruption/defaults.ts`:
- Around line 1-3: Update the SPDX header in defaults.ts to use the required
2025 year: replace the top comment line that currently reads "//
SPDX-FileCopyrightText: 2026 LiveKit, Inc." with "// SPDX-FileCopyrightText:
2025 LiveKit, Inc." (leave the existing "// SPDX-License-Identifier: Apache-2.0"
unchanged) so the file conforms to the project's SPDX header rule for new TS
files.

In `@agents/src/inference/interruption/interruption_detector.ts`:
- Around line 1-3: The SPDX header in the file currently reads "//
SPDX-FileCopyrightText: 2026 LiveKit, Inc." (with "//
SPDX-License-Identifier: Apache-2.0"). Update the year in the
SPDX-FileCopyrightText header from 2026 to 2025 so it reads "//
SPDX-FileCopyrightText: 2025 LiveKit, Inc." while keeping the
SPDX-License-Identifier line unchanged; this fixes the header at the top of
interruption_detector.ts.
- Around line 37-59: The merged destructuring sets baseUrl from
interruptionOptionDefaults so env overrides never apply; change computation of
lkBaseUrl to derive from the raw options first (e.g., use options.baseUrl ??
process.env.LIVEKIT_REMOTE_EOT_URL ?? getDefaultInferenceUrl()) instead of using
the already-merged baseUrl, or alternatively move getDefaultInferenceUrl() into
interruptionOptionDefaults so defaults behave as intended; update the lkBaseUrl
assignment and keep the rest of the logic (isInferenceUrl, lkApiKey,
lkApiSecret, useProxy) unchanged, referencing interruptionOptionDefaults,
options, and lkBaseUrl in interruption_detector.ts.
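The resolution order proposed above can be sketched as follows. This is an illustration only: `DetectorOptionsSketch`, `getDefaultInferenceUrlSketch`, and the placeholder URL are assumptions, and the environment is passed in as a plain object rather than read from `process.env`.

```typescript
interface DetectorOptionsSketch {
  baseUrl?: string;
}
type Env = Record<string, string | undefined>;

// Placeholder for getDefaultInferenceUrl(); the real function lives in defaults.ts.
const getDefaultInferenceUrlSketch = (): string => 'https://inference.example.invalid';

// Order from the review comment: explicit option, then env var, then default.
function resolveBaseUrl(options: DetectorOptionsSketch, env: Env): string {
  return options.baseUrl ?? env['LIVEKIT_REMOTE_EOT_URL'] ?? getDefaultInferenceUrlSketch();
}

// Contrast: merging a default baseUrl first leaves it defined whenever the
// caller omits the option, so the env var fallback is never reached.
function resolveBaseUrlMergedFirst(options: DetectorOptionsSketch, env: Env): string {
  const merged: DetectorOptionsSketch = { baseUrl: getDefaultInferenceUrlSketch(), ...options };
  return merged.baseUrl ?? env['LIVEKIT_REMOTE_EOT_URL'] ?? getDefaultInferenceUrlSketch();
}
```

When the caller omits `baseUrl` and the variable is set, the first function returns the environment value while the merged-first variant still returns the default.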

In `@agents/src/inference/interruption/interruption_stream.ts`:
- Around line 1-3: Update the SPDX header at the top of the file by changing the
year from 2026 to 2025 so the two header lines read the required values
(SPDX-FileCopyrightText with the 2025 year and SPDX-License-Identifier). Locate
the existing header lines at the top of interruption_stream.ts (the lines
starting with "// SPDX-FileCopyrightText:" and "// SPDX-License-Identifier:")
and edit the year token to 2025 to comply with the repository guideline.

In `@agents/src/inference/interruption/utils.ts`:
- Around line 92-101: The calculation of minWindow inside estimateProbability
can become 0 when windowSizeInS ≤ 0, causing slidingWindowMinMax to return
Infinity; update estimateProbability to clamp minWindow to a minimum of 1 (or
throw on non‑positive windowSizeInS) before using it—i.e., after computing
minWindow = Math.ceil(windowSizeInS / FRAME_DURATION_IN_S) ensure minWindow =
Math.max(1, minWindow) (or validate windowSizeInS and reject non‑positive
values) so slidingWindowMinMax never receives 0.
- Around line 1-3: The SPDX header year at the top of the new TS file is
incorrect (2026); update the top comment to use 2025 by replacing the existing
'// SPDX-FileCopyrightText: 2026 LiveKit, Inc.' line with '//
SPDX-FileCopyrightText: 2025 LiveKit, Inc.' and leave '//
SPDX-License-Identifier: Apache-2.0' as-is so the file complies with the
repository SPDX header guideline.

In `@agents/src/voice/agent_activity.ts`:
- Around line 2459-2492: resolveInterruptionDetector currently instantiates
AdaptiveInterruptionDetector whenever the configuration is compatible, which
causes startup errors when interruption.mode is undefined; change the logic so
the detector is only created when interruptionDetection is explicitly 'adaptive'
(i.e. check interruptionDetection === 'adaptive' before new
AdaptiveInterruptionDetector()). Update the return path to return undefined for
all other cases; touch the resolveInterruptionDetector method and the
interruptionDetection local variable checks so that AdaptiveInterruptionDetector
is only constructed on an explicit 'adaptive' setting.
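The explicit-opt-in rule can be reduced to a few lines. This sketch uses a stub class and a loosely typed `mode` parameter as assumptions; the real `resolveInterruptionDetector` inspects more configuration than shown here.

```typescript
// Stub standing in for AdaptiveInterruptionDetector.
class AdaptiveDetectorStub {}

// Only an explicit 'adaptive' setting constructs a detector;
// undefined, false, or any other mode yields undefined.
function resolveInterruptionDetectorSketch(
  mode: string | false | undefined,
): AdaptiveDetectorStub | undefined {
  return mode === 'adaptive' ? new AdaptiveDetectorStub() : undefined;
}
```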
📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 2ee2748 and e28b1b1.

📒 Files selected for processing (8)
  • agents/src/inference/interruption/defaults.ts
  • agents/src/inference/interruption/interruption_detector.ts
  • agents/src/inference/interruption/interruption_stream.ts
  • agents/src/inference/interruption/utils.ts
  • agents/src/inference/interruption/ws_transport.ts
  • agents/src/stream/stream_channel.ts
  • agents/src/voice/agent_activity.ts
  • agents/src/voice/audio_recognition.ts
🚧 Files skipped from review as they are similar to previous changes (2)
  • agents/src/stream/stream_channel.ts
  • agents/src/inference/interruption/ws_transport.ts
🧰 Additional context used
📓 Path-based instructions (3)
**/*.{ts,tsx,js,jsx}

📄 CodeRabbit inference engine (.cursor/rules/agent-core.mdc)

Add SPDX-FileCopyrightText and SPDX-License-Identifier headers to all newly added files with '// SPDX-FileCopyrightText: 2025 LiveKit, Inc.' and '// SPDX-License-Identifier: Apache-2.0'

Files:

  • agents/src/inference/interruption/interruption_detector.ts
  • agents/src/inference/interruption/defaults.ts
  • agents/src/inference/interruption/utils.ts
  • agents/src/voice/agent_activity.ts
  • agents/src/inference/interruption/interruption_stream.ts
  • agents/src/voice/audio_recognition.ts
**/*.{ts,tsx}?(test|example|spec)

📄 CodeRabbit inference engine (.cursor/rules/agent-core.mdc)

When testing inference LLM, always use full model names from agents/src/inference/models.ts (e.g., 'openai/gpt-4o-mini' instead of 'gpt-4o-mini')

Files:

  • agents/src/inference/interruption/interruption_detector.ts
  • agents/src/inference/interruption/defaults.ts
  • agents/src/inference/interruption/utils.ts
  • agents/src/voice/agent_activity.ts
  • agents/src/inference/interruption/interruption_stream.ts
  • agents/src/voice/audio_recognition.ts
**/*.{ts,tsx}?(test|example)

📄 CodeRabbit inference engine (.cursor/rules/agent-core.mdc)

Initialize logger before using any LLM functionality with initializeLogger({ pretty: true }) from '@livekit/agents'

Files:

  • agents/src/inference/interruption/interruption_detector.ts
  • agents/src/inference/interruption/defaults.ts
  • agents/src/inference/interruption/utils.ts
  • agents/src/voice/agent_activity.ts
  • agents/src/inference/interruption/interruption_stream.ts
  • agents/src/voice/audio_recognition.ts
🧬 Code graph analysis (3)
agents/src/inference/interruption/interruption_detector.ts (4)
agents/src/inference/interruption/types.ts (1)
  • InterruptionEvent (17-28)
agents/src/inference/interruption/errors.ts (1)
  • InterruptionDetectionError (7-25)
agents/src/inference/interruption/interruption_stream.ts (2)
  • InterruptionStreamBase (77-376)
  • stream (338-340)
agents/src/inference/interruption/defaults.ts (6)
  • interruptionOptionDefaults (69-82)
  • getDefaultInferenceUrl (31-46)
  • DEFAULT_BASE_URL (18-18)
  • STAGING_BASE_URL (21-21)
  • SAMPLE_RATE (13-13)
  • FRAMES_PER_SECOND (14-14)
agents/src/inference/interruption/defaults.ts (1)
agents/src/inference/interruption/types.ts (2)
  • ApiConnectOptions (51-55)
  • InterruptionOptions (33-46)
agents/src/voice/agent_activity.ts (5)
agents/src/voice/audio_recognition.ts (2)
  • RecognitionHooks (42-53)
  • AudioRecognition (74-1023)
agents/src/voice/agent_session.ts (2)
  • TurnDetectionMode (123-123)
  • interruptionDetection (304-306)
agents/src/inference/interruption/interruption_detector.ts (1)
  • AdaptiveInterruptionDetector (27-192)
agents/src/inference/interruption/types.ts (1)
  • InterruptionEvent (17-28)
agents/src/voice/agent.ts (1)
  • interruptionDetection (194-196)
🔇 Additional comments (1)
agents/src/inference/interruption/interruption_stream.ts (1)

372-375: Good cleanup: stream unregisters from detector on close.

Prevents stale streams from receiving updates and avoids leaks.


Comment on lines +92 to +101
export function estimateProbability(
  probabilities: number[],
  windowSizeInS: number = MIN_INTERRUPTION_DURATION_IN_S,
): number {
  const minWindow = Math.ceil(windowSizeInS / FRAME_DURATION_IN_S);
  if (probabilities.length < minWindow) {
    return 0;
  }

  return slidingWindowMinMax(probabilities, minWindow);

⚠️ Potential issue | 🟡 Minor

Clamp minWindow to avoid Infinity when windowSizeInS ≤ 0.

If a caller configures minInterruptionDurationInS as 0, minWindow becomes 0 and slidingWindowMinMax returns Infinity. Clamp to ≥1 (or explicitly reject non‑positive values).

🔧 Suggested fix
-  const minWindow = Math.ceil(windowSizeInS / FRAME_DURATION_IN_S);
+  const minWindow = Math.max(1, Math.ceil(windowSizeInS / FRAME_DURATION_IN_S));
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
 export function estimateProbability(
   probabilities: number[],
   windowSizeInS: number = MIN_INTERRUPTION_DURATION_IN_S,
 ): number {
-  const minWindow = Math.ceil(windowSizeInS / FRAME_DURATION_IN_S);
+  const minWindow = Math.max(1, Math.ceil(windowSizeInS / FRAME_DURATION_IN_S));
   if (probabilities.length < minWindow) {
     return 0;
   }
   return slidingWindowMinMax(probabilities, minWindow);
🤖 Prompt for AI Agents
In `@agents/src/inference/interruption/utils.ts` around lines 92 - 101, The
calculation of minWindow inside estimateProbability can become 0 when
windowSizeInS ≤ 0, causing slidingWindowMinMax to return Infinity; update
estimateProbability to clamp minWindow to a minimum of 1 (or throw on
non‑positive windowSizeInS) before using it—i.e., after computing minWindow =
Math.ceil(windowSizeInS / FRAME_DURATION_IN_S) ensure minWindow = Math.max(1,
minWindow) (or validate windowSizeInS and reject non‑positive values) so
slidingWindowMinMax never receives 0.
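To see why a zero-sized window yields Infinity, here is a self-contained sketch. Two things are assumptions made for illustration: the value of `FRAME_DURATION_IN_S`, and the semantics of `slidingWindowMinMax` (taken here to be the largest of the per-window minimums, which is consistent with the Infinity result described in the comment but is not confirmed by the PR).

```typescript
// Assumed value for illustration; the real constant is defined in the PR.
const FRAME_DURATION_IN_S = 0.025;

// Assumed semantics: the largest of the per-window minimums.
function slidingWindowMinMax(values: number[], windowSize: number): number {
  let best = -Infinity;
  for (let start = 0; start + windowSize <= values.length; start++) {
    // For windowSize 0 the slice is empty and Math.min() returns Infinity.
    const windowMin = Math.min(...values.slice(start, start + windowSize));
    best = Math.max(best, windowMin);
  }
  return best;
}

// Clamped variant, as in the suggested fix.
function estimateProbability(probabilities: number[], windowSizeInS: number): number {
  const minWindow = Math.max(1, Math.ceil(windowSizeInS / FRAME_DURATION_IN_S));
  if (probabilities.length < minWindow) {
    return 0;
  }
  return slidingWindowMinMax(probabilities, minWindow);
}
```

Under these assumptions, a window size of 0 is treated as a single-frame window instead of reaching `slidingWindowMinMax` with 0.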


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@agents/src/voice/turn_config/utils.ts`:
- Around line 1-3: Update the SPDX header year from 2026 to 2025 by replacing
the string "// SPDX-FileCopyrightText: 2026 LiveKit, Inc." with "//
SPDX-FileCopyrightText: 2025 LiveKit, Inc." so the file's SPDX headers match the
required new-file convention alongside the existing "// SPDX-License-Identifier:
Apache-2.0" line.
📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e28b1b1 and 4310baa.

📒 Files selected for processing (1)
  • agents/src/voice/turn_config/utils.ts
🧰 Additional context used
📓 Path-based instructions (3)
**/*.{ts,tsx,js,jsx}

📄 CodeRabbit inference engine (.cursor/rules/agent-core.mdc)

Add SPDX-FileCopyrightText and SPDX-License-Identifier headers to all newly added files with '// SPDX-FileCopyrightText: 2025 LiveKit, Inc.' and '// SPDX-License-Identifier: Apache-2.0'

Files:

  • agents/src/voice/turn_config/utils.ts
**/*.{ts,tsx}?(test|example|spec)

📄 CodeRabbit inference engine (.cursor/rules/agent-core.mdc)

When testing inference LLM, always use full model names from agents/src/inference/models.ts (e.g., 'openai/gpt-4o-mini' instead of 'gpt-4o-mini')

Files:

  • agents/src/voice/turn_config/utils.ts
**/*.{ts,tsx}?(test|example)

📄 CodeRabbit inference engine (.cursor/rules/agent-core.mdc)

Initialize logger before using any LLM functionality with initializeLogger({ pretty: true }) from '@livekit/agents'

Files:

  • agents/src/voice/turn_config/utils.ts
🧬 Code graph analysis (1)
agents/src/voice/turn_config/utils.ts (5)
agents/src/voice/agent_session.ts (3)
  • AgentSessionOptions (139-152)
  • InternalSessionOptions (109-111)
  • defaultSessionOptions (113-118)
agents/src/voice/agent_activity.ts (1)
  • turnDetection (379-382)
agents/src/voice/turn_config/turn_handling.ts (2)
  • TurnHandlingConfig (12-34)
  • defaultTurnHandlingConfig (41-45)
agents/src/voice/turn_config/endpointing.ts (1)
  • defaultEndpointingConfig (23-26)
agents/src/voice/turn_config/interruption.ts (1)
  • defaultInterruptionConfig (42-49)
🔇 Additional comments (3)
agents/src/voice/turn_config/utils.ts (3)

14-71: LGTM.

Migration flow, precedence rules, and turnDetection preservation look solid.


73-76: LGTM.

stripUndefined does exactly what it should and keeps defaults clean.


78-83: LGTM.

Default merging is clear and consistent with the default config constants.
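The reason a `stripUndefined` step matters before merging can be shown with a small sketch. The config shape (`minDelay`, `maxDelay`) and the default values are hypothetical, and the helper below is an illustrative reimplementation rather than the code in `turn_config/utils.ts`.

```typescript
// Hypothetical config shape and defaults, for illustration only.
interface EndpointingSketch {
  minDelay: number;
  maxDelay: number;
}
type EndpointingOverrides = { minDelay?: number | undefined; maxDelay?: number | undefined };

const endpointingDefaultsSketch: EndpointingSketch = { minDelay: 500, maxDelay: 3000 };

// Drops keys whose value is undefined so they cannot overwrite defaults.
function stripUndefinedSketch<T extends object>(obj: T): Partial<T> {
  const out: Partial<T> = {};
  for (const key of Object.keys(obj) as (keyof T)[]) {
    if (obj[key] !== undefined) {
      out[key] = obj[key];
    }
  }
  return out;
}

function mergeWithDefaultsSketch(overrides: EndpointingOverrides): EndpointingSketch {
  return { ...endpointingDefaultsSketch, ...stripUndefinedSketch(overrides) };
}

// Contrast: a plain spread lets an explicit undefined replace the default.
const naiveMerge = { ...endpointingDefaultsSketch, ...{ minDelay: undefined } };
```

Here `mergeWithDefaultsSketch({ minDelay: undefined, maxDelay: 1000 })` keeps the default `minDelay`, whereas `naiveMerge.minDelay` ends up undefined.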


Remove baseUrl and useProxy from interruptionOptionDefaults so they
are resolved dynamically in the constructor. Previously, the defaults
pre-populated baseUrl with the cloud inference URL, which prevented
the LIVEKIT_REMOTE_EOT_URL environment variable from being used.

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@agents/src/inference/interruption/defaults.ts`:
- Around line 58-81: interruptionOptionDefaults currently hardcodes
inferenceTimeout to 1_000 (ms) instead of using the exported
REMOTE_INFERENCE_TIMEOUT_IN_S constant; update the inferenceTimeout property on
interruptionOptionDefaults to use REMOTE_INFERENCE_TIMEOUT_IN_S * 1000 so the
timeout remains in milliseconds and will change if REMOTE_INFERENCE_TIMEOUT_IN_S
is modified (refer to interruptionOptionDefaults and
REMOTE_INFERENCE_TIMEOUT_IN_S to locate the change).
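A minimal sketch of the derived default, assuming the constant equals 1 second (which would match the hardcoded 1_000 ms mentioned above); the object name is illustrative and omits the other default fields.

```typescript
// Assumed value: 1 second, matching the hardcoded 1_000 ms.
const REMOTE_INFERENCE_TIMEOUT_IN_S = 1;

// Only the timeout field is shown; the real defaults object has more properties.
const interruptionOptionDefaultsSketch = {
  // Derived in milliseconds so it follows any change to the constant.
  inferenceTimeout: REMOTE_INFERENCE_TIMEOUT_IN_S * 1000,
};
```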
📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 4310baa and 175e57b.

📒 Files selected for processing (1)
  • agents/src/inference/interruption/defaults.ts
🧰 Additional context used
📓 Path-based instructions (3)
**/*.{ts,tsx,js,jsx}

📄 CodeRabbit inference engine (.cursor/rules/agent-core.mdc)

Add SPDX-FileCopyrightText and SPDX-License-Identifier headers to all newly added files with '// SPDX-FileCopyrightText: 2025 LiveKit, Inc.' and '// SPDX-License-Identifier: Apache-2.0'

Files:

  • agents/src/inference/interruption/defaults.ts
**/*.{ts,tsx}?(test|example|spec)

📄 CodeRabbit inference engine (.cursor/rules/agent-core.mdc)

When testing inference LLM, always use full model names from agents/src/inference/models.ts (e.g., 'openai/gpt-4o-mini' instead of 'gpt-4o-mini')

Files:

  • agents/src/inference/interruption/defaults.ts
**/*.{ts,tsx}?(test|example)

📄 CodeRabbit inference engine (.cursor/rules/agent-core.mdc)

Initialize logger before using any LLM functionality with initializeLogger({ pretty: true }) from '@livekit/agents'

Files:

  • agents/src/inference/interruption/defaults.ts
🧠 Learnings (2)
📚 Learning: 2026-01-16T14:33:39.551Z
Learnt from: CR
Repo: livekit/agents-js PR: 0
File: .cursor/rules/agent-core.mdc:0-0
Timestamp: 2026-01-16T14:33:39.551Z
Learning: Applies to **/*.{ts,tsx,js,jsx} : Add SPDX-FileCopyrightText and SPDX-License-Identifier headers to all newly added files with '// SPDX-FileCopyrightText: 2025 LiveKit, Inc.' and '// SPDX-License-Identifier: Apache-2.0'

Applied to files:

  • agents/src/inference/interruption/defaults.ts
📚 Learning: 2026-01-16T14:33:39.551Z
Learnt from: CR
Repo: livekit/agents-js PR: 0
File: .cursor/rules/agent-core.mdc:0-0
Timestamp: 2026-01-16T14:33:39.551Z
Learning: Applies to **/*.{ts,tsx}?(test|example|spec) : When testing inference LLM, always use full model names from `agents/src/inference/models.ts` (e.g., 'openai/gpt-4o-mini' instead of 'gpt-4o-mini')

Applied to files:

  • agents/src/inference/interruption/defaults.ts


Contributor

@toubatbrian toubatbrian left a comment


Tested on my end and made a few minor fixes. Everything looks good to me. Given the size of this PR, I think it's best to merge it into main now so that any follow-up changes can be based directly on main, rather than on top of this. What do you think? @lukasIO

@toubatbrian
Contributor

Actually, on second thought, since this includes interface changes, we probably shouldn't merge it just yet. We probably need to wait until the full BargeIn feature is ready before merging.


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
examples/src/basic_agent.ts (1)

4-42: ⚠️ Potential issue | 🟡 Minor

Initialize the logger before LLM usage.
The example creates a logger but doesn’t call initializeLogger({ pretty: true }) before LLM-related setup. Please add it at the start of entry.

✅ Suggested fix
 import {
   type JobContext,
   type JobProcess,
   WorkerOptions,
   cli,
   defineAgent,
   llm,
   log,
+  initializeLogger,
   metrics,
   voice,
 } from '@livekit/agents';
@@
   entry: async (ctx: JobContext) => {
+    initializeLogger({ pretty: true });
     const agent = new voice.Agent({

As per coding guidelines: Initialize logger before using any LLM functionality with initializeLogger({ pretty: true }) from '@livekit/agents'.

🤖 Fix all issues with AI agents
In `@agents/src/voice/agent_activity.ts`:
- Around line 411-429: updateOptions currently sets turnDetectionMode and
isDefaultInterruptionByAudioActivityEnabled but doesn't sync the live runtime
flag isInterruptionByAudioActivityEnabled immediately, so switching to 'manual'
can leave interruptions enabled; update updateOptions to (1) set
this.turnDetectionMode and this.isDefaultInterruptionByAudioActivityEnabled as
you do, (2) if not currently speaking (use the existing speaking state or method
on the agent), set this.isInterruptionByAudioActivityEnabled to the new default
immediately, and (3) if an AudioRecognition instance or similar exists (or
this.realtimeSession manages audio), call its updateOptions/updateTurnDetection
path (e.g., this.realtimeSession.updateOptions(...) already used for toolChoice)
so runtime audio recognition honors the new turnDetection mode. Ensure you
reference updateOptions, turnDetectionMode,
isDefaultInterruptionByAudioActivityEnabled,
isInterruptionByAudioActivityEnabled, realtimeSession, and AudioRecognition (or
the audio component) when making the changes.
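The synchronization described in the prompt above can be sketched as a small state holder. Everything here is a simplified assumption: the class is hypothetical, the rule that only 'manual' disables audio-activity interruptions is a guess for illustration, and the forwarding to `AudioRecognition` or `realtimeSession` is left out.

```typescript
// Hypothetical, simplified stand-in for the relevant AgentActivity state.
class ActivitySketch {
  turnDetectionMode: string | undefined;
  isDefaultInterruptionByAudioActivityEnabled = true;
  isInterruptionByAudioActivityEnabled = true;
  speaking = false;

  updateOptions(opts: { turnDetection?: string }): void {
    if (opts.turnDetection === undefined) return;
    this.turnDetectionMode = opts.turnDetection;
    // Assumption: only 'manual' turns the default off.
    this.isDefaultInterruptionByAudioActivityEnabled = opts.turnDetection !== 'manual';
    // Sync the live flag right away unless speech is in progress;
    // in that case it is left for the end-of-speech path to restore.
    if (!this.speaking) {
      this.isInterruptionByAudioActivityEnabled = this.isDefaultInterruptionByAudioActivityEnabled;
    }
  }
}
```

Switching to 'manual' while idle disables the runtime flag immediately, while a switch during speech only changes the default.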

In `@agents/src/voice/agent_session.ts`:
- Around line 73-109: The doc for SessionOptions.userAwayTimeout is inconsistent
with its type and runtime behavior; update the documentation (or the type) so
they match: either make userAwayTimeout type number | null | undefined (so
omission or null disables it) or change the doc to state that setting
userAwayTimeout to null disables the feature (and that omission in Partial<T>
also disables it), and ensure the `@defaultValue` remains accurate; locate
SessionOptions and update the JSDoc for userAwayTimeout accordingly.
📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 0682f25 and a820521.

⛔ Files ignored due to path filters (1)
  • pnpm-lock.yaml is excluded by !**/pnpm-lock.yaml
📒 Files selected for processing (5)
  • agents/package.json
  • agents/src/voice/agent.ts
  • agents/src/voice/agent_activity.ts
  • agents/src/voice/agent_session.ts
  • examples/src/basic_agent.ts
🚧 Files skipped from review as they are similar to previous changes (2)
  • agents/package.json
  • agents/src/voice/agent.ts
🧰 Additional context used
📓 Path-based instructions (3)
**/*.{ts,tsx,js,jsx}

📄 CodeRabbit inference engine (.cursor/rules/agent-core.mdc)

Add SPDX-FileCopyrightText and SPDX-License-Identifier headers to all newly added files with '// SPDX-FileCopyrightText: 2025 LiveKit, Inc.' and '// SPDX-License-Identifier: Apache-2.0'

Files:

  • examples/src/basic_agent.ts
  • agents/src/voice/agent_session.ts
  • agents/src/voice/agent_activity.ts
**/*.{ts,tsx}?(test|example|spec)

📄 CodeRabbit inference engine (.cursor/rules/agent-core.mdc)

When testing inference LLM, always use full model names from agents/src/inference/models.ts (e.g., 'openai/gpt-4o-mini' instead of 'gpt-4o-mini')

Files:

  • examples/src/basic_agent.ts
  • agents/src/voice/agent_session.ts
  • agents/src/voice/agent_activity.ts
**/*.{ts,tsx}?(test|example)

📄 CodeRabbit inference engine (.cursor/rules/agent-core.mdc)

Initialize logger before using any LLM functionality with initializeLogger({ pretty: true }) from '@livekit/agents'

Files:

  • examples/src/basic_agent.ts
  • agents/src/voice/agent_session.ts
  • agents/src/voice/agent_activity.ts
🧠 Learnings (2)
📚 Learning: 2026-01-16T14:33:39.551Z
Learnt from: CR
Repo: livekit/agents-js PR: 0
File: .cursor/rules/agent-core.mdc:0-0
Timestamp: 2026-01-16T14:33:39.551Z
Learning: Applies to **/*.{ts,tsx}?(test|example) : Initialize logger before using any LLM functionality with `initializeLogger({ pretty: true })` from '@livekit/agents'

Applied to files:

  • examples/src/basic_agent.ts
📚 Learning: 2026-01-16T14:33:39.551Z
Learnt from: CR
Repo: livekit/agents-js PR: 0
File: .cursor/rules/agent-core.mdc:0-0
Timestamp: 2026-01-16T14:33:39.551Z
Learning: Applies to **/*.{ts,tsx}?(test|example|spec) : When testing inference LLM, always use full model names from `agents/src/inference/models.ts` (e.g., 'openai/gpt-4o-mini' instead of 'gpt-4o-mini')

Applied to files:

  • examples/src/basic_agent.ts
  • agents/src/voice/agent_session.ts
🔇 Additional comments (10)
agents/src/voice/agent_activity.ts (1)

11-12: Interruption wiring across the speech lifecycle looks consistent.
The added hooks and state toggles appear coherent and well-integrated.

Also applies to: 91-113, 215-223, 315-317, 375-378, 585-747, 765-771, 919-929, 1313-1390, 1539-1542, 1666-1709, 2529-2531

examples/src/basic_agent.ts (1)

61-74: Turn-handling config placement and interruption event hooks look good.

Also applies to: 93-99

agents/src/voice/agent_session.ts (8)

18-18: LGTM: interruption/turn-handling imports are clear and well scoped.

Also applies to: 66-71


111-124: LGTM: internal options, defaults, and legacy alias are clear.


138-139: LGTM: callback surface now exposes interruption events.


148-155: LGTM: SessionOptions container + deprecated fields are clear.


166-166: LGTM: options now typed as InternalSessionOptions.


197-197: LGTM: interruptionDetection getter cleanly exposes the config.

Also applies to: 307-309


217-219: LGTM: user speaking span is tracked and cleaned up consistently.

Also applies to: 776-788, 918-920


220-258: migrateLegacyOptions correctly ensures fully-populated InternalSessionOptions.

The implementation confirms that defaults are always applied. defaultSessionOptions is unconditionally merged at line 56, and turnHandling is normalized via mergeWithDefaults() which applies defaults to turnDetection, endpointing, and interruption sub-configs. Even when callers omit options or use only legacy fields, the returned options field always contains a complete InternalSessionOptions with all properties populated. The existing tests verify this across multiple scenarios.



@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
agents/src/voice/agent.ts (1)

106-118: ⚠️ Potential issue | 🟠 Major

allowInterruptions from AgentOptions is not being used.

The AgentOptions interface defines allowInterruptions?: boolean (line 77), but it's not destructured in the constructor and thus never passed to migrateLegacyOptions. Per the context in turn_config/utils.ts, the migration function checks if (mergedOptions?.allowInterruptions === false) to set interruption mode, but this condition will never trigger since allowInterruptions isn't passed.

Users setting allowInterruptions: false when constructing an Agent will have this option silently ignored.

🐛 Proposed fix to include allowInterruptions
   constructor({
     id,
     instructions,
     chatCtx,
     tools,
     turnDetection,
     stt,
     vad,
     llm,
     tts,
     turnHandling,
+    allowInterruptions,
     useTtsAlignedTranscript,
   }: AgentOptions<UserData>) {

Then update the migration call:

-    const migratedOptions = migrateLegacyOptions({ turnDetection, options: { turnHandling } });
+    const migratedOptions = migrateLegacyOptions({ turnDetection, options: { turnHandling, allowInterruptions } });
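
To illustrate why the omission matters, here is a minimal sketch. `LegacyOptions`, `migrate`, `buildDropped`, `buildForwarded`, and the `'adaptive'` fallback are hypothetical stand-ins, not the actual `migrateLegacyOptions` implementation; only the `allowInterruptions === false` check is taken from the comment above.

```typescript
// Hypothetical, simplified stand-in for the legacy-option migration.
type InterruptionMode = 'adaptive' | 'vad' | false;

interface LegacyOptions {
  allowInterruptions?: boolean;
  turnHandling?: { interruption?: { mode?: InterruptionMode } };
}

function migrate(options: LegacyOptions): { mode: InterruptionMode } {
  // An explicit legacy `allowInterruptions: false` disables interruptions.
  if (options.allowInterruptions === false) {
    return { mode: false };
  }
  // Otherwise use the new-style config, falling back to an assumed default.
  return { mode: options.turnHandling?.interruption?.mode ?? 'adaptive' };
}

// Caller that forgets to forward the legacy flag: the check above never fires.
function buildDropped({ turnHandling }: LegacyOptions) {
  return migrate({ turnHandling });
}

// Caller that forwards the flag: the legacy setting is honored.
function buildForwarded({ turnHandling, allowInterruptions }: LegacyOptions) {
  return migrate({ turnHandling, allowInterruptions });
}

const dropped = buildDropped({ allowInterruptions: false });
const forwarded = buildForwarded({ allowInterruptions: false });
```

In this sketch `dropped.mode` ends up as the assumed default while `forwarded.mode` is `false`, which is the silent difference the proposed fix is meant to close.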
📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between a820521 and 9cb0a29.

📒 Files selected for processing (2)
  • .changeset/silly-donkeys-shop.md
  • agents/src/voice/agent.ts
🧰 Additional context used
📓 Path-based instructions (3)
**/*.{ts,tsx,js,jsx}

📄 CodeRabbit inference engine (.cursor/rules/agent-core.mdc)

Add SPDX-FileCopyrightText and SPDX-License-Identifier headers to all newly added files with '// SPDX-FileCopyrightText: 2025 LiveKit, Inc.' and '// SPDX-License-Identifier: Apache-2.0'

Files:

  • agents/src/voice/agent.ts
**/*.{ts,tsx}?(test|example|spec)

📄 CodeRabbit inference engine (.cursor/rules/agent-core.mdc)

When testing inference LLM, always use full model names from agents/src/inference/models.ts (e.g., 'openai/gpt-4o-mini' instead of 'gpt-4o-mini')

Files:

  • agents/src/voice/agent.ts
**/*.{ts,tsx}?(test|example)

📄 CodeRabbit inference engine (.cursor/rules/agent-core.mdc)

Initialize logger before using any LLM functionality with initializeLogger({ pretty: true }) from '@livekit/agents'

Files:

  • agents/src/voice/agent.ts
🧬 Code graph analysis (1)
agents/src/voice/agent.ts (4)
agents/src/voice/agent_session.ts (1)
  • TurnDetectionMode (126-126)
agents/src/voice/turn_config/turn_handling.ts (1)
  • TurnHandlingConfig (12-34)
agents/src/voice/turn_config/interruption.ts (1)
  • InterruptionConfig (7-40)
agents/src/voice/turn_config/utils.ts (1)
  • migrateLegacyOptions (14-71)
🔇 Additional comments (5)
.changeset/silly-donkeys-shop.md (1)

1-5: Changeset looks correct and aligned with the PR scope.

The version bump and summary description match the reported refactor and new barge-in support.

agents/src/voice/agent.ts (4)

34-36: LGTM!

The imports for the new turn-handling and interruption configuration types are appropriate and all are used in the file.


89-91: LGTM!

The private field declarations are appropriate. _interruptionDetection uses the correct type from InterruptionConfig['mode'] which already includes undefined as a valid value.


142-144: LGTM!

The migration logic correctly leverages migrateLegacyOptions to merge legacy turnDetection with the new turnHandling configuration. The initialization of interruption-related fields properly handles all mode values ('adaptive', 'vad', false, undefined) with correct boolean coercion.

Also applies to: 165-168


214-220: LGTM!

The new public getters for interruptionDetection and allowInterruptions are well-typed and correctly expose the internal state for external consumers.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@agents/src/voice/audio_recognition.ts`:
- Around line 895-901: inputStartedAt is being overwritten on every frame in the
input loop causing timestamp drift; set inputStartedAt once when a new stream
starts (or when the stream is restarted) instead of inside the per-frame loop.
Locate the loop that awaits inputReader.read() (symbols: signal.aborted,
abortPromise, inputReader.read, stream.pushFrame) and move the Date.now()
assignment so it runs once when the stream is initialized (or guard it with a
check like if (!this.inputStartedAt) this.inputStartedAt = Date.now()) and
ensure you explicitly reset inputStartedAt when the stream terminates/restarts
so future streams get a fresh base timestamp.
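
A minimal sketch of the set-once pattern described above, assuming a hypothetical `FrameClock` class with an injectable clock (neither exists in the PR). It uses nullish coalescing assignment rather than a truthiness check so that a legitimate timestamp of `0` would not be overwritten.

```typescript
// Hypothetical frame forwarder; `now` is injectable so the behavior is easy to verify.
class FrameClock {
  inputStartedAt: number | undefined;
  private readonly now: () => number;

  constructor(now: () => number = Date.now) {
    this.now = now;
  }

  // Called once per frame; only the first call records the base timestamp.
  onFrame(): void {
    this.inputStartedAt ??= this.now();
  }

  // Called when the stream ends or restarts so the next stream gets a fresh base.
  reset(): void {
    this.inputStartedAt = undefined;
  }
}

let tick = 1000;
const clock = new FrameClock(() => tick++);
clock.onFrame(); // records the first clock value
clock.onFrame(); // no-op: the base timestamp is already set
const firstBase = clock.inputStartedAt;
clock.reset();
clock.onFrame(); // records a fresh base for the new stream
const secondBase = clock.inputStartedAt;
```

Because the right-hand side of `??=` is only evaluated when the field is nullish, the clock is read once per stream rather than once per frame.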
📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 9cb0a29 and ec26bb1.

📒 Files selected for processing (5)
  • agents/src/inference/interruption/interruption_stream.ts
  • agents/src/voice/agent.test.ts
  • agents/src/voice/agent_activity.ts
  • agents/src/voice/agent_session.ts
  • agents/src/voice/audio_recognition.ts
🧰 Additional context used
📓 Path-based instructions (3)
**/*.{ts,tsx,js,jsx}

📄 CodeRabbit inference engine (.cursor/rules/agent-core.mdc)

Add SPDX-FileCopyrightText and SPDX-License-Identifier headers to all newly added files with '// SPDX-FileCopyrightText: 2025 LiveKit, Inc.' and '// SPDX-License-Identifier: Apache-2.0'

Files:

  • agents/src/voice/agent.test.ts
  • agents/src/inference/interruption/interruption_stream.ts
  • agents/src/voice/audio_recognition.ts
  • agents/src/voice/agent_activity.ts
  • agents/src/voice/agent_session.ts
**/*.{ts,tsx}?(test|example|spec)

📄 CodeRabbit inference engine (.cursor/rules/agent-core.mdc)

When testing inference LLM, always use full model names from agents/src/inference/models.ts (e.g., 'openai/gpt-4o-mini' instead of 'gpt-4o-mini')

Files:

  • agents/src/voice/agent.test.ts
  • agents/src/inference/interruption/interruption_stream.ts
  • agents/src/voice/audio_recognition.ts
  • agents/src/voice/agent_activity.ts
  • agents/src/voice/agent_session.ts
**/*.{ts,tsx}?(test|example)

📄 CodeRabbit inference engine (.cursor/rules/agent-core.mdc)

Initialize logger before using any LLM functionality with initializeLogger({ pretty: true }) from '@livekit/agents'

Files:

  • agents/src/voice/agent.test.ts
  • agents/src/inference/interruption/interruption_stream.ts
  • agents/src/voice/audio_recognition.ts
  • agents/src/voice/agent_activity.ts
  • agents/src/voice/agent_session.ts
🧠 Learnings (5)
📚 Learning: 2026-01-16T14:33:39.551Z
Learnt from: CR
Repo: livekit/agents-js PR: 0
File: .cursor/rules/agent-core.mdc:0-0
Timestamp: 2026-01-16T14:33:39.551Z
Learning: Applies to **/*.{ts,tsx}?(test|example) : Initialize logger before using any LLM functionality with `initializeLogger({ pretty: true })` from '@livekit/agents'

Applied to files:

  • agents/src/voice/agent.test.ts
📚 Learning: 2026-01-16T14:33:39.551Z
Learnt from: CR
Repo: livekit/agents-js PR: 0
File: .cursor/rules/agent-core.mdc:0-0
Timestamp: 2026-01-16T14:33:39.551Z
Learning: Applies to examples/src/test_*.ts : For plugin component debugging (STT, TTS, LLM), create test example files prefixed with `test_` under the examples directory and run with `pnpm build && node ./examples/src/test_my_plugin.ts`

Applied to files:

  • agents/src/voice/agent.test.ts
📚 Learning: 2026-01-16T14:33:39.551Z
Learnt from: CR
Repo: livekit/agents-js PR: 0
File: .cursor/rules/agent-core.mdc:0-0
Timestamp: 2026-01-16T14:33:39.551Z
Learning: Applies to **/{examples,test}/**/*.test.ts : Include both basic streaming and tool calling tests to verify full LLM functionality

Applied to files:

  • agents/src/voice/agent.test.ts
📚 Learning: 2026-01-16T14:33:39.551Z
Learnt from: CR
Repo: livekit/agents-js PR: 0
File: .cursor/rules/agent-core.mdc:0-0
Timestamp: 2026-01-16T14:33:39.551Z
Learning: Applies to **/*.{ts,tsx}?(test|example|spec) : When testing inference LLM, always use full model names from `agents/src/inference/models.ts` (e.g., 'openai/gpt-4o-mini' instead of 'gpt-4o-mini')

Applied to files:

  • agents/src/voice/agent.test.ts
  • agents/src/inference/interruption/interruption_stream.ts
  • agents/src/voice/agent_session.ts
📚 Learning: 2026-01-16T14:33:39.551Z
Learnt from: CR
Repo: livekit/agents-js PR: 0
File: .cursor/rules/agent-core.mdc:0-0
Timestamp: 2026-01-16T14:33:39.551Z
Learning: Applies to **/*.{ts,tsx,js,jsx} : Add SPDX-FileCopyrightText and SPDX-License-Identifier headers to all newly added files with '// SPDX-FileCopyrightText: 2025 LiveKit, Inc.' and '// SPDX-License-Identifier: Apache-2.0'

Applied to files:

  • agents/src/inference/interruption/interruption_stream.ts
🧬 Code graph analysis (2)
agents/src/voice/agent.test.ts (1)
agents/src/log.ts (1)
  • initializeLogger (34-40)
agents/src/voice/agent_session.ts (6)
agents/src/voice/turn_config/turn_handling.ts (2)
  • TurnHandlingConfig (12-34)
  • InternalTurnHandlingConfig (36-39)
agents/src/voice/index.ts (2)
  • VoiceOptions (5-5)
  • AgentSessionOptions (5-5)
agents/src/inference/interruption/types.ts (1)
  • InterruptionEvent (17-28)
agents/src/voice/run_context.ts (1)
  • UnknownUserData (8-8)
agents/src/voice/turn_config/interruption.ts (1)
  • InterruptionConfig (7-40)
agents/src/voice/turn_config/utils.ts (1)
  • migrateLegacyOptions (14-71)
🔇 Additional comments (1)
agents/src/voice/agent.test.ts (1)

4-12: LGTM! Logger initialization follows project conventions.

The beforeAll hook correctly initializes the logger before tests run, aligning with the coding guideline to initialize logger before using any LLM functionality. The relative import path ../log.js is appropriate for a test file within the agents/src directory.

Minor note: level: 'info' is the default value per agents/src/log.ts, so it could be omitted, but being explicit is fine for test clarity.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

🤖 Fix all issues with AI agents
In `@agents/src/voice/audio_recognition.ts`:
- Around line 881-889: The cleanup closure may call eventReader.releaseLock()
before eventReader is assigned, causing a TypeError; update cleanup (the async
function defined alongside abortHandler) to defensively check that eventReader
is defined (and optionally has releaseLock as a function) before calling
releaseLock, and likewise guard await stream.close() with a null/undefined
check; ensure you still removeEventListener('abort', abortHandler)
unconditionally but only call eventReader.releaseLock() and stream.close() when
those objects are non-null to avoid exceptions during early aborts.
- Around line 256-269: In flushHeldTranscripts(), avoid dropping held
transcripts when inputStartedAt is undefined: instead of clearing
transcriptBuffer and resetting ignoreUserTranscriptUntil when inputStartedAt is
falsy, retain the buffer and either log a warning or postpone flushing until
inputStartedAt is set; update the logic in flushHeldTranscripts (referencing
inputStartedAt, transcriptBuffer, ignoreUserTranscriptUntil) to preserve events
and emit a warning if onEndOfAgentSpeech triggers before any audio frames, or
queue a retry to flush once inputStartedAt becomes defined.
🧹 Nitpick comments (1)
agents/src/voice/audio_recognition.ts (1)

1035-1043: Consider adding an explicit return false for clarity.

The getter implicitly returns undefined for modes like 'manual' and 'stt'. While this works in boolean contexts, an explicit return false would make the intent clearer.

♻️ Suggested improvement
   private get vadBaseTurnDetection() {
     if (typeof this.turnDetectionMode === 'object') {
       return false;
     }
 
     if (this.turnDetectionMode === undefined || this.turnDetectionMode === 'vad') {
       return true;
     }
+    return false;
   }
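
The difference can be sketched as follows. `Mode`, `implicitResult`, and `explicitResult` are hypothetical; the real `TurnDetectionMode` type is defined in agent_session.ts and may have different members.

```typescript
// Hypothetical mode type, assumed for illustration only.
type Mode = 'vad' | 'stt' | 'manual' | { custom: true } | undefined;

// Mirrors the getter as written: the fall-through path yields undefined.
function implicitResult(mode: Mode): boolean | undefined {
  if (typeof mode === 'object') return false;
  if (mode === undefined || mode === 'vad') return true;
  return undefined; // what a missing return statement evaluates to
}

// Mirrors the suggested change: every path returns a boolean.
function explicitResult(mode: Mode): boolean {
  if (typeof mode === 'object') return false;
  if (mode === undefined || mode === 'vad') return true;
  return false;
}

const looseCheck = !implicitResult('stt'); // true: undefined is falsy
const strictCheck = implicitResult('stt') === false; // false: undefined !== false
const fixedCheck = explicitResult('stt') === false; // true
```

Both versions behave the same in `if (...)` conditions; they only diverge when a caller compares strictly against `false` or relies on the inferred return type.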
📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between b5c541f and 76bd4e8.

📒 Files selected for processing (1)
  • agents/src/voice/audio_recognition.ts
🧰 Additional context used
📓 Path-based instructions (3)
**/*.{ts,tsx,js,jsx}

📄 CodeRabbit inference engine (.cursor/rules/agent-core.mdc)

Add SPDX-FileCopyrightText and SPDX-License-Identifier headers to all newly added files with '// SPDX-FileCopyrightText: 2025 LiveKit, Inc.' and '// SPDX-License-Identifier: Apache-2.0'

Files:

  • agents/src/voice/audio_recognition.ts
**/*.{ts,tsx}?(test|example|spec)

📄 CodeRabbit inference engine (.cursor/rules/agent-core.mdc)

When testing inference LLM, always use full model names from agents/src/inference/models.ts (e.g., 'openai/gpt-4o-mini' instead of 'gpt-4o-mini')

Files:

  • agents/src/voice/audio_recognition.ts
**/*.{ts,tsx}?(test|example)

📄 CodeRabbit inference engine (.cursor/rules/agent-core.mdc)

Initialize logger before using any LLM functionality with initializeLogger({ pretty: true }) from '@livekit/agents'

Files:

  • agents/src/voice/audio_recognition.ts
🔇 Additional comments (12)
agents/src/voice/audio_recognition.ts (12)

1-71: LGTM!

The imports and interface extensions are well-structured. The RecognitionHooks extension with onInterruption and the new AudioRecognitionOptions fields follow the existing patterns in this file.


110-119: LGTM!

The new private fields for interruption state management are well-typed and appropriately scoped.


132-155: LGTM!

The conditional stream setup correctly handles both cases (with and without interruption detection). The 3-way tee for the interruption path and the fallback 2-way tee maintain backward compatibility.


174-197: LGTM!

The start() and stop() methods correctly manage the interruption task lifecycle. The early return in createInterruptionTask handles the case when interruption detection is not configured.


199-250: LGTM!

The agent speech lifecycle methods properly coordinate the interruption state transitions and sentinel dispatching. The use of Math.min for ignoreUserTranscriptUntil correctly handles overlapping end-of-speech events.


319-346: LGTM!

The shouldHoldSttEvent logic correctly handles edge cases: events without alternatives are held, and timestamp comparisons are guarded by inputStartedAt being defined. The initialization of inputStartedAt in createInterruptionTask provides a stable reference point.


348-366: LGTM!

The method has appropriate defensive checks and error handling. Returning a boolean for success/failure allows callers to react accordingly.


390-407: LGTM!

The interruption-aware STT event handling correctly buffers events during interruption states and flushes them when appropriate. The exemption for RECOGNITION_USAGE events is sensible.


579-583: LGTM!

The method cleanly dispatches interruption events to the registered hook.


892-906: LGTM!

The forwardTask correctly sets inputStartedAt once using nullish coalescing assignment, addressing the previous concern about timestamp drift. The async IIFE pattern with proper cleanup in finally is appropriate.


908-934: LGTM!

The event reading loop correctly uses Promise.race for cancellation support and has proper error handling that distinguishes between abort-induced and unexpected errors.
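
For readers unfamiliar with the pattern, here is a minimal sketch of racing a read against an abort signal. `ABORTED`, `ReadResult`, `whenAborted`, and `readUntilAborted` are hypothetical names; the PR's loop works on a stream reader and has its own error handling, which this sketch omits.

```typescript
// Sentinel that marks "the signal aborted" as the winner of the race.
const ABORTED = Symbol('aborted');

// Simplified read result, assumed for illustration.
type ReadResult<T> = { done: false; value: T } | { done: true };

// Resolves with the sentinel when the signal aborts (immediately if already aborted).
function whenAborted(signal: AbortSignal): Promise<typeof ABORTED> {
  return new Promise((resolve) => {
    if (signal.aborted) {
      resolve(ABORTED);
      return;
    }
    signal.addEventListener('abort', () => resolve(ABORTED), { once: true });
  });
}

// Collects values until the source is done or the signal aborts, whichever comes first.
async function readUntilAborted<T>(
  read: () => Promise<ReadResult<T>>,
  signal: AbortSignal,
): Promise<T[]> {
  const received: T[] = [];
  const aborted = whenAborted(signal);
  while (!signal.aborted) {
    const result = await Promise.race([read(), aborted]);
    if (result === ABORTED || result.done) break;
    received.push(result.value);
  }
  return received;
}
```

The abort promise is created once outside the loop so that a read which never settles cannot keep the loop blocked after cancellation.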


1001-1010: LGTM!

The close method properly cleans up interruption resources in the correct order: canceling the task before closing the channel.

Comment on lines +256 to +269
  private async flushHeldTranscripts() {
    if (
      !this.isInterruptionEnabled ||
      this.ignoreUserTranscriptUntil === undefined ||
      this.transcriptBuffer.length === 0
    ) {
      return;
    }

    if (!this.inputStartedAt) {
      this.transcriptBuffer = [];
      this.ignoreUserTranscriptUntil = undefined;
      return;
    }

⚠️ Potential issue | 🟡 Minor

Transcripts may be dropped if inputStartedAt is not yet set when flush is called.

When inputStartedAt is undefined (no audio frames processed yet), the buffer is cleared and events are lost. This could happen if onEndOfAgentSpeech is called before any audio frames flow through the interruption task.

Consider holding the events instead of dropping them, or logging a warning when this edge case occurs.

🛠️ Suggested defensive logging
     if (!this.inputStartedAt) {
+      if (this.transcriptBuffer.length > 0) {
+        this.logger.warn(
+          `flushHeldTranscripts: dropping ${this.transcriptBuffer.length} events because inputStartedAt is not set`,
+        );
+      }
       this.transcriptBuffer = [];
       this.ignoreUserTranscriptUntil = undefined;
       return;
     }
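
The alternative of holding events rather than dropping them could look roughly like this. `HeldTranscripts` is a hypothetical, simplified holder and not the PR's `AudioRecognition` class; the `warnings` array stands in for a logger.

```typescript
// Hypothetical holder; field names echo the review comment for readability only.
class HeldTranscripts<E> {
  private buffer: E[] = [];
  inputStartedAt: number | undefined;
  warnings: string[] = [];

  hold(event: E): void {
    this.buffer.push(event);
  }

  // Returns the events to emit; keeps them buffered while the base timestamp is unknown.
  flush(): E[] {
    if (this.buffer.length === 0) return [];
    if (this.inputStartedAt === undefined) {
      this.warnings.push(
        `flush deferred: ${this.buffer.length} event(s) held, inputStartedAt not set`,
      );
      return [];
    }
    const ready = this.buffer;
    this.buffer = [];
    return ready;
  }
}

const held = new HeldTranscripts<string>();
held.hold('hello');
const early = held.flush(); // nothing emitted, event stays buffered
held.inputStartedAt = 0; // a base timestamp of 0 still counts as "set"
const late = held.flush(); // the held event is emitted now
```

A caller would need to retry the flush once the base timestamp is available, which is the "queue a retry" option mentioned in the comment.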

Comment on lines +881 to +889
  const cleanup = async () => {
    try {
      signal.removeEventListener('abort', abortHandler);
      eventReader.releaseLock();
      await stream.close();
    } catch (e) {
      this.logger.debug('createInterruptionTask: error during abort handler:', e);
    }
  };

⚠️ Potential issue | 🟡 Minor

Potential TypeError if cleanup is called before eventReader is assigned.

The cleanup function references eventReader at line 884, but eventReader is declared at line 909. If signal.aborted is already true when entering the function, or if abort fires during forwardTask before line 909 executes, eventReader will be undefined and releaseLock() will throw.

🛠️ Suggested fix
   const cleanup = async () => {
     try {
       signal.removeEventListener('abort', abortHandler);
-      eventReader.releaseLock();
+      eventReader?.releaseLock();
       await stream.close();
     } catch (e) {
       this.logger.debug('createInterruptionTask: error during abort handler:', e);
     }
   };
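
A minimal sketch of the guarded cleanup, using hypothetical names (`Lockable`, `runTask`). It assumes the reader is declared with `let` before the closure. If the variable were instead declared with `const` or `let` after the closure, reading it before the declaration executes would raise a ReferenceError, which optional chaining alone would not prevent.

```typescript
// Hypothetical resource with a lock, standing in for a stream reader.
interface Lockable {
  releaseLock(): void;
}

function runTask(abortEarly: boolean): string[] {
  const log: string[] = [];
  // Declared before the closure so early access yields undefined rather than throwing.
  let reader: Lockable | undefined;

  const cleanup = () => {
    reader?.releaseLock(); // safe no-op when the reader was never acquired
    log.push('cleanup');
  };

  if (abortEarly) {
    cleanup(); // abort fired before the reader was acquired
    return log;
  }

  reader = { releaseLock: () => log.push('released') };
  cleanup();
  return log;
}

const earlyAbort = runTask(true);
const normalRun = runTask(false);
```

In the early-abort path only the cleanup step is logged; in the normal path the lock is released first.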

4 participants