Python: Add realtime voice agents with OpenAI, Azure OpenAI and Voice Live clients #3821

adamdougal · 2026-02-10T19:55:19Z

Add BaseRealtimeClient abstract base and RealtimeClientProtocol with connect, disconnect, send_audio, send_text, send_tool_result, update_session, and events methods
Add AzureOpenAIRealtimeClient using the OpenAI SDK beta realtime API for Azure OpenAI
Add OpenAIRealtimeClient using the OpenAI SDK GA realtime API
Add AzureVoiceLiveClient using the Azure Voice Live SDK (azure-ai-voicelive)
Add RealtimeAgent for high-level voice agent orchestration with tool execution
Add RealtimeSessionConfig, RealtimeEvent, and related dataclass types
Add update_session() for changing session config (instructions, tools, voice) without reconnecting
Add public tool_to_schema() and execute_tool() helper functions
Add multi-agent sample demonstrating agent transfers over a single connection
Add microphone, tools, FastAPI WebSocket, and WebSocket client samples
Add agent-framework-azure-voice-live package
Add comprehensive test coverage for all three clients and the realtime agent

Fixes #728

Motivation and Context

Agent Framework currently supports text-based chat clients but has no support for realtime bidirectional voice streaming. Several LLM providers now offer WebSocket-based realtime APIs (OpenAI Realtime, Azure OpenAI Realtime, Azure Voice Live) that enable natural voice conversations with function calling, VAD, and barge-in support. This PR adds first-class support for these APIs so developers can build voice agents using the same patterns they already use for chat agents.

Description

Protocol & Base (packages/core)

RealtimeClientProtocol defines the interface all realtime clients must implement. BaseRealtimeClient provides shared logic including session config translation, a convenience as_agent() factory, and serialization support. RealtimeSessionConfig and RealtimeEvent are simple dataclasses that normalize configuration and events across providers.
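As a rough illustration of the shape described above, the sketch below declares a protocol with the seven methods listed in this PR and checks a toy in-memory client against it. The method signatures, the dataclass fields, and `EchoClient` are assumptions made for this example; the types in `_realtime_client.py` and `_realtime_types.py` may differ.

```python
from __future__ import annotations

import asyncio
from collections.abc import AsyncIterator
from dataclasses import dataclass, field
from typing import Any, Protocol, runtime_checkable


@dataclass
class RealtimeSessionConfig:
    """Provider-neutral session settings (field names are assumptions)."""

    instructions: str | None = None
    voice: str | None = None
    tools: list[dict[str, Any]] = field(default_factory=list)


@dataclass
class RealtimeEvent:
    """Normalized event, e.g. type="audio" or type="tool_call"."""

    type: str
    data: dict[str, Any] = field(default_factory=dict)


@runtime_checkable
class RealtimeClientProtocol(Protocol):
    """Structural interface; the signatures here are illustrative only."""

    async def connect(self, config: RealtimeSessionConfig) -> None: ...
    async def disconnect(self) -> None: ...
    async def send_audio(self, audio: bytes) -> None: ...
    async def send_text(self, text: str) -> None: ...
    async def send_tool_result(self, tool_call_id: str, result: str) -> None: ...
    async def update_session(self, config: RealtimeSessionConfig) -> None: ...
    def events(self) -> AsyncIterator[RealtimeEvent]: ...


class EchoClient:
    """Toy client (hypothetical) that echoes input back as normalized events."""

    def __init__(self) -> None:
        self.config: RealtimeSessionConfig | None = None
        self._events: list[RealtimeEvent] = []

    async def connect(self, config: RealtimeSessionConfig) -> None:
        self.config = config

    async def disconnect(self) -> None:
        self.config = None

    async def send_audio(self, audio: bytes) -> None:
        self._events.append(RealtimeEvent("audio", {"audio": audio}))

    async def send_text(self, text: str) -> None:
        self._events.append(RealtimeEvent("response_transcript", {"text": text}))

    async def send_tool_result(self, tool_call_id: str, result: str) -> None:
        self._events.append(RealtimeEvent("tool_result", {"id": tool_call_id, "result": result}))

    async def update_session(self, config: RealtimeSessionConfig) -> None:
        self.config = config

    async def events(self) -> AsyncIterator[RealtimeEvent]:
        # Drain whatever has been queued so far, oldest first.
        while self._events:
            yield self._events.pop(0)
```

Because the protocol is structural, `EchoClient` satisfies it without inheriting from anything, which is one reason a protocol plus an optional base class can be a convenient split.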

Client Implementations

OpenAIRealtimeClient — connects via openai.OpenAI().realtime (GA API). Translates framework events to/from the OpenAI event format. Reads credentials from OPENAI_API_KEY.
AzureOpenAIRealtimeClient — connects via openai.AzureOpenAI().realtime (beta API). Supports both API key and azure-identity credential auth. Reads settings from AzureOpenAISettings.
AzureVoiceLiveClient — connects via the azure-ai-voicelive SDK using typed model objects (UserMessageItem, InputTextContentPart, etc.). Packaged as the separate agent-framework-azure-voice-live package since it depends on the Voice Live SDK.

All three clients implement update_session() which allows changing instructions, tools, and (where supported) voice on an active connection without dropping the conversation. OpenAI and Azure OpenAI reject voice changes once assistant audio exists; AzureVoiceLiveClient supports voice changes at any time — this limitation is documented in the method docstrings.
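The voice restriction can be pictured with a small state model. This is a sketch under assumptions: `SessionState`, `apply_session_update`, and the `voice_locked_after_audio` flag are hypothetical names for illustration, not the clients' actual implementation, and the voice names are placeholders.

```python
from dataclasses import dataclass, replace
from typing import Optional


class VoiceChangeRejected(ValueError):
    """Raised when a provider no longer accepts a voice change."""


@dataclass(frozen=True)
class SessionState:
    instructions: str
    voice: str
    assistant_audio_sent: bool = False


def apply_session_update(
    state: SessionState,
    *,
    instructions: Optional[str] = None,
    voice: Optional[str] = None,
    voice_locked_after_audio: bool,
) -> SessionState:
    """Return the updated state, or raise if the voice change is not allowed."""
    if voice is not None and voice != state.voice:
        # Models the OpenAI / Azure OpenAI behaviour described above.
        if voice_locked_after_audio and state.assistant_audio_sent:
            raise VoiceChangeRejected("voice cannot change after assistant audio exists")
        state = replace(state, voice=voice)
    if instructions is not None:
        # Instructions can be swapped at any time without reconnecting.
        state = replace(state, instructions=instructions)
    return state
```

With `voice_locked_after_audio=False` the same call succeeds, which corresponds to the Voice Live behaviour described in the paragraph above.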

RealtimeAgent

RealtimeAgent wraps a BaseRealtimeClient and adds automatic tool execution. Given an audio stream, it connects the client, forwards audio, dispatches tool calls, sends results, and yields normalized RealtimeEvent objects. Public tool_to_schema() and execute_tool() functions are exported for use in custom orchestration (e.g., the multi-agent sample).
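To make the tool path concrete, here is a minimal sketch of converting a plain Python function to a function-call schema and executing it from JSON arguments. The `_sketch` helpers are stand-ins written for this example; they are not the exported `tool_to_schema()` and `execute_tool()`, whose signatures and schema layout may differ.

```python
import inspect
import json
from typing import Any, Callable

# Minimal mapping from Python annotations to JSON Schema type names.
_JSON_TYPES = {str: "string", int: "integer", float: "number", bool: "boolean"}


def tool_to_schema_sketch(func: Callable[..., Any]) -> dict:
    """Build a function-call schema from a callable's signature (illustrative)."""
    params = inspect.signature(func).parameters
    properties = {
        name: {"type": _JSON_TYPES.get(param.annotation, "string")}
        for name, param in params.items()
    }
    required = [
        name for name, param in params.items()
        if param.default is inspect.Parameter.empty
    ]
    return {
        "type": "function",
        "name": func.__name__,
        "description": inspect.getdoc(func) or "",
        "parameters": {"type": "object", "properties": properties, "required": required},
    }


def execute_tool_sketch(tools: dict, name: str, arguments_json: str) -> str:
    """Run the named tool with JSON-encoded arguments and return a string result."""
    func = tools.get(name)
    if func is None:
        return f"Error: unknown tool '{name}'"
    try:
        return str(func(**json.loads(arguments_json or "{}")))
    except Exception as exc:  # Surface failures to the model as text.
        return f"Error: {exc}"


def get_weather(city: str) -> str:
    """Return a canned weather report for a city."""
    return f"Sunny in {city}"
```

An orchestration loop would call the schema helper once per tool when configuring the session, then call the execute helper for each tool-call event and pass the string to `send_tool_result`.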

Samples (samples/getting_started/realtime/)

Sample	What it demonstrates
`realtime_with_microphone.py`	Basic voice conversation with mic/speaker
`realtime_with_tools.py`	Voice conversation with `@tool` function calling
`realtime_with_multiple_agents.py`	Multiple agents (greeter, support, assistant) transferring via a single connection using `update_session()`
`realtime_fastapi_websocket.py`	WebSocket API server for browser clients
`websocket_audio_client.py`	CLI client for the FastAPI endpoint
`audio_utils.py`	Shared mic capture and speaker playback utilities

All samples support --client-type (or REALTIME_CLIENT_TYPE env var) to switch between openai, azure_openai, and azure_voice_live.
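One way to resolve the client type is sketched below. The precedence (flag over environment variable) and the `openai` default are assumptions made for this example; the samples may resolve the value differently.

```python
import argparse
import os
from collections.abc import Mapping, Sequence
from typing import Optional

CLIENT_TYPES = ("openai", "azure_openai", "azure_voice_live")


def resolve_client_type(
    argv: Optional[Sequence[str]] = None,
    env: Optional[Mapping[str, str]] = None,
) -> str:
    """Pick the client type from --client-type, else REALTIME_CLIENT_TYPE."""
    env = os.environ if env is None else env
    parser = argparse.ArgumentParser(description="Realtime sample (sketch)")
    parser.add_argument("--client-type", choices=CLIENT_TYPES, default=None)
    args = parser.parse_args(argv)

    # Assumed precedence: explicit flag, then env var, then a default.
    value = args.client_type or env.get("REALTIME_CLIENT_TYPE", "openai")
    if value not in CLIENT_TYPES:
        raise ValueError(f"Unsupported client type: {value!r}")
    return value
```

For example, `resolve_client_type(["--client-type", "azure_openai"])` returns `"azure_openai"` regardless of the environment.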

Tests

Unit tests for all three client implementations with mocked provider SDKs
Tests for RealtimeAgent tool dispatch, event forwarding, and error handling
Tests for RealtimeSessionConfig and RealtimeEvent types
Tests for AzureVoiceLiveSettings configuration
Tests for Azure OpenAI realtime settings integration

Contribution Checklist

The code builds clean without any errors or warnings
The PR follows the Contribution Guidelines
All unit tests pass, and I have added new tests where possible
Is this a breaking change? No — all additions are new APIs.

markwallace-microsoft · 2026-02-10T19:58:10Z

Python Test Coverage Report

File	Stmts	Miss	Cover	Missing
packages/core/agent_framework
_realtime_agent.py	101	6	94%	83–84, 170, 176–177, 220
_realtime_client.py	56	1	98%	118
_realtime_types.py	16	0	100%
packages/core/agent_framework/azure
_realtime_client.py	121	70	42%	154–155, 168, 226–232, 240–241, 253–254, 262, 270–271, 279, 287–288, 290–293, 304, 306–312, 319–320, 327–338, 347–348, 355–360, 367–373, 377–381, 385–386, 388–389
_shared.py	81	3	96%	144–145, 236
packages/core/agent_framework/openai
_realtime_client.py	124	54	56%	91–92, 251–252, 254–257, 268, 270–276, 283–284, 291–302, 311–312, 319–324, 331–337, 341–345, 349–350, 352–353
TOTAL	17206	2235	87%

Python Unit Test Overview

Tests	Skipped	Failures	Errors	Time
4136	225 💤	0 ❌	0 🔥	1m 12s ⏱️

adamdougal · 2026-02-10T20:01:54Z

Hello!

I appreciate this is a large PR, so I'm not expecting a detailed review quickly. However, I'd love an indication of how likely this is to be merged and, if so, what the timelines are.

I've based the implementation on the draft ADR here.

Also, it looks like the samples check is failing. Is it not possible to add samples in this way until the framework code is merged?

Thanks

Copilot

Pull request overview

Adds first-class realtime bidirectional voice support to the Python Agent Framework, including normalized realtime types, a base/protocol abstraction, three provider clients (OpenAI, Azure OpenAI, Azure Voice Live), a high-level RealtimeAgent, plus samples and tests.

Changes:

Introduces realtime core abstractions (RealtimeClientProtocol, BaseRealtimeClient, RealtimeEvent, RealtimeSessionConfig) and RealtimeAgent orchestration.
Adds provider implementations: OpenAIRealtimeClient, AzureOpenAIRealtimeClient, and new package agent-framework-azure-voice-live (Voice Live SDK).
Adds realtime getting-started samples (mic, tools, multi-agent transfer, FastAPI websocket bridge, websocket client) and accompanying tests.

Reviewed changes

Copilot reviewed 32 out of 35 changed files in this pull request and generated 20 comments.

File	Description
python/uv.lock	Adds workspace package + Voice Live SDK dependency lock entries.
python/pyproject.toml	Registers `agent-framework-azure-voice-live` as a workspace member dependency.
python/samples/getting_started/realtime/__init__.py	Marks realtime samples folder as a package.
python/samples/getting_started/realtime/README.md	Documentation for running realtime voice samples and configuration.
python/samples/getting_started/realtime/audio_utils.py	Shared mic capture / speaker playback utilities for samples.
python/samples/getting_started/realtime/realtime_fastapi_websocket.py	FastAPI WebSocket bridge sample for browser/clients.
python/samples/getting_started/realtime/realtime_with_microphone.py	Basic mic-to-realtime-agent sample.
python/samples/getting_started/realtime/realtime_with_multiple_agents.py	Demonstrates agent transfer via `update_session()` on one connection.
python/samples/getting_started/realtime/realtime_with_tools.py	Demonstrates realtime tool-calling during voice conversation.
python/samples/getting_started/realtime/websocket_audio_client.py	CLI websocket client sample that streams mic audio and plays responses.
python/packages/core/agent_framework/__init__.py	Exports realtime APIs from the top-level `agent_framework` package.
python/packages/core/agent_framework/_realtime_agent.py	Implements `RealtimeAgent`, plus `tool_to_schema()` / `execute_tool()` helpers.
python/packages/core/agent_framework/_realtime_client.py	Defines `RealtimeClientProtocol` and `BaseRealtimeClient`.
python/packages/core/agent_framework/_realtime_types.py	Adds normalized dataclasses for session config and events.
python/packages/core/agent_framework/openai/__init__.py	Exposes `OpenAIRealtimeClient` from the OpenAI module.
python/packages/core/agent_framework/openai/_realtime_client.py	OpenAI GA realtime implementation + event normalization.
python/packages/core/agent_framework/azure/__init__.py	Adds lazy exports for Azure realtime + Voice Live package types.
python/packages/core/agent_framework/azure/_realtime_client.py	Azure OpenAI realtime implementation + event normalization.
python/packages/core/agent_framework/azure/_shared.py	Adds `realtime_deployment_name` to `AzureOpenAISettings`.
python/packages/core/tests/core/test_realtime_agent.py	Validates tool execution, event forwarding, and transcript/thread behavior.
python/packages/core/tests/core/test_realtime_client.py	Protocol/base tests and config translation tests.
python/packages/core/tests/core/test_realtime_types.py	Tests for realtime dataclasses and exports.
python/packages/core/tests/openai/test_openai_realtime_client.py	Tests OpenAI realtime client behavior with mocked SDK.
python/packages/core/tests/azure/test_azure_realtime_client.py	Tests Azure OpenAI realtime client behavior with mocked SDK.
python/packages/core/tests/azure/test_azure_openai_settings_realtime.py	Tests new `realtime_deployment_name` setting integration.
python/packages/azure-ai-voice-live/pyproject.toml	New distributable package for Voice Live integration.
python/packages/azure-ai-voice-live/README.md	Usage docs for the Voice Live integration package.
python/packages/azure-ai-voice-live/LICENSE	Package license.
python/packages/azure-ai-voice-live/agent_framework_azure_voice_live/__init__.py	Package exports and version discovery.
python/packages/azure-ai-voice-live/agent_framework_azure_voice_live/_client.py	Voice Live realtime client implementation.
python/packages/azure-ai-voice-live/agent_framework_azure_voice_live/_settings.py	Settings model for Voice Live client.
python/packages/azure-ai-voice-live/agent_framework_azure_voice_live/py.typed	Marks package as typed.
python/packages/azure-ai-voice-live/tests/test_client.py	Unit tests for Voice Live client normalization + connect flows.
python/packages/azure-ai-voice-live/tests/test_settings.py	Unit tests for Voice Live settings env/constructor behavior.
python/packages/azure-ai-voice-live/tests/__init__.py	Test package marker.

python/packages/core/agent_framework/openai/_realtime_client.py

python/packages/core/agent_framework/azure/_realtime_client.py

python/samples/getting_started/realtime/realtime_with_tools.py

python/packages/core/tests/core/test_realtime_agent.py

python/packages/core/agent_framework/_realtime_agent.py

python/samples/getting_started/realtime/realtime_fastapi_websocket.py

python/samples/getting_started/realtime/realtime_with_multiple_agents.py

python/samples/getting_started/realtime/websocket_audio_client.py

…and Voice Live clients

- Add BaseRealtimeClient protocol with connect, disconnect, send_audio, send_text, send_tool_result, update_session, and events methods
- Add AzureOpenAIRealtimeClient using the OpenAI SDK beta realtime API
- Add OpenAIRealtimeClient using the OpenAI SDK GA realtime API
- Add AzureVoiceLiveClient using the Azure Voice Live SDK
- Add RealtimeAgent for high-level voice agent orchestration
- Add RealtimeSessionConfig, RealtimeEvent, and related types
- Add update_session() for changing session config without reconnecting
- Add public tool_to_schema() and execute_tool() helper functions
- Add multi-agent sample demonstrating agent transfers via single connection
- Add single-agent and bidirectional audio samples
- Add comprehensive test coverage for all clients and agent

- Set _connected = True after successful connect()
- Set _connected = False at the start of disconnect()
- Applied to both OpenAIRealtimeClient and AzureOpenAIRealtimeClient
- AzureVoiceLiveClient already tracked this correctly

Copilot

Pull request overview

Copilot reviewed 32 out of 35 changed files in this pull request and generated 3 comments.

python/samples/getting_started/realtime/realtime_with_tools.py

python/packages/azure-ai-voice-live/README.md

Copilot AI review requested due to automatic review settings February 10, 2026 19:55

markwallace-microsoft added the documentation and python labels Feb 10, 2026

adamdougal changed the title ~~feat(realtime): add realtime voice agents with OpenAI, Azure OpenAI and Voice Live clients~~ Python: add realtime voice agents with OpenAI, Azure OpenAI and Voice Live clients Feb 10, 2026

adamdougal changed the title ~~Python: add realtime voice agents with OpenAI, Azure OpenAI and Voice Live clients~~ Python: Add realtime voice agents with OpenAI, Azure OpenAI and Voice Live clients Feb 10, 2026

github-actions bot changed the title ~~Python: Add realtime voice agents with OpenAI, Azure OpenAI and Voice Live clients~~ Python: feat(realtime): add realtime voice agents with OpenAI, Azure OpenAI and Voice Live clients Feb 10, 2026

Copilot started reviewing on behalf of adamdougal February 10, 2026 19:56

adamdougal changed the title ~~Python: feat(realtime): add realtime voice agents with OpenAI, Azure OpenAI and Voice Live clients~~ Python: Add realtime voice agents with OpenAI, Azure OpenAI and Voice Live clients Feb 10, 2026

Copilot AI reviewed Feb 10, 2026

adamdougal added 17 commits February 11, 2026 10:46

fix(realtime): clear pending function names on disconnect

ff7c2cf

- Clear _pending_function_names in disconnect() for both OpenAI and Azure OpenAI clients, to avoid leaking per-connection state across reconnects
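A minimal sketch of the per-connection state handling this commit describes, together with the `_connected` flag from the earlier fix. `ReconnectableClientSketch` and `note_function_call` are hypothetical names; only `_connected` and `_pending_function_names` come from the commit messages.

```python
import asyncio


class ReconnectableClientSketch:
    """Illustrative client that resets per-connection state on disconnect."""

    def __init__(self) -> None:
        self._connected = False
        # Maps call id -> function name for tool calls still in flight.
        self._pending_function_names: dict[str, str] = {}

    async def connect(self) -> None:
        # A real client would open the WebSocket first; the flag is set
        # only after that succeeds.
        self._connected = True

    def note_function_call(self, call_id: str, name: str) -> None:
        self._pending_function_names[call_id] = name

    async def disconnect(self) -> None:
        # Flip the flag first so concurrent senders stop early, then drop
        # state that only made sense for the old connection.
        self._connected = False
        self._pending_function_names.clear()
```

Without the `clear()` call, a tool call that was in flight when the socket closed would still be listed after the next `connect()`.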

chore: remove calculate tool from realtime_with_tools sample

d41bffd

- Two remaining tools (get_weather, get_time) are sufficient
- Removes eval()-based code that was a copy/paste risk

fix: preserve chronological message order in RealtimeAgent

6adf02a

- Collect all messages in a single list instead of grouping by role
- Store messages in transcript order (user, assistant, user, assistant)
- Update test assertions to match chronological ordering
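The ordering change can be illustrated as follows. The event type names `input_transcript` and `response_transcript` appear elsewhere in this PR; `TranscriptMessage` and `collect_messages` are hypothetical and stand in for whatever `RealtimeAgent` does internally.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class TranscriptMessage:
    role: str
    text: str


def collect_messages(events: Iterable[tuple]) -> list:
    """Append transcripts to one list so arrival order is preserved."""
    messages = []
    for event_type, text in events:
        if event_type == "input_transcript":
            messages.append(TranscriptMessage("user", text))
        elif event_type == "response_transcript":
            messages.append(TranscriptMessage("assistant", text))
        # Other event types (audio, tool calls, ...) carry no transcript.
    return messages
```

Grouping by role instead would put every user turn before every assistant turn, which loses the turn-by-turn structure of the conversation.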

chore: remove calculate tool from multi-agent realtime sample

fb56e1c

- Removes eval()-based calculate tool (same as realtime_with_tools)
- Two remaining tools (get_weather, get_time) are sufficient

fix: bound audio queue in VoiceSession to prevent unbounded memory gr…

3270ffd

…owth

- Set maxsize=100 on asyncio.Queue to apply backpressure
- Prevents memory growth if client sends audio faster than consumed
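The backpressure effect of a bounded `asyncio.Queue` can be seen in isolation with the sketch below. It is not the sample's `VoiceSession` code; `demo_backpressure` is a hypothetical helper written for this example.

```python
import asyncio


async def demo_backpressure(maxsize: int = 100) -> tuple:
    """Fill a bounded queue, then show that the next put() has to wait."""
    queue: asyncio.Queue = asyncio.Queue(maxsize=maxsize)
    for _ in range(maxsize):
        queue.put_nowait(b"\x00")  # Pretend these are audio chunks.

    # The queue is full, so this producer suspends instead of growing memory.
    producer = asyncio.create_task(queue.put(b"\x01"))
    await asyncio.sleep(0)  # Give the producer a chance to run.
    producer_was_blocked = not producer.done()

    queue.get_nowait()  # The consumer frees one slot...
    await producer      # ...which lets the waiting producer finish.
    return producer_was_blocked, queue.qsize()
```

A producer that uses `await queue.put(...)` is slowed to the consumer's pace; one that uses `put_nowait()` would get `asyncio.QueueFull` instead and must decide whether to drop the chunk.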

fix: pass api_version to Voice Live SDK connect call

9e3c917

- Thread stored _api_version through to vl_connect()
- The SDK supports it; previously the setting was accepted but ignored
- Update existing connect tests to assert default api_version
- Add test for custom api_version forwarding

test: add missing event types to realtime types test

0b07834

- Add input_transcript, response_transcript, and tool_result
- Reflects the full normalized event surface produced by clients

fix: return clear error for non-FunctionTool in realtime execute_tool

084bb43

- Non-FunctionTool entries now return an explicit error message
- Previously fell back to str(tool), sending misleading results to model
- Add test covering non-FunctionTool execution path

fix: remove unnecessary name assignment in websocket_audio_client

3acd603

- Rename to tool_name in tool_call branch to avoid shadowing builtin
- Remove unused name assignment in tool_result branch

fix: consolidate contextlib imports in fastapi websocket sample

b9c51e9

- Use single 'from contextlib import' instead of mixing import styles
- Replace contextlib.suppress with suppress throughout

fix: replace bare pass comment with debug log in _send_audio_loop

4487ae4

- Add logger to _realtime_agent.py using project get_logger convention
- Log cancellation at debug level instead of silent pass
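A sketch of the pattern, assuming a standard-library logger rather than the project's `get_logger` helper. `send_audio_loop` and `demo_cancel` are hypothetical, and re-raising after logging is a choice made for this example; the actual `_send_audio_loop` may handle cancellation differently.

```python
import asyncio
import logging
from collections.abc import AsyncIterator, Awaitable, Callable

logger = logging.getLogger("realtime_sketch")  # Hypothetical logger name.


async def send_audio_loop(
    chunks: AsyncIterator[bytes],
    send: Callable[[bytes], Awaitable[None]],
) -> None:
    """Forward audio chunks until the stream ends or the task is cancelled."""
    try:
        async for chunk in chunks:
            await send(chunk)
    except asyncio.CancelledError:
        # Leave a trace of the cancellation instead of a bare `pass`.
        logger.debug("Audio send loop cancelled")
        raise  # Assumption: propagate so the caller sees the cancellation.


async def demo_cancel() -> bool:
    async def silence() -> AsyncIterator[bytes]:
        while True:
            await asyncio.sleep(0.01)
            yield b"\x00" * 320

    async def send(chunk: bytes) -> None:
        return None

    task = asyncio.create_task(send_audio_loop(silence(), send))
    await asyncio.sleep(0.05)
    task.cancel()
    try:
        await task
    except asyncio.CancelledError:
        pass
    return task.cancelled()
```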

fix: replace bare except pass with debug logging in realtime samples

633c490

- Add logging to realtime_fastapi_websocket, realtime_with_multiple_agents, and websocket_audio_client samples
- Log cancellation/disconnect at debug level instead of silent pass

chore: add PEP 723 inline script metadata to realtime samples

07a8603

- Declare external dependencies (pyaudio, fastapi, uvicorn, websockets)
- Makes samples self-contained and runnable with uv run
- Follows SAMPLE_GUIDELINES.md conventions
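For reference, a PEP 723 inline metadata block sits in comments at the top of a script and looks roughly like the sketch below. The `requires-python` bound and the grouping of all four packages into one script are assumptions for this example; each sample presumably declares only what it imports.

```python
# /// script
# requires-python = ">=3.10"
# dependencies = [
#     "pyaudio",
#     "fastapi",
#     "uvicorn",
#     "websockets",
# ]
# ///

# The rest of the sample follows as ordinary Python. Running the file with
# `uv run <script>.py` lets uv read the block above and provision the
# listed dependencies before executing the script.
```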

fix: handle missing id in tool_call events gracefully

d75fc6a

- Use get("id") instead of ["id"] to avoid KeyError
- Log error and emit error event when id is missing
- Skip send_tool_result when no id is available
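The defensive handling described here might look like the following sketch. `handle_tool_call`, the event dictionary layout, and the returned event shapes are assumptions for illustration, not the code in `_realtime_agent.py`.

```python
import logging
from typing import Callable

logger = logging.getLogger("realtime_sketch")  # Hypothetical logger name.


def handle_tool_call(event_data: dict, run_tool: Callable[[str, str], str]) -> list:
    """Return the events to emit for one tool_call event."""
    call_id = event_data.get("id")  # .get() avoids a KeyError on malformed events.
    if not call_id:
        logger.error("tool_call event is missing an id: %r", event_data)
        # Without an id there is nothing to attach a result to, so the tool
        # is not run and no tool result is sent.
        return [{"type": "error", "message": "tool_call event is missing an id"}]

    result = run_tool(event_data.get("name", ""), event_data.get("arguments", "{}"))
    return [{"type": "tool_result", "id": call_id, "result": result}]
```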

fix: update imports after main merge (ToolProtocol → FunctionTool, Ch…

c6700a6

…atMessage → Message)

- Replace ToolProtocol with FunctionTool in _realtime_agent.py and _realtime_client.py
- Replace ChatMessage with Message in _realtime_agent.py

adamdougal force-pushed the adam/realtime branch from f5cafd1 to c6700a6 Compare February 11, 2026 10:51

adamdougal requested a review from Copilot February 11, 2026 11:05

Copilot started reviewing on behalf of adamdougal February 11, 2026 11:06

Copilot AI reviewed Feb 11, 2026

python/samples/getting_started/realtime/realtime_with_tools.py (outdated, resolved)

python/samples/getting_started/realtime/realtime_with_tools.py (outdated, resolved)

python/packages/azure-ai-voice-live/README.md (outdated, resolved)

adamdougal and others added 3 commits February 11, 2026 11:13

Fix samples link for realtime

fe36453

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

Remove mention of maths questions

f1f9eed

fix: align tool call output formatting in realtime_with_tools sample

88f0ed6

- Use indented single-line format consistent with realtime_with_multiple_agents
