Conversation
…ia clock in webrtc. Signed-off-by: BuffMcBigHuge <marco@bymar.co>
…dio. Signed-off-by: BuffMcBigHuge <marco@bymar.co>
j0sh
left a comment
Thanks, this does seem somewhat simpler than the last iteration.
I don't want to block this for the sake of shipping something if it seems to be working alright for now, but there are a couple things I don't quite understand.
- Video and audio are being paced differently. Video effectively uses wall-clock while audio is using sample counts. Is there a reason for this?
- Using wall-clock makes things susceptible to pipeline jitter.
- If audio output is a little delayed, it doesn't get a chance for another 20ms. This accumulates and will lead to desync. Conversely, audio that might be a little bursty can be unnecessarily delayed.
- Is there a reason the audio queue is non-blocking? It seems preferable to block (up to a reasonable duration, then silence can be inserted) instead of sleeping. But maybe I'm missing something about why media pulls are intended to be non-blocking.
- In general, I'd consider using input timestamps or a "reference clock" to timestamp and pace the output, rather than depending on wall-clock after the pipeline. Pipelines may produce output at different rates, and this architecture generally doesn't account for that. More on this in Discord.
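For context on the pacing point above, deriving each frame's deadline from a running sample count (a monotonic reference, as aiortc-style timestamping does) rather than sleeping "now + 20ms" means a late frame shortens the next wait instead of pushing the whole schedule back. This is an illustrative sketch, not the PR's code; all names here are hypothetical:

```python
import time

SAMPLE_RATE = 48000
SAMPLES_PER_FRAME = 960  # 20 ms at 48 kHz

def frame_deadline(start: float, samples_sent: int) -> float:
    """Absolute time at which the next frame should be emitted.

    The deadline comes from the cumulative sample count, so a late
    frame does not shift the schedule for every subsequent frame.
    """
    return start + samples_sent / SAMPLE_RATE

start = time.monotonic()
samples_sent = 0
deadlines = []
for _ in range(3):
    deadlines.append(frame_deadline(start, samples_sent))
    samples_sent += SAMPLES_PER_FRAME
# Successive deadlines are exactly 20 ms apart relative to `start`,
# regardless of how long each pipeline iteration actually took.
```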
Thank you @j0sh, let's continue in Discord.
src/scope/server/tracks.py
```python
# Interleave into buffer: [L0, R0, L1, R1, ...]
for i in range(audio_np.shape[1]):
    for ch in range(self.channels):
        self._audio_buffer.append(audio_np[ch, i])

# Serve a 20ms frame from the buffer
samples_needed = self._samples_per_frame * self.channels
if len(self._audio_buffer) >= samples_needed:
    samples = [self._audio_buffer.popleft() for _ in range(samples_needed)]
```
Do these loop over individual samples? That is probably pretty expensive; best to work on complete 20ms frames if possible. There's probably some numpy wizardry for fast interleaving and effective frame chunking.
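The follow-up commit addresses this with `np.ravel(order="F")`. As a minimal sketch of why Fortran-order ravel produces the interleaved layout in one C-level pass (the array values here are purely illustrative):

```python
import numpy as np

# Planar audio: shape (channels, samples), one row per channel.
audio_np = np.array([[0.1, 0.2, 0.3],   # L0, L1, L2
                     [0.4, 0.5, 0.6]])  # R0, R1, R2

# Fortran order walks down each column first, so columns (one sample
# across all channels) come out consecutively: [L0, R0, L1, R1, L2, R2].
interleaved = audio_np.ravel(order="F")
```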
🚀 fal.ai Preview Deployment
Testing: Connect to this preview deployment by setting the fal endpoint in your client. 🧪 E2E tests will run automatically against this deployment.
✅ E2E Tests passed
Test Artifacts: Check the workflow run for screenshots.
…ioProcessingTrack Addresses review feedback on #534. The audio buffer interleaving and frame extraction used O(n) Python loops over individual samples, which is expensive for real-time audio. Now uses np.ravel(order="F") for interleaving and numpy slicing for frame extraction. Also adds 42 tests covering interleaving, buffering, resampling, channel conversion, frame construction, and adversarial inputs. Signed-off-by: RyanOnTheInside <7623207+ryanontheinside@users.noreply.github.com>
…perf Fix AudioProcessingTrack per-sample loop performance
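The extraction side of that fix can likewise replace the per-sample `popleft()` loop with a single slice. A hedged sketch of the idea only; the buffer layout, constants, and function name are assumptions, not the actual `tracks.py` code:

```python
import numpy as np

SAMPLE_RATE = 48000
CHANNELS = 2
SAMPLES_PER_FRAME = SAMPLE_RATE * 20 // 1000  # 960 samples per 20 ms

def take_frame(buffer: np.ndarray):
    """Pop one 20 ms interleaved frame off the front of a flat buffer.

    Returns (frame, remainder), or (None, buffer) if not enough samples
    are queued yet. Slicing is O(1) views plus one copy at most, versus
    O(n) Python-level iteration with a deque.
    """
    needed = SAMPLES_PER_FRAME * CHANNELS
    if buffer.size < needed:
        return None, buffer
    return buffer[:needed], buffer[needed:]

buf = np.zeros(2500 * CHANNELS, dtype=np.float32)  # 2500 queued stereo samples
frame, buf = take_frame(buf)                        # frame: 960 * 2 = 1920 samples
```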
Audio Support for Scope (Reworked)
Overall, this approach is simpler and appears to resolve the audio quality issues. The complex gating mechanics and resampling have been removed in favor of aiortc-style WebRTC timing.
Summary
Adds end-to-end audio support to Scope's WebRTC streaming pipeline. Pipelines can return audio alongside video in their output dict; the server streams audio over WebRTC. This is a simplified rewrite of the audio path that fixes clipping and audio quality issues reported in PR #480.
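A minimal sketch of the output-dict contract described here. Only the key names come from this PR description; the shapes, dtypes, and function name are illustrative assumptions:

```python
import numpy as np

def pipeline_step():
    """Illustrative pipeline output. 'audio' is shown planar,
    shape (channels, samples); 'audio_sample_rate' tells the server
    the rate of that audio. Both audio keys are optional, so
    video-only pipelines simply omit them and are unchanged."""
    video = np.zeros((512, 512, 3), dtype=np.uint8)  # one RGB frame (assumed size)
    audio = np.zeros((2, 960), dtype=np.float32)     # 20 ms of stereo at 48 kHz
    return {"video": video, "audio": audio, "audio_sample_rate": 48000}

out = pipeline_step()
```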
What's New
Backend
{"video": ..., "audio": ..., "audio_sample_rate": ...}. Audio keys are optional; pipelines that don't produce audio are unchanged.audio_callbackinstead of a queue. Only the last processor in a chain receives the callback.audio_queuefor raw(audio_tensor, sample_rate)tuples. No background drain thread, no video-gated release, no resampling.Frontend
- The received audio track is added to the playback `MediaStream`. Adds a recvonly audio transceiver so the SDP offer includes an audio m-line for the backend to attach its track.

WebRTC Handshake
The browser calls `addTransceiver("audio", { direction: "recvonly" })` so the offer includes an audio m-line. After `setRemoteDescription`, the backend finds the audio transceiver, attaches its `AudioProcessingTrack`, and sets direction to `sendonly`. The answer then indicates that the server will send audio.

Why This Version (audio-sync-2)
PR #480 received feedback about:
This branch is a simplified rewrite that:
Architecture
Trade-offs
Files Changed
- `src/scope/server/frame_processor.py` – Simplified audio path (~185 net lines removed)
- `src/scope/server/pipeline_processor.py` – Callback-based audio delivery
- `src/scope/server/tracks.py` – Stereo `AudioProcessingTrack` with per-channel resampling
- `src/scope/server/webrtc.py` – Audio track wiring (no `MediaClock`)

Related