
Use AudioEngine ADM for PlatformAudio on Apple platforms #1215

Draft
hiroshihorie wants to merge 4 commits into hiroshi/adm-proxy-worker-thread from hiroshi/audioengine-platformaudio

hiroshihorie (Member) commented Jul 2, 2026

Note

Stacked on #1212 (worker-thread-affine AdmProxy) — review only the top 3 commits. Retarget to main after #1212 merges.

What

Switches the platform ADM on iOS and macOS to the AVAudioEngine based AudioEngineDevice and plumbs its platform voice processing API through to PlatformAudio:

  • AdmProxy now creates the platform ADM with the kAppleAudioEngine audio layer on Apple platforms (kPlatformDefaultAudio elsewhere, Android unchanged). Creation stays in the proxy constructor on the worker thread, which the AudioEngine device binds its sequence checker to; this is the contract established by #1212 (Make AdmProxy worker-thread-affine).
  • New platform voice processing surface, forwarded through AdmProxy, AudioDeviceController, PeerConnectionFactory, and LkRuntime: processing topology, voice processing path availability/toggle, and processing state.
  • configure_platform_audio_processing picks the strategy from the ADM's reported topology:
    • Coupled (Apple): AEC and NS share one voice processing path; it is enabled only when both are requested with hardware preferred, otherwise disabled with fallback to WebRTC software processing.
    • Independent (other platforms): builtin AEC/AGC/NS toggled individually, as before.
  • PlatformAudio::configure_audio_processing now makes a single native call instead of toggling builtin effects from Rust; docs updated from VPIO wording to the AudioEngine ADM.
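
The topology-based strategy described above can be sketched as a small pure function. This is a minimal illustration, not the code in this PR: every type and function name below (ProcessingTopology, ProcessingRequest, ProcessingPlan, plan_platform_audio_processing) is hypothetical, and the Independent branch assumes a builtin effect is enabled only when it is requested and hardware processing is preferred.

```rust
/// How the platform ADM exposes its audio effects (hypothetical names).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ProcessingTopology {
    /// AEC and NS share one voice processing path (Apple).
    Coupled,
    /// Builtin AEC/AGC/NS can be toggled one by one (other platforms).
    Independent,
}

/// What the caller asked for (hypothetical shape).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct ProcessingRequest {
    pub echo_cancellation: bool,
    pub auto_gain_control: bool,
    pub noise_suppression: bool,
    pub prefer_hardware_processing: bool,
}

/// The action to take on the platform ADM (hypothetical shape).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ProcessingPlan {
    /// Turn the single coupled voice processing path on or off.
    /// When it is off, WebRTC software processing is the fallback.
    VoiceProcessingPath { enabled: bool },
    /// Toggle each builtin effect on its own.
    BuiltinEffects { aec: bool, agc: bool, ns: bool },
}

pub fn plan_platform_audio_processing(
    topology: ProcessingTopology,
    request: ProcessingRequest,
) -> ProcessingPlan {
    let hw = request.prefer_hardware_processing;
    match topology {
        // Coupled: the path is enabled only when both AEC and NS are
        // requested with hardware preferred; otherwise it is disabled.
        ProcessingTopology::Coupled => ProcessingPlan::VoiceProcessingPath {
            enabled: hw && request.echo_cancellation && request.noise_suppression,
        },
        // Independent: each builtin effect follows its own request flag.
        // Assumption: a builtin effect is used only when hardware is preferred.
        ProcessingTopology::Independent => ProcessingPlan::BuiltinEffects {
            aec: hw && request.echo_cancellation,
            agc: hw && request.auto_gain_control,
            ns: hw && request.noise_suppression,
        },
    }
}
```

With this shape, a Coupled request for AEC without NS yields VoiceProcessingPath { enabled: false }, which is the case the earlier independent enable_builtin_* plumbing could not express.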

Why

The AudioEngine ADM replaces the AudioUnit/CoreAudio-HAL default on Apple platforms, bringing runtime switchable voice processing (no engine rebuild to toggle AEC), proper device change handling, and parity with the Swift SDK's audio stack. Apple exposes AEC+NS as one coupled path, which the previous independent enable_builtin_* plumbing could not represent.

Dependencies — why CI is red

This PR cannot build against the currently pinned webrtc prebuilt (webrtc-51ef663): kAppleAudioEngine and the platform voice processing API only exist in newer webrtc. The chain to green:

  1. webrtc-sdk/webrtc#260 (Expose C++ factory for the AudioEngine audio device module) merges, exposing CreateAudioEngineDeviceModule / kAppleAudioEngine, ideally together with webrtc-sdk/webrtc#261 (Fix macOS input device selection without voice processing in AudioEngineDevice; see Testing below).
  2. A new prebuilt is built and published from it.
  3. A WEBRTC_TAG bump commit is added here (webrtc-sys/build/src/lib.rs).

Until then this is compile-verified against the webrtc-sdk source tree headers (clang -fsyntax-only for the touched C++) rather than by CI.

Testing

  • Draft until the prebuilt exists. Then, on macOS, cargo run -p livekit --example platform_audio (added in #1212, Make AdmProxy worker-thread-affine) exercises the full control plane against the AudioEngine ADM unchanged: lifecycle, device enumeration/selection/hot-swap, recording, AEC/AGC/NS reconfiguration including the coupled voice processing path, concurrent access, and runtime teardown cycles.
  • Live-room check: confirm mic capture and speaker playout, and that configure_audio_processing(prefer_hardware_processing: true) activates the Apple voice processing path (as reported by GetPlatformAudioProcessingState).

E2E test results (local webrtc build via LK_CUSTOM_WEBRTC)

Verified on macOS arm64 against a local libwebrtc built from webrtc-sdk hiroshi/audioengine-platformaudio:

  • Compiles and links; AudioEngineDevice confirmed active (not a fallback): after configure_audio_processing(prefer_hardware_processing: true), builtin AEC/AGC/NS report available and active types flip to Hardware — the coupled Apple voice processing path enables end to end through AdmProxy on the worker thread.
  • platform_audio exerciser: lifecycle, enumeration, recording, processing reconfiguration, 16-thread concurrent hammering, and full runtime teardown cycles all pass.
  • Pre-existing AudioEngineDevice bug found by the exerciser (not caused by this PR): on macOS with voice processing disabled (prefer_hardware_processing: false, the macOS default), selecting a non-default recording device failed at InitRecording, because the engine's input and output nodes share one HAL I/O unit in that mode. This is fixed in webrtc-sdk/webrtc#261 (Fix macOS input device selection without voice processing in AudioEngineDevice) via a private aggregate device; with that fix in the local build, device switching works in both processing modes and the full exerciser passes all phases against the AudioEngine ADM. The prebuilt for this PR should therefore include webrtc-sdk/webrtc#261.

Select the new kAppleAudioEngine audio layer when creating the platform
ADM on iOS and macOS. The AVAudioEngine based ADM supports runtime
switchable voice processing and device change handling. Construction
happens in the proxy constructor on the worker thread, which the
AudioEngine device binds its sequence checker to.

Also forward the platform voice processing interface (topology, path
availability and toggle, processing state) to the platform ADM so the
coupled Apple AEC+NS path is reachable through the proxy.

Requires a libwebrtc build that includes CreateAudioEngineDeviceModule.

Apple's AudioEngine ADM exposes AEC and NS through one coupled voice
processing path rather than independent toggles. Expose the platform
audio processing topology and voice processing path controls through
AudioDeviceController, PeerConnectionFactory and LkRuntime, and add
configure_platform_audio_processing which picks the right strategy from
the reported topology:

- Coupled topology (Apple): enable the platform voice processing path
  only when both AEC and NS are requested with hardware preferred,
  otherwise disable it and fall back to WebRTC software processing.
- Independent topology: toggle builtin AEC/AGC/NS individually as
  before.

PlatformAudio::configure_audio_processing now makes a single native
call instead of toggling builtin effects from Rust, and the docs are
updated from VPIO wording to the AudioEngine ADM.

The AudioSourceInterface hunk stopped applying after the fork added
SetOptions right where the patch expected the class to end. The build
script tolerates patch failures (git apply || true), so building a
prebuilt from current m144 would have silently produced a libwebrtc
without external audio source support. Content is unchanged, only the
surrounding context is regenerated against m144.
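
To illustrate the failure mode described above, here is a minimal sketch of a patch step that surfaces a failed git apply instead of masking it with || true. It is an assumption for illustration only: the helper name apply_patch_strict is hypothetical, and the real build script is not written this way.

```rust
use std::path::Path;
use std::process::Command;

/// Runs `git apply <patch>` inside `repo_dir` and returns an error when the
/// patch does not apply, rather than ignoring the exit status.
/// Hypothetical helper, not part of the actual build script.
pub fn apply_patch_strict(repo_dir: &Path, patch: &Path) -> Result<(), String> {
    let status = Command::new("git")
        .arg("apply")
        .arg(patch)
        .current_dir(repo_dir)
        .status()
        .map_err(|e| format!("failed to run git: {e}"))?;

    if status.success() {
        Ok(())
    } else {
        // A hunk that no longer matches its context ends up here, so the
        // build stops instead of producing a library without the patch.
        Err(format!("git apply failed for {} ({status})", patch.display()))
    }
}
```

A caller would propagate the error so the build aborts, which makes a stale hunk such as the AudioSourceInterface one visible immediately.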