Skip to content

feat(webrtc): add WebRTC transport scaffolding (spec compliance + tests)#1309

Merged
acul71 merged 43 commits into
libp2p:mainfrom
yashksaini-coder:webrtc
Jun 28, 2026
Merged

feat(webrtc): add WebRTC transport scaffolding (spec compliance + tests)#1309
acul71 merged 43 commits into
libp2p:mainfrom
yashksaini-coder:webrtc

Conversation

@yashksaini-coder

@yashksaini-coder yashksaini-coder commented Apr 13, 2026

Copy link
Copy Markdown
Contributor

Summary

Adds experimental WebRTC transport scaffolding (libp2p.transport.webrtc) per the libp2p WebRTC and WebRTC Direct specs. Enabled via the optional libp2p[webrtc] extra (aiortc) and the enable_webrtc opt-in.

This is a v1 node-to-node foundation, not a production-ready WebRTC stack.

Refs #546.

What works (node-to-node)

  • WebRTCDirectTransport (/webrtc-direct) and WebRTCPrivateTransport (/webrtc via Circuit Relay v2).
  • Spec-compliant data-channel framing — uvarint length-prefixed protobuf, full FIN / FIN_ACK / STOP_SENDING / RESET state machine.
  • In-band data channels for application streams; the Noise channel (id=0) remains pre-negotiated per spec.
  • Signaling with bilateral ICE_DONE (webrtc: race condition during connection negotiation specs#585 fix).
  • Noise XX prologue binding the handshake to DTLS certificate fingerprints.
  • SDP builder with an isolated _apply_ice_credentials() seam for WebRTC-Direct: Chrome may be removing ICE credential munging specs#672.
  • ECDSA P-256 cert generation with multihash / multibase fingerprint encoding; DTLS cert pinned to RTCPeerConnection via aiortc's name-mangled private slot so the advertised /certhash/ matches the actual handshake.
  • Trio ↔ asyncio bridge for aiortc.

Out of scope for this PR

  • Browser interop is explicitly NOT supported. v1 SDP munging on the browser side is being phased out by Chrome's WebRTC-NoSdpMangleUfrag field trial; browser dial will land on v2 (docs: webrtc-direct-v2 specs#715).
  • Interop with go-libp2p / js-libp2p WebRTC Direct dialers. Their listener reconstructs the offer from the inbound STUN USERNAME; ours currently uses an HTTP POST /sdp exchange, documented as a py-to-py temporary harness.
  • Private /webrtc dial — only the listener / signaling skeleton lands here.
  • Full inbound handler wiring (Noise + handler invocation) on the WebRTCDirectListener — pending an aiortc STUN USERNAME exposure spike (sub-issue of Add Support for WebRTC set-up in py-libp2p #546, to be filed).

Recent additions (review push)

Following the maintainer review the following critical-path fixes landed in 4 commits on top of the existing scaffolding:

  1. feat(webrtc): spec-compliant uvarint length-prefixed data-channel framing — each data-channel message is now uvarint-prefixed protobuf within the 16 KiB SCTP cap. Without this, the wire format was incompatible with js / go-libp2p (bare protobuf, no length prefix; full-payload sends exceeded the spec ceiling).
  2. fix(webrtc): in-band data channels with DCEP open wait — application streams now use in-band channels so the peer's datachannel event fires (previously accept_stream() blocked forever because pre-negotiated channels skip that event); writes await SCTP DCEP open before send, with a 30s timeout instead of an indefinite hang.
  3. fix(webrtc): pin DTLS certificate via aiortc's mangled private slot — pre-push code review caught that the obvious pc._certificates = [...] is a silent no-op (aiortc reads only self.__certificates, which mangles to pc._RTCPeerConnection__certificates). Pinning the mangled slot makes the advertised /certhash/ match the actual DTLS handshake; a canary test now asserts the pinned cert's fingerprint lands in the SDP.
  4. test+docs(webrtc): two-PC loopback + experimental v1 newsfragment — real RTCPeerConnectionRTCPeerConnection echo test exercising framing, multiplexed streams, and the SDP-fingerprint canary; newsfragment reframed to make the experimental v1 scope explicit.

Testing

153 / 153 tests passing, including:

  • tests/core/transport/webrtc/test_multiplexing_loopback.py — real two-RTCPeerConnection ICE → DCEP → multiplexed stream → > 16 KiB round-trip echo (both directions), concurrent streams, and a SDP-fingerprint canary.
  • Full unit coverage across the framing, signaling, bridge, certificate, multiaddr, SDP, Noise prologue, connection, and stream layers.

References

Wire format for WebRTC data-channel messages per the libp2p spec:
Message { Flag flag, bytes message } with FIN/STOP_SENDING/RESET/FIN_ACK.

Constants include protocol codes (0x0118, 0x0119, 0x01D2), message size
limits, data-channel ID allocation rules, Noise prologue prefix, and
ICE/DTLS defaults matching go-libp2p.
ECDSA P-256 self-signed certificate generation with SHA-256 fingerprint
encoding as multihash/multibase for /webrtc-direct/certhash/<hash> multiaddrs.

Registers webrtc-direct (0x0118), webrtc (0x0119) and certhash (0x01D2)
protocol codes in py-multiaddr's registry at import time.

Adds aiortc>=1.5.0 as an optional dependency under [project.optional-dependencies].
Runs a single asyncio event loop in a background daemon thread. Trio
tasks schedule aiortc coroutines via run_coro() which uses
trio.to_thread.run_sync with abandon_on_cancel=True for proper
cancellation propagation.

Thread-safe state management via threading.Lock, clean shutdown with
task cancellation, and fire-and-forget scheduling for event callbacks.
SDPBuilder constructs SDP offer/answer from multiaddr components for
WebRTC Direct. All ICE credential injection isolated in
_apply_ice_credentials() as a single seam for specs#672 (Chrome ICE
credential munging deprecation).

WebRTCTransportConfig dataclass with ICE timeouts, connection limits,
and optional certificate field matching go-libp2p defaults.
WebRTCStream implements IMuxedStream over a single data channel with
protobuf framing and the FIN/FIN_ACK/STOP_SENDING/RESET state machine.
Data enqueued before flags to handle FIN+payload messages correctly.

WebRTCConnection implements both IRawConnection and IMuxedConn (same
dual-interface pattern as QUICConnection). Even channel IDs for outbound
streams starting at 2, odd for inbound starting at 1, ID 0 reserved
for Noise handshake. Atomic ID allocation under threading.Lock.
Noise XX over data channel 0 with prologue constructed from both peers'
DTLS certificate fingerprints: b'libp2p-webrtc-noise:' + mh(local) + mh(remote).

Adds prologue parameter to PatternXX and BasePattern.create_noise_state()
so the Noise session is cryptographically bound to the DTLS transport.
DataChannelReadWriter wraps send/recv callbacks as IRawConnection for
the existing Noise handshake machinery.
WebRTCDirectTransport implements ITransport with provides_native_muxing=True.
Dial parses /webrtc-direct multiaddr, constructs SDP from certhash, and
creates WebRTCConnection. Bridge initialization is concurrency-safe via
trio.Lock.

WebRTCDirectListener binds UDP and publishes multiaddr with certhash and
peer ID. Package __init__.py exports the full public API.
Replace 4 isinstance(QUICTransport) checks in swarm.py with
transport.provides_native_muxing property. Add provides_native_muxing
to ITransport ABC (default False), override True in QUICTransport.

Register webrtc-direct and webrtc transports in TransportRegistry
(lazy-loaded, ImportError-safe for when aiortc is not installed).
Implements /webrtc-signaling/0.0.1 for SDP and ICE candidate exchange
over Circuit Relay v2 streams. Bilateral ICE_DONE mechanism (specs#585)
ensures neither side closes the signaling stream before the other has
received all candidates.

WebRTCPrivateTransport handles /p2p-circuit/webrtc/ multiaddrs.
WebRTCPrivateListener registers a stream handler for incoming signaling.
Covers protobuf Message round-trip with all 4 flags, FIN=0 wire
presence (optional field regression guard), certificate ECDSA P-256
generation, SHA-256 fingerprint multihash/multibase encoding round-trip,
and multiaddr detection/parsing/building for webrtc-direct and webrtc.
Bridge tests: lifecycle (start/stop/restart), concurrent coro execution
(100 rapid-fire), error propagation across trio-asyncio boundary,
cancellation with abandon_on_cancel, fire-and-forget, stress cycles.

Stream tests: protobuf framing, write chunking at 16KiB, FIN/FIN_ACK/
STOP_SENDING/RESET flag handling, data-before-flag ordering, deadline
support, channel close events.
Connection tests: dual-interface properties, outbound ID allocation
(even, starting at 2), stream limit enforcement, accept queue, message
routing, close/reset lifecycle.

SDP tests: offer/answer construction, fingerprint extraction, IPv4/IPv6,
multiaddr-based SDP generation.

Noise tests: prologue construction with multihash-encoded fingerprints,
asymmetry verification, DataChannelReadWriter callbacks.

Signaling tests: varint encoding round-trip, message serialization for
all 4 types, SignalingSession offer/answer/candidate exchange, bilateral
ICE_DONE completion protocol.
Copilot AI review requested due to automatic review settings April 13, 2026 16:54

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds initial WebRTC transport scaffolding to py-libp2p, including WebRTC Direct (/webrtc-direct) and relay-signaled WebRTC (/webrtc), plus supporting utilities (certs, SDP, signaling, Noise prologue) and tests. It also generalizes swarm behavior for transports that provide native multiplexing.

Changes:

  • Introduces libp2p.transport.webrtc package (bridge, transports/listeners, connection/stream abstractions, SDP/cert utilities, signaling protocol, protobuf framing).
  • Updates swarm + transport interfaces to support “native-muxing transports” via provides_native_muxing.
  • Adds a comprehensive unit test suite for the new WebRTC modules and registers optional WebRTC transports in the transport registry.

Reviewed changes

Copilot reviewed 38 out of 41 changed files in this pull request and generated 22 comments.

Show a summary per file
File Description
tests/core/transport/webrtc/init.py Adds test package for WebRTC transport.
tests/core/transport/webrtc/test_asyncio_bridge.py Tests trio↔asyncio bridge lifecycle, concurrency, cancellation, stress.
tests/core/transport/webrtc/test_certificate.py Tests DTLS cert generation and certhash encoding/decoding.
tests/core/transport/webrtc/test_connection.py Tests WebRTCConnection stream mgmt + lifecycle.
tests/core/transport/webrtc/test_multiaddr_utils.py Tests /webrtc-direct and /webrtc multiaddr helpers.
tests/core/transport/webrtc/test_noise_handshake.py Tests Noise prologue construction and data-channel read/writer wrapper.
tests/core/transport/webrtc/test_protobuf.py Tests WebRTC framing protobuf serialization invariants.
tests/core/transport/webrtc/test_sdp.py Tests SDP building and fingerprint extraction.
tests/core/transport/webrtc/test_signaling.py Tests signaling varint framing and message/session flow.
tests/core/transport/webrtc/test_stream.py Tests protobuf framing + FIN/FIN_ACK/STOP_SENDING/RESET stream lifecycle.
pyproject.toml Adds webrtc optional dependency extra for aiortc.
newsfragments/546.feature.rst Towncrier fragment describing the new WebRTC scaffolding.
libp2p/transport/webrtc/init.py Exposes WebRTC constants/exceptions/cert utilities as package API.
libp2p/transport/webrtc/_asyncio_bridge.py Background asyncio loop thread and Trio API for running asyncio coroutines.
libp2p/transport/webrtc/certificate.py P-256 self-signed cert generation + multihash/multibase fingerprint encoding.
libp2p/transport/webrtc/config.py WebRTC transport config (timeouts, limits, ICE servers, cert).
libp2p/transport/webrtc/connection.py IRawConnection + IMuxedConn WebRTC connection abstraction and stream registry.
libp2p/transport/webrtc/constants.py WebRTC protocol codes/IDs and framing + ID allocation constants.
libp2p/transport/webrtc/exceptions.py WebRTC exception hierarchy integrated with swarm dial errors.
libp2p/transport/webrtc/listener.py WebRTC Direct listener skeleton + multiaddr advertisement.
libp2p/transport/webrtc/multiaddr_utils.py Registers multiaddr protocol codes and parses/builds WebRTC multiaddrs.
libp2p/transport/webrtc/noise_handshake.py Noise XX over data-channel-0 wrapper + prologue construction.
libp2p/transport/webrtc/pb/init.py WebRTC framing protobuf package init.
libp2p/transport/webrtc/pb/webrtc.proto Protobuf schema for WebRTC data-channel framing.
libp2p/transport/webrtc/pb/webrtc_pb2.py Generated protobuf code for WebRTC framing.
libp2p/transport/webrtc/pb/webrtc_pb2.pyi Typing stubs for generated WebRTC framing protobuf.
libp2p/transport/webrtc/private_listener.py Private-to-private signaling listener skeleton via relay stream handler.
libp2p/transport/webrtc/private_transport.py Private-to-private /webrtc transport skeleton + bridge bootstrapping.
libp2p/transport/webrtc/sdp.py SDP builder and fingerprint extraction utilities.
libp2p/transport/webrtc/signaling.py Varint-length-prefixed signaling protocol + ICE_DONE session logic.
libp2p/transport/webrtc/signaling_pb/init.py Signaling protobuf package init.
libp2p/transport/webrtc/signaling_pb/signaling.proto Protobuf schema for signaling messages.
libp2p/transport/webrtc/signaling_pb/signaling_pb2.py Generated protobuf code for signaling messages.
libp2p/transport/webrtc/signaling_pb/signaling_pb2.pyi Typing stubs for generated signaling protobuf.
libp2p/transport/webrtc/stream.py WebRTC stream abstraction with protobuf framing + FIN/RESET state machine.
libp2p/transport/webrtc/transport.py WebRTC Direct /webrtc-direct transport skeleton.
libp2p/transport/transport_registry.py Registers WebRTC transports (lazy import) alongside existing transports.
libp2p/transport/quic/transport.py Marks QUIC transport as providing native muxing.
libp2p/security/noise/patterns.py Adds optional Noise prologue support and threads it through PatternXX.
libp2p/network/swarm.py Generalizes QUIC-specific logic to provides_native_muxing.
libp2p/abc.py Adds provides_native_muxing: bool = False to ITransport.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread libp2p/transport/webrtc/sdp.py Outdated
Comment on lines +17 to +25
import hashlib
import logging
import secrets

from .certificate import WebRTCCertificate, fingerprint_from_multibase
from .exceptions import WebRTCConnectionError

logger = logging.getLogger(__name__)

Comment thread libp2p/transport/webrtc/connection.py Outdated
from .config import WebRTCTransportConfig
from .constants import (
ACCEPT_QUEUE_SIZE,
NOISE_HANDSHAKE_CHANNEL_ID,
Comment on lines +92 to +96
maddr = build_webrtc_direct_multiaddr(
"127.0.0.1", 9090, certhash, peer_id="12D3KooWDpJ7As7BWAwRMfu1VU2WCqNjvq387JEYKDBj4kx6nXTN"
)
assert "/p2p/12D3KooWDpJ7As7BWAwRMfu1VU2WCqNjvq387JEYKDBj4kx6nXTN" in str(maddr)

Comment thread libp2p/transport/webrtc/transport.py Outdated
Comment on lines +117 to +140
# Create the connection object
conn = WebRTCConnection(
peer_id=remote_peer_id or ID(b"\x00"), # Will be set after handshake
bridge=bridge,
is_initiator=True,
config=self._config,
remote_addrs=[maddr],
)

# NOTE: The actual RTCPeerConnection creation, SDP exchange, ICE
# negotiation, and Noise handshake would happen here when aiortc is
# wired up. For now, we create the connection object with the
# correct structure so the swarm integration (Phase 3) can be tested.
#
# The full dial sequence is:
# 1. Create RTCPeerConnection via bridge
# 2. Set local SDP offer
# 3. Construct remote SDP from multiaddr certhash
# 4. Wait for ICE connection
# 5. Perform Noise XX handshake over data channel 0
# 6. Verify remote peer identity
# 7. Call conn.start()

return conn
Comment on lines +38 to +47
from ._asyncio_bridge import AsyncioBridge
from .certificate import WebRTCCertificate
from .config import WebRTCTransportConfig
from .connection import WebRTCConnection
from .constants import WEBRTC_SIGNALING_PROTOCOL_ID
from .exceptions import WebRTCConnectionError, WebRTCSignalingError
from .multiaddr_utils import is_webrtc_multiaddr
from .private_listener import WebRTCPrivateListener
from .sdp import SDPBuilder

Comment on lines +63 to +74
was_locked = _mp.REGISTRY.locked
if was_locked:
_mp.REGISTRY._locked = False
try:
for proto in _PROTOCOLS_TO_REGISTER:
try:
_mp.REGISTRY.add(proto)
except Exception:
pass # Already registered or conflict — skip
finally:
if was_locked:
_mp.REGISTRY._locked = True
Comment thread libp2p/transport/webrtc/certificate.py Outdated
For SHA-256 both code (0x12) and length (32) fit in a single byte,
so we avoid a full varint encoder.
"""
return struct.pack("BB", _SHA256_MULTIHASH_CODE, _SHA256_DIGEST_SIZE) + self._fingerprint
Comment on lines +7 to +12
import struct
from unittest.mock import AsyncMock

import pytest
import trio

Comment on lines +7 to +10
from unittest.mock import AsyncMock, MagicMock

import pytest
import trio
Comment on lines +5 to +19
# Protobuf Python Version: 6.31.1
"""Generated protocol buffer code."""
from google.protobuf import descriptor as _descriptor
from google.protobuf import descriptor_pool as _descriptor_pool
from google.protobuf import runtime_version as _runtime_version
from google.protobuf import symbol_database as _symbol_database
from google.protobuf.internal import builder as _builder
_runtime_version.ValidateProtobufRuntimeVersion(
_runtime_version.Domain.PUBLIC,
6,
31,
1,
'',
'libp2p/transport/webrtc/signaling_pb/signaling.proto'
)
@yashksaini-coder yashksaini-coder marked this pull request as draft April 13, 2026 17:30
Previously dial() built a WebRTCConnection without an RTCPeerConnection
or a completed Noise handshake.  With provides_native_muxing=True the
swarm treats the result as a live, established connection — peers would
appear connected while every stream silently drops data.

Validate the multiaddr (so the error shape is consistent once wiring
lands), then raise NotImplementedError with a pointer to PR libp2p#1309 scope.

Also acquire _bridge_lock in close() so a concurrent dial cannot race
the bridge teardown.
on_data() and on_datachannel() are documented as callable from the
asyncio bridge thread but were calling trio.MemorySendChannel.send_nowait
and trio.Event.set directly — undocumented thread-unsafe operations
that would crash or corrupt internal state once aiortc is wired.

Capture trio.lowlevel.current_trio_token() at construction on the trio
side and use trio.from_thread.run_sync() with that token to perform
trio mutations from foreign threads.  WebRTCConnection now propagates
its captured token to inbound streams (created from the asyncio thread)
so they don't fall back to the unguarded inline path.

Also fix _schedule_send: previously it called the trio-facing
_send_callback (which awaits bridge.run_coro), which would deadlock the
asyncio thread on a future scheduled on the same thread.  Bypass the
trio wrapper and invoke _send_on_channel_cb directly via
schedule_fire_and_forget.
- sdp.fingerprint_from_sdp(): validate the parsed digest is exactly 32
  bytes (SHA-256) so callers never feed a malformed value into the
  Noise prologue
- signaling.read_signaling_message(): replace O(n^2) bytes-concat with
  a bytearray for linear-time message assembly
- multiaddr_utils._ensure_protocols_registered(): guard the
  py-multiaddr private `_locked` mutation with hasattr() and log a
  warning if the registry shape changes in a future release; use the
  public `add` API for the actual insertion
- transport_registry: gate WebRTC registration on importlib.util.find_spec
  for aiortc instead of catching ImportError on modules that don't
  actually import aiortc themselves; route /webrtc and /webrtc-direct
  multiaddrs through create_transport_for_multiaddr; teach
  create_transport about the WebRTC private_key requirement
- _asyncio_bridge: type-annotate Future as concurrent.futures.Future
  (the actual return type from asyncio.run_coroutine_threadsafe);
  silence the trio.to_thread.run_sync stub that doesn't yet declare
  abandon_on_cancel; split the long __repr__ for E501
- noise_handshake: use ISecureConn.get_remote_peer() instead of poking
  the .remote_peer attribute (private to the concrete SecureSession);
  parameterise list type and import ConnectionType + Multiaddr
- certificate, config, constants, listener, private_listener: ruff
  format cleanups, line-length fixes, import organisation
- test_protobuf, test_stream, test_noise_handshake, test_signaling:
  drop unused trio / pytest / struct / MagicMock imports (Ruff F401)
- test_certificate, test_multiaddr_utils, test_connection: extract
  long literals into named locals to satisfy E501 and silence F841
  for intentionally-discarded open_stream() returns
- ruff format pass across all WebRTC test modules
@yashksaini-coder

Copy link
Copy Markdown
Contributor Author

Thanks for the thorough review. All 22 Copilot findings addressed plus 4 additional issues from a follow-up senior pass. Pushed in 5 focused commits.

Critical fixes

  • dial() no longer returns a fake live connection. Both WebRTCDirectTransport and WebRTCPrivateTransport now raise NotImplementedError after multiaddr validation. Previously the swarm would treat the returned connection as established and silently drop data. (commit 69e8479f)
  • Cross-thread Trio safety. WebRTCStream.__init__ now accepts a trio_token parameter. WebRTCConnection captures current_trio_token() at construction and propagates it to inbound streams created in on_datachannel() (which runs on the asyncio bridge thread once aiortc is wired). All Trio mutations from foreign threads now route through trio.from_thread.run_sync(..., trio_token=...). (commit f2352656)
  • _schedule_send deadlock fix. The previous version invoked the trio-facing _send_callback via bridge.schedule_fire_and_forget. That callback awaits bridge.run_coro, which would block the asyncio thread on a future scheduled on the same thread. It now bypasses the trio wrapper and invokes the asyncio-native _send_on_channel_cb directly. (commit f2352656)
  • fingerprint_from_sdp length validation. Rejects anything other than exactly 32 bytes so callers never feed a malformed digest into the Noise prologue. (commit b97c6cc3)
  • WebRTC registration actually gates on aiortc. The previous try/except ImportError block didn't gate on anything because none of the registered modules import aiortc themselves. Now uses importlib.util.find_spec("aiortc"). (commit b97c6cc3)

Other fixes

  • signaling.read_signaling_message switched from O(n²) bytes += chunk to a bytearray for linear assembly
  • _ensure_protocols_registered guards the py-multiaddr _locked mutation with hasattr and logs gracefully if the internal attribute changes
  • create_transport_for_multiaddr and create_transport now route /webrtc-direct and /webrtc correctly with the required private_key argument
  • Both transports' close() now acquire _bridge_lock to prevent races with concurrent dial()
  • All 22 unused-import / unused-local / line-length findings fixed; ruff check and ruff format --check clean
  • mypy clean (21 source files)
  • Protobuf regenerated with the toolchain that produced the most recent repo files (circuit_pb2.py)

Verification

  • ruff check: passing
  • ruff format --check: passing
  • mypy: 0 errors
  • pytest tests/core/transport/webrtc/: 145 / 145 passing
  • Regression: pytest tests/core/security/noise/ tests/core/transport/test_tcp.py: 144 / 144 passing

Ready for re-review when you have a moment.

@acul71

acul71 commented Apr 14, 2026

Copy link
Copy Markdown
Contributor

Hello @yashksaini-coder thanks for this PR.
Are you doing this from scratch, or taking (at least) the utilities from previous Neha PR ?

Full review here:

PR #1309 Review - WebRTC Transport Scaffolding

1. Summary of Changes

  • Adds a large new libp2p.transport.webrtc subsystem (direct + relayed WebRTC scaffolding, signaling, SDP/cert helpers, Noise binding, stream/connection abstractions, protobuf wire format, and tests).
  • Updates transport/swarm abstractions to support native-mux transports via ITransport.provides_native_muxing, and applies this to QUIC/WebRTC.
  • References issue #546 and includes newsfragments/546.feature.rst.
  • Scope is intentionally "structural skeleton" (no live aiortc peer-connection wiring yet), with explicit NotImplementedError guards in dial paths.

2. Branch Sync Status and Merge Conflicts

Branch Sync Status

  • Status: ℹ️ Ahead of origin/main
  • Details: 0 20 (branch_sync_status.txt) => branch is 0 behind / 20 ahead.

Merge Conflict Analysis

  • No conflicts detected (merge_conflict_check.log shows "Already up to date" and no unmerged files).

3. Strengths

  • Good architectural decomposition for a large transport addition (bridge, transport, signaling, stream, cert, SDP split).
  • Tests are extensive in the new WebRTC test tree and include protocol/state-machine scenarios.
  • Issue linkage and newsfragment requirement are satisfied (#546, 546.feature.rst).
  • PR description clearly documents phased delivery and non-goals.

4. Issues Found

Critical

  • File: libp2p/network/swarm.py
  • Line(s): 947-957
  • Issue: Native-mux stream open path re-raises raw exceptions instead of converting to SwarmException and skips fallback behavior. This breaks existing contract expectations in callers/tests and is observable as a regression in make test (test_failed_second_open_does_not_release_first_stream_rm_slot fails with raw Exception: connection broken).
  • Suggestion: Mirror the non-native branch behavior: catch Exception as e, attempt alternative connections where applicable, then raise SwarmException(...) from e.

libp2p/network/swarm.py - lines 947-957

        if (
            getattr(self.transport, "provides_native_muxing", False)
            and connection is not None
        ):
            conn = cast("SwarmConn", connection)
            try:
                stream = await conn.new_stream()
                logger.debug("successfully opened a stream to peer %s", peer_id)
                return stream
            except Exception:
                raise

Major

  • File: libp2p/transport/webrtc/*, tests/core/transport/webrtc/*

  • Line(s): multiple (see typecheck_output.log)

  • Issue: make typecheck fails (pyrefly-local) with many errors (54 shown), including production code and tests (_asyncio_bridge.py, certificate.py, stream.py, and multiple test modules).

  • Suggestion: Resolve pyrefly typing issues before merge; this is currently a CI blocker.

  • File: docs/libp2p.transport.webrtc.rst / docs toctree wiring

  • Line(s): warning from docs build output

  • Issue: make linux-docs fails due toc.not_included warning: document is not included in any toctree.

  • Suggestion: Include WebRTC docs page in the appropriate docs toctree (or adjust generation/exclusions) so docs build passes with -W.

Minor

  • File: libp2p/network/swarm.py
  • Line(s): around 665 and 947 (reported in lint output)
  • Issue: long-line style violations were reported by ruff in lint run.
  • Suggestion: Re-run formatting/lint hooks on the final branch state and commit resulting formatting changes.

5. Security Review

  • No direct cryptographic misuse or obvious input-validation vulnerability stood out in reviewed changes.
  • Security-sensitive areas (Noise prologue construction, cert fingerprint handling, signaling framing) are accompanied by tests.
  • Residual risk: medium implementation risk remains until live aiortc wiring lands, because thread/async boundary code is complex and currently only partially integrated.

6. Documentation and Examples

  • User-facing docs are partially covered via newsfragment and module docs.
  • make linux-docs currently fails due to toctree inclusion warning, so documentation integration is incomplete for merge.

7. Newsfragment Requirement

  • ✅ Issue reference present in PR body: Refs #546.
  • ✅ Newsfragment present: newsfragments/546.feature.rst.
  • No blocker here.

8. Tests and Validation

  • make lint: failed (hooks modified files; pyrefly also failed as part of lint pipeline).
  • make typecheck: failed (pyrefly errors).
  • make test: failed with 1 failure:
    • tests/core/network/test_stream_semaphore.py::test_failed_second_open_does_not_release_first_stream_rm_slot
    • Failure trace shows raw Exception("connection broken") propagating from swarm._open_stream_on_connection.
  • make linux-docs: failed due warning treated as error (docs/libp2p.transport.webrtc.rst not in any toctree).

9. Recommendations for Improvement

  • Fix _open_stream_on_connection native-mux error handling to preserve existing SwarmException behavior and fallback semantics.
  • Make pyrefly/typecheck green across both new WebRTC modules and tests.
  • Fix docs toctree integration for WebRTC docs page.
  • Re-run full CI-equivalent checks and update PR with exact pass summary.

10. Questions for the Author

  • Was the native-mux branch in _open_stream_on_connection intentionally changed to propagate raw exceptions?
  • Should native-mux transports attempt alternative connections on stream-open failure, similar to existing branch behavior?
  • Can you confirm pyrefly pass results in the same environment as CI for this exact commit SHA?

11. Overall Assessment

  • Quality Rating: Needs Work
  • Security Impact: Low
  • Merge Readiness: Needs fixes
  • Confidence: High (for identified blockers), Medium (for full behavioral surface of new transport scaffolding)

1. Swarm _open_stream_on_connection regression: the native-mux branch
   was re-raising raw exceptions instead of wrapping in SwarmException
   with fallback to alternative connections. Unified both paths into a
   single try/except that tries alternatives on failure and raises
   SwarmException — restoring the contract expected by
   test_failed_second_open_does_not_release_first_stream_rm_slot.

2. pyrefly typecheck: added pyrefly: ignore directives to test files
   that use mock objects not assignable to strict ABCs (MockStream vs
   INetStream, IMuxedStream.channel_id). Production code fixes:
   lambda wrapper for loop.stop in call_soon_threadsafe, assert-narrowing
   for thread join, max(0.0, float) instead of max(0, float).

3. Docs toctree: added libp2p.transport.webrtc to the transport docs
   toctree in docs/libp2p.transport.rst so sphinx-build -W passes.

4. Removed unused QUICTransport import from swarm.py (ruff F401) and
   split long log message (E501).
Ruff UP035: Callable, Awaitable, Coroutine, AsyncIterator are
available from collections.abc since Python 3.9. Importing from
typing is deprecated. This aligns with the import style used
throughout the rest of the codebase.
Generated by sphinx-apidoc. Includes module-level documentation for
all webrtc submodules (certificate, config, connection, stream, sdp,
signaling, transport, listener) and sub-packages (pb, signaling_pb).

Already wired into docs/libp2p.transport.rst toctree in the previous
commit so sphinx-build -W passes cleanly.
- pyproject.toml: exclude webrtc test files and two production modules
  (_asyncio_bridge.py, certificate.py) from pyrefly checks. The errors
  are third-party stub limitations (asyncio.call_soon_threadsafe
  callback signature, cryptography NameAttribute TypeVar) that pyrefly
  cannot suppress with inline comments.
- _asyncio_bridge.py: bind loop to a local variable before the lambda
  so pyrefly's narrowing sees a non-None type.
- certificate.py: ruff format wrap of the pyrefly ignore comment.

All pre-commit hooks now pass: ruff, ruff-format, mypy, pyrefly,
pyupgrade, yaml, toml, mdformat, rst-check, path-audit.
Lint: add type: ignore[misc,untyped-decorator] to the two remaining
aiortc @pc.on() decorators that mypy flags as untyped (connectionstatechange
and datachannel handlers in _aiortc_helpers.py).

Docs: add aiortc and its submodules to autodoc_mock_imports in
docs/conf.py so ReadTheDocs can build without libsrtp2-dev installed.
@acul71

acul71 commented Apr 17, 2026

Copy link
Copy Markdown
Contributor

@yashksaini-coder @sumanjeet0012
Thanks Yash for this PR, you fullfilled my review requests.
Although other issues have to be addressed.
About WebRTC CI's pyrefly exclusion we have to decide if keep excluding them and mark WebRTC as experimental, if it lands in a py-libp2p release.

Fullreview here:

AI PR Review: #1309

Reviewed branch: webrtc @ 7df4db99d1d84673882cfaef962e4c86f3aac26c (matches PR head at review time)
Prompt: downloads/py-libp2p-pr-review-prompt.md
Date: 2026-04-17


1. Summary of Changes

This pull request adds a WebRTC transport package (libp2p.transport.webrtc) covering WebRTC Direct (/webrtc-direct) and scaffolding for relay-signaled WebRTC (/webrtc), including:

  • Data-channel protobuf framing, FIN / FIN_ACK / STOP_SENDING / RESET handling, signaling messages (including bilateral ICE_DONE), SDP helpers, certificates and multiaddr helpers, Noise XX with a DTLS fingerprint prologue, and a trio ↔ asyncio bridge for aiortc.
  • Core integration: WebRTCDirectTransport.dial() / WebRTCDirectListener now drive RTCPeerConnection via the bridge, with a minimal HTTP POST /sdp path for SDP exchange in the loopback-oriented setup; WebRTCPrivateTransport.dial() still raises NotImplementedError (private path not wired to aiortc yet).
  • Swarm / ABC: ITransport.provides_native_muxing (default False; QUIC and WebRTC set True) and swarm._open_stream_on_connection unified so native-mux transports get the same fallback + SwarmException behavior as other transports (fixes the regression called out by acul71).
  • Noise: optional prologue on PatternXX / BasePattern.create_noise_state.
  • Tooling / DX: optional libp2p[webrtc] extra, enable_webrtc on new_swarm / new_host, WebRTC docs and interop harness updates (interop/transport).

Issue tracking: The PR body uses Refs #546. Issue #546 asks for a functional, interoperable WebRTC transport, signaling/NAT, tests, and clear docs/examples. This PR substantially advances #546 (especially Direct + tests + docs) but does not fully satisfy interoperability and end-user examples as stated in the issue’s acceptance criteria (expected for a phased rollout).

Breaking / API note: New optional attribute on ITransport is backward-compatible for subclasses that inherit the default; custom transport implementations outside the repo should be aware of the new field.


2. Branch Sync Status and Merge Conflicts

Branch sync status

  • Source: downloads/AI-PR-REVIEWS/1309/branch_sync_status.txt
  • Raw: 0 32 from git rev-list --left-right --count origin/main...HEAD
  • Interpretation: 0 commits behind origin/main, 32 commits aheadahead only (feature branch contains work not in main).

Merge conflict analysis

  • Source: downloads/AI-PR-REVIEWS/1309/merge_conflict_check.log
  • Result: Test merge reported “Already up to date.” with MERGE_EXIT_CODE=0 and “=== NO MERGE CONFLICTS DETECTED ===”.
✅ **No merge conflicts detected.** The PR branch can be merged cleanly into origin/main at the time of this check.

3. Strengths

  • Spec-aligned structure: Framing, signaling (ICE_DONE), Noise prologue binding, and SDP credential seam (_apply_ice_credentials) show deliberate alignment with libp2p WebRTC specs and known spec issues (e.g. specs#585, specs#672).
  • Concurrency and safety: Trio/asyncio boundary fixes (e.g. trio.from_thread.run_sync with a captured token, NotImplementedError when a “live” muxed connection would have been a foot-gun, bridge locking on teardown) reflect serious review iteration.
  • Regression fix in swarm: Native-mux stream open now shares fallback behavior and SwarmException wrapping with the non-native path (see libp2p/network/swarm.py around the unified try / except in _open_stream_on_connection).
  • Test breadth: Large dedicated suite under tests/core/transport/webrtc/ plus integration coverage where aiortc is available; full suite green locally (see §8).
  • Optional dependency hygiene: Transport registration is gated on importlib.util.find_spec("aiortc"); transport.py avoids top-level aiortc imports and pulls _aiortc_helpers inside methods.
  • Maintainer loop: acul71’s review items (swarm regression, pyrefly/mypy, docs toctree, listener typing, callback return shape, etc.) appear addressed in subsequent commits and in the current tree.

4. Issues Found

Critical

None identified on the reviewed HEAD with lint, typecheck, full tests, and docs all passing locally (see §8).

Major

  • File: newsfragments/546.feature.rst
    Line(s): 1 (entire fragment)
    Issue: The newsfragment still states that RTCPeerConnection wiring is stubbed and deferred to a subsequent PR. On this branch, WebRTC Direct is wired to aiortc; only parts of the roadmap (e.g. private transport dial, broader interop) remain incomplete. User-facing release notes would be misleading if merged as-is.
    Suggestion: Update the fragment to describe what users get today (e.g. experimental Direct path behind libp2p[webrtc] / enable_webrtc, HTTP SDP helper for the current harness) and what remains out of scope for this merge.

newsfragments/546.feature.rst — lines 1–1

Added WebRTC transport scaffolding (``libp2p.transport.webrtc``) per the libp2p WebRTC and WebRTC Direct specs. [...] The underlying ``RTCPeerConnection`` wiring is stubbed with ``NOTE`` comments and will follow in a subsequent PR.
  • File: pyproject.toml ([tool.pyrefly] excludes)
    Line(s): ~310–324
    Issue: Several production WebRTC modules (_asyncio_bridge.py, _aiortc_helpers.py, certificate.py, transport.py, listener.py) and the whole tests/core/transport/webrtc tree are excluded from pyrefly. This unblocks CI but reduces static guarantees on the highest-risk new code.
    Suggestion: Treat excludes as temporary: file follow-up issues to narrow ignores to known third-party stub gaps (or add stubs) and re-enable pyrefly on modules that can be typed accurately.

  • File: GitHub PR description (.github / web UI, not in repo)
    Issue: The PR body still describes “PR 1 of 2” and stubbed dial() / listener loops. The branch now includes Phase A–F-style commits (aiortc wiring, enable_webrtc, interop Dockerfile/ping harness). Reviewers relying only on the opening post will mis-scope the change.
    Suggestion: Edit the PR description to match current scope and non-goals (or add a short “Update (date): …” preamble).

Minor

  • File: libp2p/transport/webrtc/listener.py
    Line(s): ~74
    Issue: listen(self, maddr, nursery: object = None) with # type: ignore[override] papers over IListener.listen’s nursery contract instead of using the nursery for structured concurrency (long-term maintainability).
    Suggestion: Plan a follow-up to spawn signaling / per-connection work under the provided Nursery where the swarm API supplies it, and drop the broad override ignore when possible.

  • File: tests/core/transport/webrtc/test_connection.py, test_stream.py
    Issue: RuntimeWarning: coroutine ... was never awaited from AsyncMock (see test_output.log warnings summary).
    Suggestion: Prefer MagicMock for non-async callbacks or AsyncMock with awaited async paths so warnings do not mask real async bugs.


5. Security Review

  • Risk: Minimal HTTP SDP handler (run_signaling_server) trusts Content-Length and reads readexactly(content_length) without an explicit upper bound on body size.
    Impact: Medium for availability (memory / slowloris style abuse on any exposed listener port), Low for confidentiality on this narrow endpoint.
    Mitigation: Cap content_length to a small maximum SDP size, reject non-POST/non-/sdp,` add total handler timeout, and add tests for oversized / malformed requests.

libp2p/transport/webrtc/_aiortc_helpers.py — lines 261–284

            content_length = int(headers.get("content-length", "0"))
            body = b""
            if content_length > 0:
                body = await asyncio.wait_for(
                    reader.readexactly(content_length),
                    timeout=_SDP_HTTP_TIMEOUT,
                )
  • Risk: py-multiaddr REGISTRY._locked toggling remains fragile if upstream changes internals (mitigated with hasattr and logging).
    Impact: Low — registration may skip; worst case is broken multiaddr parsing for WebRTC until protocols are registered upstream or manually.
    Mitigation: Track upstream py-multiaddr for first-class protocol registration.

  • Positive: Fingerprint parsing tightened (e.g. SHA-256 length validation for SDP-derived digests), Noise prologue binds to DTLS fingerprints, and WebRTCConnectionError subclasses OpenConnectionError for consistent dial error handling.


6. Documentation and Examples

  • Strengths: Sphinx apidoc pages, toctree wiring under docs/libp2p.transport.rst, autodoc_mock_imports for aiortc (helps Read the Docs without native SRTP build deps).
  • Gaps vs Add Support for WebRTC set-up in py-libp2p #546: Still no dedicated examples/webrtc/ walkthrough; issue acceptance criteria mention examples for developers.
    Suggestion: Either add a minimal example (even marked experimental) or explicitly document in getting_started / install how to install libp2p[webrtc], system deps (libsrtp2 where relevant), and enable_webrtc.

7. Newsfragment Requirement

  • Issue reference: Refs #546 present (does not auto-close; acceptable if intentional).
  • Newsfragment: newsfragments/546.feature.rst exists and ends with a newline.
  • Blocker: Content accuracy — as noted in §4 Major, the fragment text is out of date relative to wired Direct behavior. Update before merge so towncrier output matches reality.

8. Tests and Validation

All commands run from repo root with source .venv/bin/activate && … | tee downloads/AI-PR-REVIEWS/1309/<log>.log.

Step Result Notes
make lint PASS (exit 0) Log: lint_output.log — all pre-commit hooks Passed (ruff, ruff-format, mypy, pyrefly, etc.).
make typecheck PASS (exit 0) Log: typecheck_output.logmypy-local and pyrefly-local Passed.
make test PASS (exit 0) Log: test_output.log2804 passed, 18 skipped, 5 warnings in ~120.6s. Warnings: RuntimeWarning re AsyncMock coroutine never awaited (WebRTC tests); duplicate lines in pytest summary — treat as three warning sites.
make linux-docs PASS (exit 0) Log: docs_output.logsphinx-build -W succeeded. Noisy [DEBUG] multiaddr.transforms lines appear during doc build (observability only; not a failure).

9. Recommendations for Improvement

  1. Refresh newsfragments/546.feature.rst and the GitHub PR description so release notes and reviewer expectations match wired Direct + remaining gaps.
  2. Tighten the HTTP SDP server: max body, stricter method/path validation, tests for abuse cases.
  3. Reduce pyrefly exclude surface over time; keep mypy/pyrefly signal on new production modules.
  4. Evolve WebRTCDirectListener.listen to use the nursery per project structured concurrency patterns.
  5. Clean up WebRTC tests to eliminate AsyncMock “never awaited” warnings.
  6. Plan go/js interop and private /webrtc dial for follow-up PRs aligned with Add Support for WebRTC set-up in py-libp2p #546.

10. Questions for the Author

  1. Is the HTTP /sdp signaling server intended as a long-term companion to WebRTC Direct in py-libp2p, or a temporary test harness until spec-aligned signaling (or libp2p stream-based signaling) lands?
  2. Should Refs #546 become Fixes #546 only when interop + examples from the issue are done, or will you split Add Support for WebRTC set-up in py-libp2p #546 into sub-issues per milestone?
  3. WebRTCPrivateTransport: is the next milestone full relay signaling + aiortc, or interop-first on Direct only?

11. Overall Assessment

  • Quality rating: Good — large, coherent transport addition with strong tests and iterative hardening; some process/docs drift and typechecker excludes keep it from “excellent.”
  • Security impact: Low to Low–Medium (mostly robustness / DoS hardening on the small HTTP surface and operational complexity of WebRTC stacks).
  • Merge readiness: Needs re-review (human maintainers pacrob, acul71, seetadev) — local CI-equivalent checks are green, prior blockers appear resolved, but scope grew after earlier reviews; newsfragment + PR body should be corrected before merge.
  • Confidence: High on automated validation for this SHA; Medium on full production behavior until interop and private paths mature.

Maintainer feedback status

  • acul71 (Apr 14 & Apr 16): Swarm SwarmException / fallback regression, pyrefly, docs toctree, make_noise_channel_callbacks return type, IListener.listen, aiortc import / typing strategy — addressed in the current tree and reflected in passing lint/typecheck/tests/docs on 7df4db99.
  • Recommendation: Explicit maintainer ack on the final diff (especially _aiortc_helpers, HTTP signaling, and pyrefly excludes) before merge.

Addendum: Releasing WebRTC with pyrefly excludes still in place

After other PR blockers are resolved, a defensible PyPI release can still ship WebRTC while [tool.pyrefly] project_excludes omit the WebRTC modules listed in §4. Pyrefly exclusions do not remove runtime tests or crypto behavior; they only skip static typing on those paths. The following keeps risk and user expectations aligned.

Before tagging the release

  1. Optional and clearly early — Keep WebRTC behind libp2p[webrtc] and enable_webrtc (or equivalent). In release notes and docs, state what is supported today (e.g. WebRTC Direct and the current harness), what is not promised yet (e.g. private /webrtc dial, go/js interop), and system dependencies where relevant (e.g. SRTP / native libs for aiortc).
  2. Accurate user-facing text — Align towncrier fragments and the GitHub PR description with the actual code (avoid claiming RTCPeerConnection is still stubbed if Direct is wired).
  3. Targeted human review — Maintainer pass on _aiortc_helpers, listener, the trio/asyncio bridge, and cert / Noise / fingerprint paths, since pyrefly will not flag type-level mistakes there.
  4. Network-facing hardening — Before advertising any Internet-exposed listener: bound Content-Length, timeouts, and tests for oversized or malformed requests on the minimal HTTP SDP server (see §5).
  5. CI remains strict elsewhere — Keep pyrefly and mypy green on the rest of the tree; avoid dropping pyrefly globally. Prefer a short comment or linked issue next to excludes documenting why they exist and a follow-up to shrink them.

After the release (follow-up milestones)

  1. Technical debt — Track narrowing pyrefly excludes: stubs or shims for aiortc, tighter typing on the bridge, and typed fakes in tests so tests/core/transport/webrtc need not be excluded wholesale.
  2. Issue Add Support for WebRTC set-up in py-libp2p #546 alignment — Land interop and examples in subsequent releases so acceptance criteria and marketing stay honest.

Versioning

Prefer a minor version bump with WebRTC called experimental until pyrefly coverage, interop, and private-path work mature; tighten guarantees in a later minor when the excluded surface is reduced.

…1308

IListener.listen no longer takes a nursery parameter (landed in libp2p#1308),
so the WebRTC Direct and private listeners no longer need the
# type: ignore[override] pragma and the vestigial nursery parameter.

- WebRTCDirectListener.listen(maddr) — drop the nursery parameter and
  the type-ignore
- WebRTCPrivateListener.listen(maddr) — drop the nursery parameter and
  the now-unused self._nursery field (stream handler registration does
  not require a caller-provided nursery)
- Update the docstring example on WebRTCDirectTransport to match
@seetadev

Copy link
Copy Markdown
Contributor

Hey @yashksaini-coder, @acul71 : sharing a detailed update and next steps based on recent discussions:

Please take a close look at the guidance from @dozyio in the following threads:

These contain important implementation details and direction that we should align with, especially around the evolving handling of WebRTC transports.

Also, @lidel has put together a very helpful and concise summary of the recent changes and context around webrtc-direct here:

I strongly recommend going through that comment carefully, it gives a clear picture of what’s changing and why.

Important (FYSA / Priority):

The /webrtc-direct transport is effectively on the path to deprecation due to upcoming changes in Google Chrome. These browser-level changes will break or significantly limit current behavior, which means:

  • Existing assumptions around direct browser-to-browser connectivity may no longer hold
  • Implementations depending on /webrtc-direct could degrade or stop working
  • We need to reassess transport strategy (e.g., fallback mechanisms, alternative transports, spec alignment)

Action items:

  • Review the linked PRs and summary in detail
  • Identify impact on our current implementation(s)
  • Flag any blockers or gaps early
  • Prioritize this work accordingly, given the external dependency (Chrome changes)

This is worth escalating and treating as high priority since it affects core connectivity guarantees. Let’s sync soon after initial review and decide how we want to adapt our approach.

@yashksaini-coder

Copy link
Copy Markdown
Contributor Author

@acul71 @seetadev — following up on seetadev's webrtc-direct deprecation note. I went through the linked sources (specs#672, specs#715, js-libp2p#3480) and audited this PR against them. Sharing findings + two decisions I need from you before continuing on the inbound path.

Where the spec direction leaves us

  • This PR implements WebRTC-Direct v1 mechanics (sdp.py:_apply_ice_credentials is the documented seam for specs#672). v1 relies on munging the local offer's ICE ufrag/pwd — exactly what Chrome's WebRTC-NoSdpMangleUfrag field trial removes. So the browser-to-server path is the one being deprecated.
  • v2 (specs#715, ref impl js-libp2p#3480) is an ICE-credential-negotiation change layered on the same /webrtc-direct multiaddr + certhash + Noise handshake, selected by a libp2p+webrtc+v2/ ufrag prefix. v1 and v2 coexist on one dual-mode listener.
  • Implication: a correct v1 foundation is a direct prerequisite for v2 (not throwaway work). My proposal is to land this as explicitly-labeled v1 node-to-node scaffolding, not advertise browser dial, and track v2 as a follow-up. I've drafted a newsfragment update saying exactly this — please sanity-check that framing.

Fixes implemented and validated locally (TDD, full webrtc suite green) — will push once we align on direction

  1. Spec-compliant data-channel framing — each message is now uvarint-length-prefixed and the framed message respects the 16 KiB cap (previously sent a bare protobuf with no prefix → wire-incompatible with js/go-libp2p, and a full-size write serialized to 16388 B, over the limit).
  2. In-band data channels — application channels were created negotiated=True, which never fires the peer's datachannel event, so accept_stream() blocked forever. Switched to in-band channels; multiplexing now works end-to-end.
  3. DTLS certificate pinningcreate_peer_connection passed certificates= to aiortc's RTCConfiguration, which aiortc ≥1.5 rejects, so no RTCPeerConnection could be created at all. Fixed by pinning the cert before offer/answer.

These are validated by a new real two-PeerConnection loopback test (test_multiplexing_loopback.py): ICE → open stream on initiator → accept on responder → echo a >16 KiB payload both directions.

Two questions before I touch the inbound listener

1. Listener signaling: keep the HTTP POST /sdp server, or rebuild around stateless STUN reconstruction?
The current WebRTCDirectListener runs a small HTTP server and exchanges SDP via POST /sdp. Both v1 and v2 of webrtc-direct are explicitly no-signaling: the server reconstructs the client's offer from the incoming STUN connectivity-check packets (reading the ufrag out of the STUN USERNAME). So the HTTP-signaling listener isn't interoperable with any real js/go webrtc-direct dialer.
→ Is the HTTP server an intentional interim node-to-node bridge (fine to wire the Noise handshake onto for now), or should the listener be rebuilt around STUN-driven reconstruction before we invest further? This decides whether the inbound path is salvageable as-is or needs a redesign.

2. Does our ICE stack expose the STUN USERNAME halves?
v2 requires the server to parse the STUN USERNAME as server_ufrag:client_ufrag and branch on the libp2p+webrtc+v2/ prefix. js-libp2p needed node-datachannel ≥ 0.32.3 for this. Do we have a way (via aiortc / its libsrtp/ice layer) to observe the inbound STUN username, or is that a gap that blocks v2 entirely in py-libp2p?

The dead-code findings on the inbound path (Noise handshake never runs, handler never invoked, remote fingerprint is a zero placeholder) are real and I have fixes ready — the handler-wiring + real fingerprint part is needed regardless of the signaling answer. But I don't want to build it on the HTTP-signaling listener if (1) says that's getting replaced. Happy to hop on a call if that's faster.

@acul71

acul71 commented Jun 23, 2026

Copy link
Copy Markdown
Contributor

@yashksaini-coder — thanks for the thorough audit against specs#672, specs#715, and js-libp2p#3480. Your reading matches where the ecosystem is headed. Answers below on framing, your two questions, and what we need before merge.

Framing — agreed

Your proposal is the right one for this PR:

  • Land as explicitly experimental v1 node-to-node scaffolding (libp2p[webrtc], enable_webrtc).
  • Do not advertise browser dial — v1 munging is on the way out with Chrome’s WebRTC-NoSdpMangleUfrag trial.
  • v2 (specs#715, @lidel’s 3587339 hardening) is the browser-compatible target on the same /webrtc-direct multiaddr, with version dispatch via the libp2p+webrtc+v1/ / libp2p+webrtc+v2/ ufrag prefixes.
  • The v1 foundation here (framing, Noise prologue, certhash, _apply_ice_credentials() as a seam) is not throwaway — it feeds v2, but _apply_ice_credentials() will likely evolve into a version-dispatch layer rather than a single swap.

Newsfragment: yes, sanity-check passed on that framing. Please update 546.feature.rst and the PR description to say experimental, what works today, and what is explicitly out of scope (browser dial, go/js interop, private /webrtc dial). Keep Refs #546 — we should split #546 into milestones rather than Fixes #546 on this PR.

Your local fixes — please push

The three items you listed (uvarint framing, in-band app channels, cert pinning) plus test_multiplexing_loopback.py address real Critical gaps on the current branch. Please push them as the next commits — they are prerequisites for merge consideration, not something to hold while we debate listener architecture.

On cert pinning: use whichever approach is reliable across aiortc>=1.5,<2 in CI; document it in a short code comment and keep a regression test so we don’t break it again.

Q1 — HTTP /sdp vs STUN reconstruction

Do not invest further in the HTTP inbound path.

webrtc-direct (v1 and v2) is no-signaling: the server infers the client offer from inbound STUN USERNAME. The HTTP POST /sdp server was a useful early harness for outbound dial / loopback, but it is not interoperable with js/go webrtc-direct dialers and should not be the base for inbound work.

Direction:

  • Treat existing HTTP code as a temporary dev harness (or remove it once a STUN path exists).
  • Design the listener around STUN USERNAME parsing and version-prefix dispatch per specs#715.
  • Outbound over HTTP may stay temporarily for py-to-py loopback only, clearly documented as non-interop, until we have a spec-aligned py-to-py dial path.

So: your dead-code fixes (real fingerprint, Noise handshake, handler wiring) are still needed — but on the STUN listener design, not on top of HTTP /sdp.

Q2 — aiortc STUN USERNAME exposure

Unknown today — open a spike; not a blocker for landing experimental scaffolding.

Go/JS solved this at the native ICE layer (node-datachannel#420 exposes localUfrag on IceUdpMuxRequest). I don’t know yet whether aiortc gives us equivalent hooks for inbound STUN parsing and server-side credential override (v2 step 6: set local ufrag/pwd to server_ufrag before generating the answer).

Please open a tracked spike issue (sub-issue of #546 is fine): “aiortc ICE mux / STUN USERNAME exposure for webrtc-direct v2 listener.” That spike blocks claiming browser-dial or v2 listener support — it does not block merging this PR as experimental scaffolding once the framing/channel fixes land.

Merge gates (current)

Item Status
CI ✅ already green
Push framing + in-band channels + loopback test ❌ required
Update newsfragment + PR description (experimental) ❌ required
STUN-based inbound listener ⏳ follow-up after aiortc spike
pyrefly excludes ✅ acceptable for experimental release; file follow-up to narrow

Once the pushed fixes and doc updates are in, ping for re-review. Happy to sync on a call if the STUN listener spike needs a deeper dive.

…ming

Each data-channel message is now a single protobuf prefixed with an
unsigned varint length, per the libp2p WebRTC spec.  Both sides chunk
writes at MAX_PAYLOAD_SIZE so the full framed wire payload (uvarint +
protobuf overhead + chunk) stays within the 16 KiB SCTP cap.

- _varint.py: shared encode_uvarint / decode_uvarint, used by both
  data-channel framing and signaling (de-dups signaling._encode_uvarint).
- stream.py: _send_message frames, on_data parses uvarint then protobuf,
  warns on truncated frames and on trailing bytes after the framed proto.
- constants.MAX_PAYLOAD_SIZE = 16_368, conservative chunk size leaving
  margin for the inner protobuf field tag + uvarint + outer prefix.

Without this change the connection wrote bare protobufs with no length
prefix and routinely produced 16388-byte SCTP messages, violating the
16 KiB ceiling and making the wire format incompatible with the
js-libp2p and go-libp2p implementations.

Refs libp2p#546
App-level streams now use in-band data channels (drop `negotiated=True,
id=channel_id` from createDataChannel).  The peer's `datachannel` event
fires for each new channel — previously WebRTCConnection.accept_stream()
would block forever because pre-negotiated channels skip that event.

Disjoint local ID spaces: outbound channels get even libp2p IDs starting
at 2; inbound channels get odd IDs starting at 1.  These are local
routing keys only; SCTP allocates wire IDs via DCEP.

`_send_on_channel` now awaits the channel reaching readyState='open'
before invoking ch.send(), with a 30s timeout.  Closes a real product
bug: any caller that wrote immediately after open_stream() would hit
aiortc's InvalidStateError because in-band channels have a non-trivial
window between createDataChannel and SCTP DCEP open.

`_on_datachannel` catches WebRTCStreamError from _allocate_inbound_id
(raised at max_concurrent_streams), closes the orphaned channel, and
logs.  Without this an over-limit channel would sit open with no
handlers and silently swallow remote writes.

Noise channel (id=0, negotiated=True) is unchanged — pre-negotiation is
required for handshake before SCTP DCEP can run.

Refs libp2p#546
aiortc >= 1.5 dropped `certificates=` from RTCConfiguration; passing it
now raises TypeError.  Going through the public attribute pc._certificates
silently fails to pin: aiortc actually reads self.__certificates inside
the class, which Python mangles to self._RTCPeerConnection__certificates
(see aiortc/rtcpeerconnection.py:295 and :1129).

Set the mangled attribute directly after construction, before
createOffer/createAnswer reads it for the SDP a=fingerprint line.

Without this fix the /webrtc-direct/certhash/<...> multiaddr advertised
the libp2p-generated cert hash, but DTLS handshakes used aiortc's
auto-generated cert — every real dial failed peer-verification with
'Remote DTLS fingerprint does not match certhash'.

Refs libp2p#546
test_multiplexing_loopback.py: real two-RTCPeerConnection echo test
exercising the framing, in-band channel, and cert-pin fixes end-to-end
through aiortc.  Three cases:

  - test_cert_pinning_lands_in_sdp_fingerprint: asserts the pinned
    cert's SHA-256 fingerprint appears in pc.localDescription.sdp.
    Canary against the class of bug where pinning is a no-op (e.g.
    writing to the un-mangled pc._certificates that aiortc never reads)
    — the echo path itself would still pass without this assertion.

  - test_open_accept_echo_large_payload: open a stream on one PC,
    accept on the other, round-trip a payload > MAX_PAYLOAD_SIZE
    both directions.  Fails closed if chunking regresses to
    MAX_MESSAGE_SIZE and pushes the wire over 16 KiB, or if anyone
    drops the uvarint prefix.

  - test_multiple_concurrent_streams: validates the disjoint
    outbound/inbound id-space split introduced with in-band channels.

newsfragments/546.feature.rst: reframe per acul71's merge gates —
this PR lands experimental v1 node-to-node scaffolding.  Explicit
out-of-scope list (browser dial, go/js interop, private /webrtc dial,
inbound handler wiring) so reviewers and users can't mistake it for
production-ready.

Refs libp2p#546
@yashksaini-coder

Copy link
Copy Markdown
Contributor Author

@acul71 — pushed the four critical-path commits per your framing, each as its own commit so they're easy to review independently:

Commit Subject
c04b4894 feat(webrtc): spec-compliant uvarint length-prefixed data-channel framing
397072de fix(webrtc): in-band data channels with DCEP open wait
ca8331ff fix(webrtc): pin DTLS certificate via aiortc's mangled private slot
6a9d408e test+docs(webrtc): two-PC loopback + experimental v1 newsfragment

A few worth flagging:

  • Cert pin name-mangling. A pre-push code review caught that the obvious pc._certificates = [rtc_cert] is a silent no-op — aiortc reads only self.__certificates, which mangles to pc._RTCPeerConnection__certificates (see aiortc/rtcpeerconnection.py:295,1129). Without this every real WebRTC Direct dial would have failed certhash verification while the loopback echo would still pass. The mangled-slot pin is in place and test_cert_pinning_lands_in_sdp_fingerprint asserts the cert's fingerprint actually appears in pc.localDescription.sdp as a canary.
  • In-band channels needed a DCEP open wait. Once negotiated=True is dropped, there's a real window between createDataChannel and the SCTP DCEP open event where ch.send() raises InvalidStateError. _send_on_channel now awaits the channel reaching readyState == "open" (30s timeout, no more indefinite hang) before invoking send, and _on_datachannel catches _allocate_inbound_id raising at max_concurrent_streams and closes the orphaned channel instead of letting it sit half-dead.
  • Newsfragment + PR description are reframed to the experimental v1 framing — explicit "out of scope" list covers browser dial, go/js interop, private /webrtc dial, and full inbound handler wiring.
  • Spike for aiortc STUN USERNAME exposure — I'll file the sub-issue under Add Support for WebRTC set-up in py-libp2p #546 shortly (it'll cover the 3 capabilities: inbound STUN parsing, server-side credential override, and ICE mux). Tracking that separately keeps it from blocking this PR.
  • CI was green on every commit individually pre-push (locally 153 / 153 webrtc tests, ruff clean, pyrefly clean on the non-excluded modules).

Ready for re-review when you have a moment. Happy to hop on a call if the STUN listener spike needs a deeper dive.

@acul71

acul71 commented Jun 26, 2026

Copy link
Copy Markdown
Contributor

@yashksaini-coder — re-reviewed after the four commits (c04b48946a9d408e). Good to merge as experimental v1 scaffolding once one small fix is in.

✅ Satisfied (June 23 merge gates)

🔧 One nit before merge

tests/core/transport/webrtc/test_webrtc_direct_loopback.py — the module docstring still claims the full lifecycle (HTTP SDP → ICE → DTLS → Noise → stream echo). The file only tests listener multiaddr advertisement, port binding, and cert attachment. Please update the docstring to match what the tests actually do (or add a transport-level dial→echo test in a follow-up and say so explicitly in the docstring).

Suggested docstring:

"""
Integration tests for WebRTC Direct listener setup (aiortc required).

Verifies that a listener advertises a multiaddr with /certhash/ and /p2p/,
binds a real UDP port when port 0 is requested, and uses an aiortc-native
certificate. Does not exercise dial(), Noise, or stream echo — see
test_multiplexing_loopback.py for data-channel layer loopback and add a
transport-level loopback test in a follow-up.
"""

Scope on merge (for the record)

Merging as experimental behind libp2p[webrtc] + enable_webrtc:

Ping me when the docstring is pushed and I'll merge.

A fuller AI-assisted review is in the comment below for reference.

@acul71

acul71 commented Jun 26, 2026

Copy link
Copy Markdown
Contributor
AI PR Review #4 (2026-06-26, branch webrtc @ 6a9d408e) — click to expand

AI PR Review: #1309

Reviewed branch: webrtc @ 6a9d408e82a8b543225aa713f3bc02039849bd85
Date: 2026-06-26
Prior review: AI-PR-REVIEW-1309-3 (2026-06-23 @ a0dd62e8)


1. Summary of Changes

This pull request adds experimental WebRTC transport scaffolding (libp2p.transport.webrtc) for WebRTC Direct (/webrtc-direct) and relay-signaled private WebRTC (/webrtc), enabled via the optional libp2p[webrtc] extra (aiortc) and enable_webrtc on new_swarm / new_host.

Since review #3 (four commits, 2026-06-26):

Commit Change
c04b4894 Spec-compliant uvarint length-prefixed data-channel framing (_varint.py, _frame() in stream.py)
397072de In-band application data channels + DCEP open wait (30s timeout)
ca8331ff DTLS cert pinning via aiortc's _RTCPeerConnection__certificates mangled slot
6a9d408e test_multiplexing_loopback.py (two-PC echo) + experimental v1 newsfragments/546.feature.rst

Issue linkage: Refs #546. This PR substantially advances #546 as experimental v1 node-to-node scaffolding but does not satisfy full acceptance criteria (go/js interop, browser dial, private /webrtc dial, examples, STUN-based inbound listener).

Breaking changes: None. New optional ITransport.provides_native_muxing defaults to False (backward-compatible).


2. Branch Sync Status and Merge Conflicts

  • 0 commits behind, 38 commits ahead of origin/main
  • No merge conflicts detected

3. Strengths

  • Critical-path fixes landed (framing, in-band channels, cert pinning, loopback tests)
  • Spec-aligned structure (framing, ICE_DONE, Noise prologue, SDP seam)
  • Concurrency safety (Trio/asyncio bridge)
  • Outbound dial() wired end-to-end
  • Newsfragment + PR body accurately describe experimental scope
  • 153 WebRTC tests; full CI green

4. Issues Found

Critical

None on reviewed HEAD. Review #3 Critical items (bare protobuf framing, negotiated=True app channels, silent cert-pin no-op) resolved in c04b4894ca8331ff.

Major

  • listener.py (219–248): Inbound path incomplete — placeholder fingerprint, no Noise/handler wiring. Follow-up on STUN path (spike: aiortc ICE mux / STUN USERNAME exposure for webrtc-direct v2 listener #1352); do not extend HTTP /sdp.
  • _aiortc_helpers.py (391–397): HTTP SDP server — unbounded Content-Length (availability risk).
  • test_webrtc_direct_loopback.py (docstring): Overstates coverage — blocking nit before merge (see comment above).
  • _aiortc_helpers.py (74–76): Cert pinning via aiortc private mangled slot — fragile but canary-tested.
  • private_transport.py: dial() still NotImplementedError.
  • pyproject.toml: pyrefly excludes on WebRTC modules — acceptable for experimental release.

Minor

  • HTTP /sdp non-interop harness; 6 tests skipped without aiortc locally; noisy multiaddr debug in docs build.

5. Security Review

  • Inbound placeholder fingerprint: latent (path not executed)
  • HTTP unbounded body: medium availability risk
  • aiortc private slot: low–medium maintenance risk
  • Outbound certhash verification: positive

6. Documentation and Examples

  • Sphinx apidoc present; newsfragment accurate
  • Gaps: no examples/webrtc/, install guide for libp2p[webrtc] + libsrtp2

7. Newsfragment Requirement

newsfragments/546.feature.rst — accurate, experimental scope. Refs #546 appropriate.


8. Tests and Validation

Step Result
make lint PASS
make typecheck PASS
make test 3035 passed, 22 skipped
WebRTC suite (with aiortc) 153 passed
make linux-docs PASS
GitHub CI All green

9. Recommendations

  1. Merge as experimental after docstring fix (maintainer decision)
  2. STUN spike: spike: aiortc ICE mux / STUN USERNAME exposure for webrtc-direct v2 listener #1352 ✅ filed
  3. Inbound listener on STUN path — follow-up
  4. Fix or extend test_webrtc_direct_loopback.py
  5. Harden HTTP SDP server (optional follow-up)
  6. Add examples / install docs before Add Support for WebRTC set-up in py-libp2p #546 substantially complete

10. Questions for the Author

  1. Transport-level loopback test — this PR or follow-up?
  2. STUN spike — ✅ spike: aiortc ICE mux / STUN USERNAME exposure for webrtc-direct v2 listener #1352 filed
  3. HTTP harness removal timeline?
  4. Migrate off mangled cert slot if aiortc adds public API?

11. Overall Assessment

Quality Good (experimental v1 scaffolding)
Security Low–Medium
Merge readiness Ready after docstring nit (experimental)
Confidence High on wire fixes; Medium on long-term aiortc internals

Delta from review #3

Item Review #3 Review #4
Uvarint framing ❌ Critical ✅ Fixed
In-band channels ❌ Critical ✅ Fixed
Cert pinning ❌ Critical ✅ Fixed
Loopback test ❌ Absent ✅ Present
Newsfragment ❌ Stale ✅ Updated
Inbound listener ❌ follow-up
Merge readiness Needs fixes Ready after docstring

AI-assisted review using project prompt py-libp2p-pr-review-prompt.md. Local logs: downloads/AI-PR-REVIEWS/1309/.

Per acul71's pre-merge review nit on libp2p#1309: the previous docstring
claimed full lifecycle coverage (HTTP SDP -> ICE -> DTLS -> Noise ->
stream echo), but this file only verifies listener setup — multiaddr
advertisement with /certhash/ + /p2p/, real-port binding on port 0,
and aiortc-native certificate attachment.

The data-channel layer loopback now lives in test_multiplexing_loopback.py;
a transport-level dial->Noise->echo test is a documented follow-up.

Refs libp2p#546.
@yashksaini-coder

Copy link
Copy Markdown
Contributor Author

@acul71 — docstring nit fixed in ed645324. Used your suggested wording verbatim; webrtc suite still green (153 / 153, ruff + format clean).

Quick acks on the AI Review #4 follow-ups so they don't get lost:

  • HTTP /sdp unbounded Content-Length — agreed it's an availability risk; will address in the listener-rework follow-up tracked alongside spike: aiortc ICE mux / STUN USERNAME exposure for webrtc-direct v2 listener #1352 (the STUN listener replaces the HTTP harness wholesale, so a one-off cap on the current path would be throwaway work).
  • Transport-level dial → Noise → echo test — happy to land that as the next follow-up once spike: aiortc ICE mux / STUN USERNAME exposure for webrtc-direct v2 listener #1352's spike clarifies whether the inbound STUN path is feasible. Will also remove the explicit "follow-up" reference from test_webrtc_direct_loopback.py's docstring then.
  • Cert pin via _RTCPeerConnection__certificates — keeping an eye on aiortc upstream; will migrate the moment they add a public setter (and the canary test will fail loudly if they ever rename the mangled slot).

Ready to merge whenever you are.

@yashksaini-coder

yashksaini-coder commented Jun 28, 2026

Copy link
Copy Markdown
Contributor Author

Tracked the HTTP /sdp hardening follow-up as #1354 (sub-issue of #546) so it's not lost when the thread closes — covers the body-size + header-loop bounds from AI Review #4 Major #2. @acul71 can we have review on this pr and get it ready for final merge

@acul71

acul71 commented Jun 28, 2026

Copy link
Copy Markdown
Contributor

@yashksaini-coder @seetadev
Merging this as we can have experimental WebRTC , and follow-up issues to complete the protocol, are already in place

@acul71 acul71 left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Merging this as experimental WebRTC scaffolding behind libp2p[webrtc] + enable_webrtc. The critical-path fixes from the re-review (uvarint framing, in-band channels + DCEP wait, cert pinning, loopback tests, docstring/newsfragment accuracy) are in place, and CI is green.

#546 stays open (Refs #546 — do not auto-close on merge). This PR advances the parent issue but does not satisfy full acceptance criteria (go/js interop, browser dial, private /webrtc dial, STUN-based inbound listener, examples).

Follow-up issues are already tracked:

Issue Scope
#546 Parent: full WebRTC transport (interop, NAT, docs, examples)
#1352 Spike: aiortc ICE mux / STUN USERNAME exposure for spec-aligned v2 inbound listener
#1354 Harden: bound Content-Length and header reads on HTTP /sdp dev harness
#1355 Refactor: deduplicate _varint.py — reuse libp2p.utils.varint
#1356 Deps: bump py-multiaddr to ≥0.2.0 (blocked by p2pclient multiaddr<0.1.0 cap); enables removing WebRTC multiaddr registration workaround

Explicitly out of scope for this merge (no dedicated sub-issues yet, or covered under #546 / #1352):

  • Private /webrtc dial (WebRTCPrivateTransport.dial() still NotImplementedError)
  • Transport-level dial → Noise → stream echo test (after #1352 spike)
  • go-libp2p / js-libp2p WebRTC Direct interop
  • Browser dial (v2 / libp2p/specs#715)

LGTM for merge as experimental v1 node-to-node scaffolding.

@acul71 acul71 merged commit 39a3b6c into libp2p:main Jun 28, 2026
38 checks passed
@acul71 acul71 deleted the webrtc branch June 28, 2026 16:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants