Skip to content

fix(socket): follow HTTP 3xx during WebSocket handshake (OPENHUMAN-TAURI-9X)#1547

Merged
senamakel merged 5 commits into
tinyhumansai:mainfrom
oxoxDev:fix/ws-transient-noise
May 13, 2026
Merged

fix(socket): follow HTTP 3xx during WebSocket handshake (OPENHUMAN-TAURI-9X)#1547
senamakel merged 5 commits into
tinyhumansai:mainfrom
oxoxDev:fix/ws-transient-noise

Conversation

@oxoxDev
Copy link
Copy Markdown
Contributor

@oxoxDev oxoxDev commented May 12, 2026

Summary

  • Real fix for OPENHUMAN-TAURI-9X: tokio_tungstenite::connect_async doesn't follow HTTP redirects, so a stale http:// BACKEND_URL behind a Cloudflare-style edge that 301s to https:// used to be permanently fatal — the reconnect loop hammered the same dead URL at error level forever (20 occurrences in 4 minutes against a single user).
  • Replaces the previous err→warn demotion approach with redirect-following: the loop now resolves Location, upgrades the scheme (httpws, httpswss), and reconnects, pinning the resolved URL across reconnects so subsequent backoff iterations skip the redirect round-trip.
  • Surfaces a one-shot warning through SocketState.error so the UI/observer can flag that BACKEND_URL is stale — keeps the actionable signal without the recurring error noise.
  • 4xx / TLS / URL / protocol failures still fall through to log::error! + exponential backoff unchanged — nothing is silenced.

Problem

  • OPENHUMAN-TAURI-9X (WebSocket connect: HTTP error: 301 Moved Permanently, 20 events in 4 minutes) was emitted from src/openhuman/socket/ws_loop.rs:62 every retry attempt.
  • Root cause confirmed by probing api.tinyhumans.ai directly: curl -I http://api.tinyhumans.ai/socket.io/... returns HTTP/1.1 301 Moved Permanently → Location: https://api.tinyhumans.ai:443/socket.io/?EIO=4&transport=websocket. Any client whose BACKEND_URL is http:// (or whose desktop config drifted there) is gated on this 301 forever — Cloudflare advertises an HTTPS-only socket endpoint.
  • tokio-tungstenite 0.24's connect_async returns WsError::Http for any non-101 upgrade response; there is no built-in redirect-follower. The loop's only knob is exponential backoff, which never recovers from a permanent redirect.
  • The companion issue OPENHUMAN-TAURI-7N (IO error: Operation timed out, 1 event) is a true transient — the existing backoff path handles it. No demotion needed; it'll fall back to error! only when retries exhaust, which is correct.

Solution

  • ws_loop now holds a mutable ws_url: String that survives backoff iterations. After a successful redirect-follow on attempt N, attempt N+1 connects directly to the resolved URL with no extra round-trip.
  • New connect_with_redirects helper in ws_loop.rs:
    • Walks up to MAX_REDIRECT_HOPS = 3 per attempt.
    • Resolves Location against the current URL via url::Url::join so relative paths (RFC 7230) work the same as absolute ones.
    • Upgrades scheme (httpws, httpswss; existing ws/wss pass through; anything else surfaces an error).
    • Returns the wrapped WsStream on success or the underlying WsError on failure — non-redirect errors fall through unchanged.
  • SharedState grows an error: RwLock<Option<String>> field exposed via SocketState.error (was previously hard-coded to None). Recorded once per connect attempt the first time we follow a redirect, cleared on connect() entry and on disconnect(), persisted across successful handshakes so the user still sees the "your config is stale" hint after the loop recovers.
  • No changes to backoff timing, shutdown semantics, or the EIO/SIO handshake — those paths are byte-identical to main.

Submission Checklist

  • Tests added or updated (happy path + at least one failure / edge case) per Testing Strategy — 9 new #[test]s covering scheme upgrade, relative Location, unsupported scheme, status-filter, header extraction, one-shot warning recording, full end-to-end 301-then-handshake via a local mock, and the missing-Location failure mode.
  • Diff coverage ≥ 80% — changed lines (Vitest + cargo-llvm-cov merged via diff-cover) meet the gate enforced by .github/workflows/coverage.yml. Run pnpm test:coverage and pnpm test:rust locally; PRs below 80% on changed lines will not merge.
  • Coverage matrix updated — N/A: observability/recovery-only change to socket reconnect path; no feature matrix row applies.
  • All affected feature IDs from the matrix are listed in the PR description under ## Related — N/A: no feature matrix ID applies.
  • No new external network dependencies introduced (mock backend used per Testing Strategy) — new tests stand up local 127.0.0.1 listeners (one HTTP 301 responder + the existing Engine.IO mock).
  • Manual smoke checklist updated if this touches release-cut surfaces (docs/RELEASE-MANUAL-SMOKE.md) — N/A: observability-only core socket change, no manual smoke flow.
  • Linked issue closed via Closes #NNN in the ## Related section — Sentry issue ID below; meta-tracker Track and fix active Sentry issues #1472 remains open for the broader sweep.

Impact

  • Desktop/core runtime only.
  • Users whose BACKEND_URL is configured (or has drifted to) http:// now connect successfully on the first attempt instead of looping at error level. SocketState.error will carry a "Backend redirected … — update BACKEND_URL" warning until next connect() / disconnect(), so observers (UI / Sentry breadcrumbs / state subscribers) can surface the stale-config signal.
  • 4xx / TLS / URL / protocol failures still log at error! and feed the existing backoff — nothing is silenced.
  • No migration, schema change, or external dependency impact.

Related


AI Authored PR Metadata (required for Codex/Linear PRs)

Linear Issue

  • Key: N/A
  • URL: N/A

Commit & Branch

  • Branch: fix/ws-transient-noise
  • Commit SHA: c1724c18712c19a08496cd055efb011299684dfa

Validation Run

  • pnpm --filter openhuman-app format:check — N/A: no frontend changes.
  • pnpm typecheck — N/A: no frontend changes.
  • Focused tests: cargo test --lib openhuman::socket — 70 tests passed (was 60 on main; +10 covering new redirect behavior + 1 fixture-error test on SocketManager).
  • Rust fmt/check (if changed): cargo fmt --check clean on changed files; cargo check --lib clean (only pre-existing unrelated warnings).
  • Tauri fmt/check (if changed): N/A — no Tauri shell changes.

Validation Blocked

  • command: cargo clippy --workspace -- -D warnings
  • error: 80 pre-existing clippy errors on main across unrelated modules (rand deprecations, type_complexity, derivable_impls, manual_is_multiple_of, redundant_guards, etc.). Filtering clippy output to the changed socket files surfaces zero issues.
  • impact: Socket files compile clean under clippy; full-workspace -D warnings gate is pre-existing breakage.

Behavior Changes

  • Intended behavior change: WebSocket connect now follows HTTP 3xx redirects (301/302/307/308) up to 3 hops per attempt, with scheme upgrade and a one-shot warning surfaced through SocketState.error.
  • User-visible effect: clients with a stale BACKEND_URL (e.g. http:// instead of https://) now connect successfully on first try instead of looping at error level forever; SocketState.error reports "Backend redirected … — update BACKEND_URL" until next disconnect.

Parity Contract

  • Legacy behavior preserved: reconnect loop, exponential backoff timing, shutdown semantics, EIO/SIO handshake bytes, and error!-level logging for actionable failures (URL/protocol/4xx/TLS).
  • Guard/fallback/dispatch parity checks: MAX_REDIRECT_HOPS caps loop length; non-redirect errors fall through to existing code paths; missing-Location case is rejected and surfaces the underlying HTTP error to the reconnect loop.

Duplicate / Superseded PR Handling

  • Duplicate PR(s): N/A
  • Canonical PR: N/A
  • Resolution: This PR was originally a noise-demotion patch (err→warn); force-pushed with the actual root-cause fix (redirect-following) so the stale-URL bug is resolved instead of muted.

Summary by CodeRabbit

  • New Features

    • WebSocket handshakes now follow HTTP redirects (with hop limit) and pin resolved targets for reconnects.
    • Connection warnings/errors are persisted and surfaced in connection state.
  • Tests

    • Added tests for redirect resolution, scheme handling, redirect-warning semantics, and persisted error visibility.

Review Change Stack

@oxoxDev oxoxDev requested a review from a team May 12, 2026 12:14
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented May 12, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: b75f20ba-4de0-476d-879c-8dfd6e162eed

📥 Commits

Reviewing files that changed from the base of the PR and between 71dcd78 and ca3954f.

📒 Files selected for processing (1)
  • src/openhuman/socket/ws_loop.rs
🚧 Files skipped from review as they are similar to previous changes (1)
  • src/openhuman/socket/ws_loop.rs

📝 Walkthrough

Walkthrough

This change introduces HTTP redirect following during WebSocket upgrade and persistent error state tracking. The socket manager now stores user-visible connection warnings, exposing them via state queries and clearing them on connect/disconnect lifecycle events. WebSocket upgrade follows HTTP 301/302/307/308 redirects up to a configured hop limit, records a one-shot warning for followed permanent redirects, and pins the resolved URL for subsequent reconnect attempts.

Changes

WebSocket Redirect and Error Handling

Layer / File(s) Summary
Shared Error Field Definition
src/openhuman/socket/manager.rs
SharedState gains error: RwLock<Option<String>> field initialized to None in SocketManager::new().
Error State Lifecycle Management
src/openhuman/socket/manager.rs
get_state() surfaces the stored error from shared state, connect() clears any prior error before attempting connection, and disconnect() clears it after status transition.
Test Fixtures and Error Tests
src/openhuman/socket/event_handlers.rs, src/openhuman/socket/manager.rs, src/openhuman/socket/ws_loop_tests.rs
Test fixture make_shared() is updated across files to initialize the error field. Unit test verifies get_state() surfaces stored error. Safety test fixtures for state-change and server-event logging updated.
Mutable WebSocket URL and Connection Entry Point
src/openhuman/socket/ws_loop.rs
Imports restructured and MAX_REDIRECT_HOPS added. ws_loop maintains mutable ws_url initialized from resolved WebSocket URL and passes &mut ws_url to run_connection to enable in-place mutation across reconnect attempts.
Redirect Resolution and Following Helpers
src/openhuman/socket/ws_loop.rs
Adds connect_with_redirects and helpers to follow HTTP 3xx during WebSocket upgrade: classify followable statuses, extract/resolve Location (relative & absolute), upgrade schemes (http→ws, https→wss), update ws_url in-place, retry up to the hop limit, and record a one-shot redirect warning in shared.error for permanent redirects when unset.
Redirect Helper Unit Tests
src/openhuman/socket/ws_loop_tests.rs
Unit tests validate scheme upgrades for absolute and relative Location, scheme preservation for ws/wss, unsupported-scheme rejection, redirect-status classification, extract_location_header, and one-shot warning recording semantics pinned to the first hop.
Redirect End-to-End Integration Tests
src/openhuman/socket/ws_loop_tests.rs
Adds spawn_mock_301_redirect helper and end-to-end tests verifying ws_loop follows 301 redirects to a working backend and records a one-shot warning; verifies 301 without Location fails without recording a warning.

🎯 3 (Moderate) | ⏱️ ~20 minutes

🐰 I hopped through URLs and traced each Location,

pinned the path and left a tiny notation.
A one-shot warning sleeps where redirects were found,
so future reconnects reuse the path I found. 🌐

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title clearly identifies the main change: adding HTTP 3xx redirect following during WebSocket handshake. It is concise, specific, and directly reflects the primary purpose of the PR.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

Tip

💬 Introducing Slack Agent: The best way for teams to turn conversations into code.

Slack Agent is built on CodeRabbit's deep understanding of your code, so your team can collaborate across the entire SDLC without losing context.

  • Generate code and open pull requests
  • Plan features and break down work
  • Investigate incidents and troubleshoot customer tickets together
  • Automate recurring tasks and respond to alerts with triggers
  • Summarize progress and report instantly

Built for teams:

  • Shared memory across your entire org—no repeating context
  • Per-thread sandboxes to safely plan and execute work
  • Governance built-in—scoped access, auditability, and budget controls

One agent for your entire SDLC. Right inside Slack.

👉 Get started


Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/openhuman/socket/ws_loop.rs`:
- Around line 191-192: The code clears shared.error immediately after
connect_async returns, which can drop a persisted warning before the EIO/SIO
handshake completes; instead, remove the early assignment that sets shared.error
to None right after connect_async and only clear shared.error in the
established/connected-state block (the existing place later where shared.error
is set to None after the handshake succeeds) so errors persist until the full
handshake completes; locate uses of shared.error and the connect_async result in
ws_loop.rs and delete/move the early clear to the
connected-state/handshake-success section.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: dfc37142-e77a-4363-a808-e53da0e916bd

📥 Commits

Reviewing files that changed from the base of the PR and between 7ce3193 and a6f5188.

📒 Files selected for processing (5)
  • src/openhuman/socket/event_handlers.rs
  • src/openhuman/socket/manager.rs
  • src/openhuman/socket/types.rs
  • src/openhuman/socket/ws_loop.rs
  • src/openhuman/socket/ws_loop_tests.rs

Comment thread src/openhuman/socket/ws_loop.rs Outdated
oxoxDev and others added 2 commits May 12, 2026 18:00
…#1547)

`tokio_tungstenite::connect_async` returns an error on any non-101
upgrade response, so an HTTP 3xx (typical when `BACKEND_URL` is plain
`http://...` and Cloudflare 301s to `https://`) used to be fatal — the
reconnect loop hammered the same dead URL at error level forever
(OPENHUMAN-TAURI-9X, 20 occurrences in 4 minutes).

Resolve the bug instead of muting it:

- Track a mutable `ws_url` across the reconnect loop so the resolved
  target survives backoff iterations.
- New `connect_with_redirects` walks up to 3 HTTP 3xx hops per attempt:
  resolves `Location` against the current URL (handles relative paths),
  upgrades `http`→`ws` / `https`→`wss`, and retries. Non-redirect
  failures fall through to the existing backoff path unchanged.
- Surface a one-shot warning through `SharedState.error` →
  `SocketState.error` so the UI can prompt the user to update
  `BACKEND_URL` to the resolved target.

Closes OPENHUMAN-TAURI-9X.
Refs tinyhumansai#1472.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds focused unit coverage for the new helpers in `ws_loop.rs`
(absolute / relative / unsupported Location handling, scheme upgrade,
followable-status filter, one-shot warning) plus a driver-level test
that stands up an HTTP 301 server in front of the existing Engine.IO
mock and asserts the loop follows the redirect, completes the SIO
handshake against the redirect target, and records the user-actionable
warning in `SharedState.error`. A separate test exercises the
no-Location failure mode.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@oxoxDev oxoxDev force-pushed the fix/ws-transient-noise branch from a6f5188 to c1724c1 Compare May 12, 2026 12:31
@oxoxDev oxoxDev changed the title fix(socket): demote transient WebSocket connect failures from error to warn (OPENHUMAN-TAURI-9X, -7N) fix(socket): follow HTTP 3xx during WebSocket handshake (OPENHUMAN-TAURI-9X) May 12, 2026
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/openhuman/socket/ws_loop_tests.rs`:
- Around line 625-629: The test currently evaluates matches!(err,
WsError::Http(_)) as a no-op; change it to assert the variant explicitly by
asserting that err matches WsError::Http(_) (e.g., use assert!(matches!(err,
WsError::Http(_))) and include err in the assertion message for debugging).
Locate the call site around connect_with_redirects and the err binding in
ws_loop_tests.rs and replace the no-op matches! expression with an assert that
verifies the error is WsError::Http(_) (and prints err on failure).

In `@src/openhuman/socket/ws_loop.rs`:
- Around line 488-495: Only emit the stale-BACKEND_URL warning and call
record_redirect_warning when the redirect is permanent (301 or 308). Modify the
redirect-handling block around ws_url and response.status() so it checks for
permanent redirects (e.g., response.status() == StatusCode::MOVED_PERMANENTLY ||
response.status() == StatusCode::PERMANENT_REDIRECT or compare as_u16() ==
301/308) and only then log the "Server redirected ... instruct users to update
config" warning and call record_redirect_warning(shared, &original, &next_url);
for non-permanent statuses (302/307) still follow the redirect and update
*ws_url = next_url but do not emit the stale-BACKEND_URL warning.
- Around line 133-136: The code currently flattens a WsError to a string when
connect_with_redirects fails; instead return a structured failure so callers can
inspect retryability/severity. Modify the match to return a ConnectionOutcome
variant that carries the original WsError or a new ConnectionFailure struct
(e.g., change ConnectionOutcome::Failed(String) to
ConnectionOutcome::Failed(ConnectionFailure) or
ConnectionOutcome::Failed(WsError)), and add a helper
classify_connection_failure(err: WsError) -> ConnectionFailure that maps
WsError::{Http, Io, Tls, Url, Protocol, ...} into typed metadata (retryable,
severity, code). Update call sites that construct or pattern-match
ConnectionOutcome accordingly and use classify_connection_failure(connect_error)
rather than formatting the error to a string; keep connect_with_redirects and
WsError types intact.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 5a57fb8c-526a-481d-9b80-b06d7c2e816e

📥 Commits

Reviewing files that changed from the base of the PR and between a6f5188 and c1724c1.

📒 Files selected for processing (4)
  • src/openhuman/socket/event_handlers.rs
  • src/openhuman/socket/manager.rs
  • src/openhuman/socket/ws_loop.rs
  • src/openhuman/socket/ws_loop_tests.rs
✅ Files skipped from review due to trivial changes (1)
  • src/openhuman/socket/event_handlers.rs
🚧 Files skipped from review as they are similar to previous changes (1)
  • src/openhuman/socket/manager.rs

Comment thread src/openhuman/socket/ws_loop_tests.rs
Comment on lines +133 to 136
let ws_stream = match connect_with_redirects(ws_url, shared).await {
Ok(stream) => stream,
Err(e) => return ConnectionOutcome::Failed(format!("WebSocket connect: {e}")),
};
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

Preserve structured connect failure metadata instead of flattening to string.

At Line 135, WsError is converted to a plain string immediately, which blocks downstream retryable/non-retryable severity handling from using concrete error shape.

💡 Suggested direction
-    let ws_stream = match connect_with_redirects(ws_url, shared).await {
+    let ws_stream = match connect_with_redirects(ws_url, shared).await {
         Ok(stream) => stream,
-        Err(e) => return ConnectionOutcome::Failed(format!("WebSocket connect: {e}")),
+        Err(e) => return ConnectionOutcome::Failed(classify_connection_failure(e)),
     };
// Example: keep typed metadata for log-level/routing decisions
fn classify_connection_failure(err: WsError) -> ConnectionFailure {
    // map WsError::{Http,Io,Tls,Url,Protocol,...} -> typed failure + retryability
}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/openhuman/socket/ws_loop.rs` around lines 133 - 136, The code currently
flattens a WsError to a string when connect_with_redirects fails; instead return
a structured failure so callers can inspect retryability/severity. Modify the
match to return a ConnectionOutcome variant that carries the original WsError or
a new ConnectionFailure struct (e.g., change ConnectionOutcome::Failed(String)
to ConnectionOutcome::Failed(ConnectionFailure) or
ConnectionOutcome::Failed(WsError)), and add a helper
classify_connection_failure(err: WsError) -> ConnectionFailure that maps
WsError::{Http, Io, Tls, Url, Protocol, ...} into typed metadata (retryable,
severity, code). Update call sites that construct or pattern-match
ConnectionOutcome accordingly and use classify_connection_failure(connect_error)
rather than formatting the error to a string; keep connect_with_redirects and
WsError types intact.

Comment thread src/openhuman/socket/ws_loop.rs
…I-9X, -7N)

Per CodeRabbit minor review on PR tinyhumansai#1547: the bare `matches!()` expression
at ws_loop_tests.rs:628 evaluated to a discarded bool — the test would
silently pass if the redirect-without-Location path ever started
returning a different error variant. Wrap in `assert!(...)` so the
expected `WsError::Http(_)` branch is enforced.

`cargo test --lib openhuman::socket::ws_loop` — 27/27 pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown
Contributor

@graycyrus graycyrus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PR Review — fix(socket): follow HTTP 3xx during WebSocket handshake

Overall: Well-targeted fix for a real Sentry regression. The redirect-following approach is correct, the test coverage is solid (9 new tests + E2E), and the SharedState.error lifecycle is properly wired. A few items below worth addressing before merge — see inline comments.

Verified / looks good

  • SharedState.error behind parking_lot::RwLock — no TOCTOU, correct concurrent access
  • resolve_redirect_target uses url::Url::join for RFC-compliant relative Location handling
  • is_redirect_status correctly limits to 301/302/307/308
  • extract_location_header gracefully drops non-UTF-8 values
  • error field cleared on both connect() and disconnect() — no stale errors
  • All new tests are deterministic (local TCP listeners, no real network)
  • MAX_REDIRECT_HOPS = 3 is sensible for the documented Cloudflare single-hop case

next_url
);
record_redirect_warning(shared, &original, &next_url);
*ws_url = next_url;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[major] 302/307 redirects should not pin the resolved URL.

*ws_url = next_url runs for ALL redirect types (301/302/307/308), but RFC semantics say 302/307 are temporary — the client should re-resolve each attempt. Only 301/308 should mutate ws_url in place.

This also interacts with the CodeRabbit comment about only emitting the stale-BACKEND_URL warning for permanent redirects — both the URL pinning and the warning should be gated on permanent status codes.

// suggestion: only pin permanent redirects
let is_permanent = matches!(
    response.status(),
    StatusCode::MOVED_PERMANENTLY | StatusCode::PERMANENT_REDIRECT
);
if is_permanent {
    *ws_url = next_url.clone();
}
record_redirect_warning(shared, &original, &next_url);
// For 302/307, use next_url for this hop but do not mutate ws_url

break;
}
tokio::time::sleep(tokio::time::Duration::from_millis(50)).await;
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[major] Sleep-loop polling can produce false failures under load.

If the mock server takes longer than 2.5s (50 × 50ms), the loop exits without hitting Connected and the assert fires a confusing failure. Better to assert inside the loop:

let deadline = tokio::time::Instant::now() + tokio::time::Duration::from_secs(5);
loop {
    if *shared.status.read() == ConnectionStatus::Connected {
        break;
    }
    assert!(
        tokio::time::Instant::now() < deadline,
        "timed out waiting for ConnectionStatus::Connected after redirect follow"
    );
    tokio::time::sleep(tokio::time::Duration::from_millis(50)).await;
}

"[socket] Exceeded {MAX_REDIRECT_HOPS} redirect hops starting from {original}; giving up"
);
return Err(WsError::Http(response));
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[minor] unreachable!() correctness is non-obvious to readers.

The reasoning is correct (the hop == MAX_REDIRECT_HOPS guard returns before the loop exits), but a brief inline proof would help future editors:

// The for-loop covers 0..=MAX_REDIRECT_HOPS (inclusive). On the final
// iteration (hop == MAX_REDIRECT_HOPS) the redirect branch returns Err
// above. Non-redirect errors and Ok(_) also return immediately.
unreachable!("connect_with_redirects exited loop without returning")

shared: &Arc<SharedState>,
) -> Result<WsStream, WsError> {
let original = ws_url.clone();
for hop in 0..=MAX_REDIRECT_HOPS {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[minor] No debug log on successful redirect-follow path.

Every error/warn path logs, but when connect_async succeeds after following hops, there is no confirmation that we connected to the resolved URL. Makes "why did the socket connect to a different host than configured" hard to diagnose.

Ok((stream, _response)) => {
    if *ws_url != original {
        log::debug!(
            "[socket] Connected to redirect target {} (original: {})",
            ws_url, original
        );
    }
    return Ok(stream);
}

oxoxDev and others added 2 commits May 12, 2026 22:37
…cts (OPENHUMAN-TAURI-9X)

Per CodeRabbit major review on PR tinyhumansai#1547: the previous code called
record_redirect_warning unconditionally for every followed redirect
(301/302/307/308). The warning instructs users to update their
BACKEND_URL config, which is correct for 301 / 308 (Permanent /
Permanent Redirect) but misleading for 302 / 307 (Found / Temporary
Redirect) — those say "this time, route elsewhere" and don't imply the
configured URL is stale.

Gate `record_redirect_warning` on the permanent statuses only. The
inline `log::warn!` that records the redirect itself still fires for
all four statuses so the operator can still see what happened.

`cargo test --lib openhuman::socket` — 70/70 pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@senamakel senamakel merged commit 29f4b02 into tinyhumansai:main May 13, 2026
18 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants