Feature flag evaluation request has no retry on transient network errors

The feature flag evaluation request (`PostHog::FeatureFlagsPoller#_request`, used by `_request_feature_flag_evaluation` for `POST /flags/?v=2`) has no retry. On any transient network error it rescues and immediately re-raises, surfacing a hard error to the caller.

By contrast, the event-capture path (`PostHog::Transport`) already retries with backoff (`retry_with_backoff`). The flags path is the inconsistent one.

### Why this matters

The flags request:
- opens a **fresh connection per request** (`Net::HTTP.start { ... }`, no keep-alive),
- has a **3s default timeout** (`feature_flag_request_timeout_seconds`),
- and **does not retry**.

So a single transient stall anywhere in the network path (packet loss + TCP retransmit, TLS setup jitter, an edge/proxy hiccup) that exceeds the 3s budget becomes a customer-visible `Net::ReadTimeout`, even when the PostHog flags service itself is healthy and responding in single-digit milliseconds.

We investigated a report of intermittent `Net::ReadTimeout` bursts against `POST /flags/?v=2`. On the server side, the flags service was healthy throughout (fast 2xx responses, no deploy or pod churn during the window), and the gateway logged the customer's traffic as 2xx. The failures left no server-side trace, which is consistent with transient loss in the client → CDN → gateway path that a retry would have absorbed.

The error message the user sees comes from this path: `Unable to complete request to https://<host>/flags/?v=2`.

### Proposed fix

Add a bounded retry (e.g. one additional attempt after a short backoff) to `_request` (or just to the flag evaluation call) for the transient errors that are already rescued there:

- `Net::ReadTimeout`
- `Net::OpenTimeout` / `Timeout::Error`
- `Errno::ECONNRESET`
- `EOFError`

Flag evaluation is side-effect-free, so retrying is safe. The existing `BackoffPolicy` used by `Transport` could be reused.
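
A minimal sketch of what the bounded retry could look like. This is an illustration, not the library's code: `with_bounded_retry`, `RETRYABLE_ERRORS`, and the `max_retries`, `base_delay`, and `sleeper` parameters are hypothetical names, and the delay calculation is a plain exponential stand-in rather than the existing `BackoffPolicy`.

```ruby
require 'net/http'
require 'timeout'

# Hypothetical list for illustration; mirrors the errors named in this issue.
# Net::ReadTimeout and Net::OpenTimeout are subclasses of Timeout::Error,
# so listing them is redundant but keeps the intent explicit.
RETRYABLE_ERRORS = [
  Net::ReadTimeout,
  Net::OpenTimeout,
  Timeout::Error,
  Errno::ECONNRESET,
  EOFError
].freeze

# Hypothetical helper: runs the block, retrying up to `max_retries` times
# on a transient network error. `sleeper` is injectable so tests do not
# have to wait for real delays.
def with_bounded_retry(max_retries: 1, base_delay: 0.1, sleeper: ->(s) { sleep(s) })
  attempt = 0
  begin
    yield attempt
  rescue *RETRYABLE_ERRORS
    # Out of retries: re-raise the original error unchanged.
    raise if attempt >= max_retries

    attempt += 1
    # Exponential delay: base_delay, 2 * base_delay, 4 * base_delay, ...
    sleeper.call(base_delay * (2**(attempt - 1)))
    retry
  end
end
```

Assuming the HTTP call inside `_request` is the only thing wrapped, the existing rescue-and-re-raise behaviour would stay the same once the retries are exhausted. Note that each retry adds up to one more full timeout to the worst-case latency, so with the 3s default and one retry a caller could wait roughly twice as long before seeing the error.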

### Notes
- Still reproduces on `main` (no retry on `_request`). Recent versions did add `open_timeout`/`write_timeout`, but not retry.
- Affected user was on `3.5.4`; upgrading helps with the timeout settings but does not add retry.

Feature flag evaluation request has no retry on transient network errors #195