Skip to content

Fix/docker reconnect churn#234

Merged
yusing merged 10 commits into
mainfrom
fix/docker-reconnect-churn
May 25, 2026
Merged

Fix/docker reconnect churn#234
yusing merged 10 commits into
mainfrom
fix/docker-reconnect-churn

Conversation

@yusing

@yusing yusing commented May 24, 2026

Copy link
Copy Markdown
Owner

Summary by CodeRabbit

  • New Features

    • Bundled Cloudflare CIDR seed, weekly snapshot workflow, and an update script
    • Asynchronous cached Cloudflare CIDR refresh to avoid blocking requests
  • Bug Fixes

    • More robust Docker watcher reconnection and health-check handling
    • Hardened real-IP parsing/validation to prevent invalid overrides
    • Health monitor startup now respects cancellation during jitter delay
  • Tests

    • Expanded unit tests for Cloudflare CIDR logic, Docker watcher, and health checks
  • Chores

    • Updated ignore patterns and adjusted test Makefile targets

Review Change Stack

yusing added 5 commits May 23, 2026 22:26
Extract docker inspect behind an overridable helper for health check
tests and add coverage for stopped container states, missing
health data, and inspect failures.
Keep the post-check random delay cancellation-aware so obsolete
health monitors exit promptly during forced reloads.

Add coverage for cancelling during the jitter window and assert the
monitor stops before the next ticker cycle begins.
Refactor docker watcher reconnects to build a fresh client on each
retry and close the previous client before reconnecting.

Add coverage for closed stream handling, factory retry errors, and
reconnect-triggered reloads after event stream EOF.
Bundle a generated Cloudflare CIDR snapshot for cold starts and use it
to serve CF-Connecting-IP requests without blocking on remote fetches.

Add a CIDR generator and Makefile target, refresh cached ranges in the
background with retry backoff, and cover the new seeding and refresh
behavior in middleware tests.
@coderabbitai

coderabbitai Bot commented May 24, 2026

Copy link
Copy Markdown
Contributor

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 7bcfbdf7-5b1e-4240-af62-cb594e38591e

📥 Commits

Reviewing files that changed from the base of the PR and between 29cce78 and 27f9c6e.

📒 Files selected for processing (2)
  • .github/workflows/cloudflare-cidrs.yml
  • scripts/update-cloudflare-cidrs.sh

📝 Walkthrough

Walkthrough

This PR seeds and auto-updates Cloudflare CIDRs, refactors real-IP middleware to use a seeded cached CIDR set with async refresh and hardened parsing, extracts Docker health-check inspect handling into a helper, makes health monitor startup jitter cancellation-aware, refactors Docker watcher client/reconnect logic with injectable hooks, and adds tests and CI automation.

Changes

Cloudflare CIDR Generation and Real-IP Middleware Refactoring

Layer / File(s) Summary
Build system integration and CIDR seed embedding
.gitignore, Makefile, internal/net/gphttp/middleware/cloudflare_real_ip_seed.go, internal/net/gphttp/middleware/cloudflare_real_ip_seed.txt, .github/workflows/cloudflare-cidrs.yml, scripts/update-cloudflare-cidrs.sh
Adds embedded Cloudflare CIDR seed, updates .gitignore patterns, adjusts Makefile test/race targets, adds a workflow to refresh the seed weekly or on-demand, and a script to fetch and validate Cloudflare CIDR snapshots.
Real-IP middleware cache and async refresh
internal/net/gphttp/middleware/cloudflare_real_ip.go
Replaces synchronous fetch-with-mutex with a seeded cache and background refresh: freshness checks, atomic refresh guard, mutex-serialized update, IPv4/IPv6 fetch+parse helpers, clone-before-store, and retry backoff on failure.
Real-IP middleware IP validation and header parsing
internal/net/gphttp/middleware/real_ip.go
Tightens CIDR/IP checks (reject nil IPs), parses RemoteAddr host/port, validates host against trusted CIDRs before modifying, trims/normalizes header values, and preserves original port when applying the selected real IP.
Real-IP middleware comprehensive tests
internal/net/gphttp/middleware/cloudflare_real_ip_test.go
Tests for missing/invalid headers, cold-start with bundled CIDRs, background refresh without blocking, non-blocking behavior when refresh goroutine is stuck, failure backoff and retry, and parsing the embedded seed; includes helpers to reset global state.
Real-IP validation edge cases
internal/net/gphttp/middleware/real_ip_test.go
Tests ensure invalid RemoteAddr or invalid header values do not change the request and return OK.

Docker Health Checks and Monitor Jitter

Layer / File(s) Summary
Docker health check inspection helper
internal/health/check/docker.go
Extracts ContainerInspect interception/error handling and container.State extraction into unexported dockerHealthInspect, and integrates it into the Docker health check flow.
Docker health check tests
internal/health/check/docker_test.go
Table-driven and specific tests cover stopped states (unhealthy), healthy path with health output and latency, fallback when Docker health info missing, threshold/guard failures, and inspect-error propagation.
Health monitor cancellation-aware jitter delay
internal/health/monitor/monitor.go
Adds healthMonitorJitterDelay generator and replaces unconditional time.Sleep with a context-aware wait so monitor startup can be canceled promptly.
Health monitor jitter cancellation tests
internal/health/monitor/monitor_test.go
Verifies Start exits quickly when canceled during the jitter delay and that no extra periodic checks run after Finish.

Docker Watcher Client Refactoring

Layer / File(s) Summary
Docker watcher client factory and reconnection
internal/watcher/docker_watcher.go
Introduces newDockerWatcherClient and dockerWatcherCheckConnection hooks, uses factory in EventsWithOptions, guards deferred client close, and adds reconnectDockerWatcherClient implementing a retry loop with context cancellation, connectivity checks, and non-blocking error reporting.
Docker watcher reconnection tests
internal/watcher/docker_watcher_test.go
Tests for event-stream closure detection, reconnect client creation retries, factory-failure reporting via errCh, and end-to-end reconnection after events endpoint EOF using an httptest server.
Middleware test infrastructure update
internal/net/gphttp/middleware/test_utils_test.go
Adds configurable RemoteAddr in middleware test helpers to ensure accurate reverse-proxy remote address recording.

CI Automation for Cloudflare CIDR Updates

Layer / File(s) Summary
Cloudflare CIDR fetch and snapshot script
scripts/update-cloudflare-cidrs.sh
POSIX script that fetches Cloudflare IPv4/IPv6 CIDR lists, normalizes output, validates non-empty snapshot, and writes the seed file.
GitHub Actions CIDR update workflow
.github/workflows/cloudflare-cidrs.yml
Scheduled and manual workflow that runs the snapshot script, verifies parsing via a focused Go test, and commits/pushes the updated snapshot when changed.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Poem

🐰 I nibbled seeds of CIDR lines today,

Cached them close so requests don't fray,
Docker checks hum, monitors skip and wake,
Watchers reconnect without a break,
A floppy-eared cheer for tests that play!

🚥 Pre-merge checks | ✅ 3 | ❌ 2

❌ Failed checks (2 warnings)

Check name Status Explanation Resolution
Title check ⚠️ Warning The PR title 'Fix/docker reconnect churn' is only partially related to the changeset. While Docker reconnection is addressed in internal/watcher/docker_watcher.go, the changeset includes substantial unrelated changes to Cloudflare CIDR handling, health checks, monitoring, real IP middleware, and CI workflows that dominate the scope. Consider a more comprehensive title that reflects the primary changes, such as 'Refactor Docker reconnection, health checks, Cloudflare CIDR caching, and real IP middleware' or break this into smaller, focused PRs.
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch fix/docker-reconnect-churn

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 11

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In @.gitignore:
- Around line 39-40: The .gitignore currently ignores the generated Cloudflare
seed file causing mustLoadSeedCloudflareCIDRs() to panic when the embedded seed
is missing; either stop ignoring the generated file (remove the
internal/net/gphttp/middleware/cloudflare_real_ip_seed_gen.go entry) so the seed
is committed, or modify mustLoadSeedCloudflareCIDRs() in cloudflare_real_ip.go
to handle a missing/empty embedded seed without panicking (e.g., detect empty
data and return a safe default or a clear error) so builds remain deterministic;
refer to the symbols cloudflare_real_ip_seed_gen.go and
mustLoadSeedCloudflareCIDRs() to locate the code to change.

In `@cmd/cloudflare-cidrs-gen/main.go`:
- Around line 61-74: fetchCloudflareCIDRs currently returns on the first
fetchEndpoint error; change it to attempt both endpoints (cfIPv4CIDRsEndpoint
and cfIPv6CIDRsEndpoint), collect any errors keyed/labelled by endpoint (or
"ipv4"/"ipv6"), and only return a combined gperr.Multiline error if one or more
fetches failed; if both succeed concatenate their bodies with a single newline
between them (preserving the existing newline logic) and return the bytes. Use
the existing function names fetchCloudflareCIDRs and fetchEndpoint and construct
the multiline error via gperr.Multiline([]error{...}) or by building
per-endpoint error lines and passing them to gperr.Multiline so callers see both
IPv4 and IPv6 failure details.

In `@internal/health/check/docker_test.go`:
- Around line 15-88: Add a new unit test named
TestDockerHealthcheckReturnsHealthyWhenContainerHealthy that mirrors existing
tests: create a DockerHealthcheckState with client &docker.SharedClient{} and
containerID "test", save and restore dockerHealthInspect via
oldInspect/t.Cleanup, then set dockerHealthInspect to return
container.State{Status: "running", Health: &container.Health{Status:
container.Healthy, Log: []container.HealthcheckResult{{Start: time.Now(), End:
time.Now().Add(100*time.Millisecond), Output: "healthy"}}}} and nil error; call
Docker(context.Background(), state, time.Second) and assert no error,
result.Healthy is true, result.Detail equals "healthy", and result.Latency is
greater than zero. Ensure imports (time) match existing test file.
- Around line 15-88: Add a test that asserts Docker returns
ErrDockerHealthCheckFailedTooManyTimes when the state's numDockerFailures
exceeds dockerFailuresThreshold: create a
TestDockerHealthcheckReturnsErrorAfterThreshold that constructs a
DockerHealthcheckState with client &docker.SharedClient{}, containerID "test",
and numDockerFailures set to dockerFailuresThreshold + 1, then call
Docker(context.Background(), state, time.Second) and require the error is
ErrDockerHealthCheckFailedTooManyTimes; this exercises the early-fail branch in
Docker that checks numDockerFailures.
- Around line 18-26: The test's states table in docker_test.go only includes
"exited", "paused", and "created" but must also cover the other stopped statuses
handled by the production code ("dead", "restarting", "removing"); update the
states slice in the test to add entries for {name: "dead", status: "dead",
detail: "container is dead"}, {name: "restarting", status: "restarting", detail:
"container is restarting"} and {name: "removing", status: "removing", detail:
"container is removing"} so the test exercising the container status handling
(the states variable used in the test loop) verifies these cases as well.

In `@internal/health/check/docker.go`:
- Around line 28-42: The function dockerHealthInspect should wrap returned
errors with gperr to add context: replace the plain return of err when
httputils.AsRequestInterceptedError fails with a gperr-prepended error (e.g.,
gperr.PrependSubject(err, "dockerHealthInspect") or build a new gperr error
around err) and likewise replace errors.New("intercepted error is nil or data is
nil") with a gperr error that includes the same subject (e.g.,
gperr.PrependSubject(gperr.New("intercepted error is nil or data is nil"),
"dockerHealthInspect")); keep the existing checks around
httputils.AsRequestInterceptedError and the interceptedErr.Data cast but return
gperr-wrapped errors instead of plain errors.
- Line 41: The current return uses a direct type assertion
interceptedErr.Data.(container.State) which can panic; update the code where
interceptedErr is used (the return in the function that returns
(container.State, error)) to perform a safe comma-ok type assertion: attempt to
cast interceptedErr.Data to container.State using v, ok :=
interceptedErr.Data.(container.State) and if ok return v, nil, otherwise return
a zero-value container.State and an explanatory error (or wrap interceptedErr)
to avoid panic and surface the mismatch.

In `@internal/net/gphttp/middleware/cloudflare_real_ip.go`:
- Around line 200-212: fetchCloudflareCIDRs currently reads and parses resp.Body
without checking HTTP status, so non-200 responses become CIDR-parse errors;
update fetchCloudflareCIDRs to validate resp.StatusCode (e.g., require 200 OK)
after http.DefaultClient.Do(req) and before reading resp.Body, and return a
clear error (including status code and status text) when the status is not
success; keep parseCloudflareCIDRs unchanged and only call it after the status
check so parse errors reflect real payload issues.
- Around line 131-142: The refreshCloudflareCIDRs function currently spawns an
unmanaged goroutine; change it to register a subtask on the root task (use
task.RootTask().Subtask or create a parent := task.RootTask() and
parent.Subtask()) and run refreshCloudflareCIDRsLocked inside that subtask so it
participates in the task lifecycle; attach cleanup with subtask.OnFinished() (or
OnCancel()) to call cfCIDRsRefreshing.Store(false) and ensure
cfCIDRsMu.Lock()/Unlock() still surround refreshCloudflareCIDRsLocked as before,
and keep the CompareAndSwap check on cfCIDRsRefreshing to avoid duplicate runs.
- Around line 86-90: The middleware is mutating shared state by assigning cidrs
into cri.realIP.From before calling cri.realIP.before; instead, avoid writing to
the long-lived middleware instance by making a per-request copy of the RealIP
config and passing that to the handler. Concretely, call tryFetchCFCIDR(), and
if non-nil, create a local copy (e.g., local := cri.realIP; local.From = cidrs)
and then invoke local.before(w, r) (or adjust the call to pass the local copy)
so that cri.realIP is never modified on the shared middleware instance.

In `@internal/watcher/docker_watcher.go`:
- Around line 205-224: The reconnectDockerWatcherClient loop can block forever
when sending to errCh (used in EventsWithOptions) if the channel is unbuffered;
change the error send at the client creation failure site so it cannot block the
retry ticker—e.g., perform a non-blocking send or dispatch the error send in a
goroutine instead of a direct blocking write. Update the send at the location
that references errCh inside reconnectDockerWatcherClient (the branch after
newDockerWatcherClient returns an error) to use a select { case errCh <-
fmt.Errorf(...): default: } or go func() { errCh <- fmt.Errorf(...) }() so
retries continue even when errCh is not being drained.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 63edc3d4-3833-4bdd-a3af-332ce8a67699

📥 Commits

Reviewing files that changed from the base of the PR and between 50732b6 and 905a253.

📒 Files selected for processing (13)
  • .gitignore
  • Makefile
  • cmd/cloudflare-cidrs-gen/main.go
  • internal/health/check/docker.go
  • internal/health/check/docker_test.go
  • internal/health/monitor/monitor.go
  • internal/health/monitor/monitor_test.go
  • internal/net/gphttp/middleware/cloudflare_real_ip.go
  • internal/net/gphttp/middleware/cloudflare_real_ip_seed.go
  • internal/net/gphttp/middleware/cloudflare_real_ip_test.go
  • internal/net/gphttp/middleware/test_utils_test.go
  • internal/watcher/docker_watcher.go
  • internal/watcher/docker_watcher_test.go

Comment thread .gitignore Outdated
Comment thread cmd/cloudflare-cidrs-gen/main.go Outdated
Comment on lines +61 to +74
func fetchCloudflareCIDRs() ([]byte, error) {
var all bytes.Buffer
for _, endpoint := range []string{cfIPv4CIDRsEndpoint, cfIPv6CIDRsEndpoint} {
body, err := fetchEndpoint(endpoint)
if err != nil {
return nil, err
}
all.Write(body)
if all.Len() > 0 && all.Bytes()[all.Len()-1] != '\n' {
all.WriteByte('\n')
}
}
return all.Bytes(), nil
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Locate the file and show the relevant section
ls -la cmd/cloudflare-cidrs-gen/main.go
nl -ba cmd/cloudflare-cidrs-gen/main.go | sed -n '1,140p'

# Find the exact function definition (in case line numbers shifted)
rg -n "func fetchCloudflareCIDRs" -S cmd/cloudflare-cidrs-gen/main.go

# Show fetchEndpoint implementation + nearby helpers
rg -n "func fetchEndpoint" -S cmd/cloudflare-cidrs-gen/main.go
nl -ba cmd/cloudflare-cidrs-gen/main.go | sed -n '140,260p'

# Check how gperr is imported/used in this file
rg -n "gperr\\.|goutils/synk|goutils/errs" cmd/cloudflare-cidrs-gen/main.go
nl -ba cmd/cloudflare-cidrs-gen/main.go | sed -n '1,80p'

# Search for gperr.Multiline usage patterns across the repo
rg -n "gperr\\.Multiline\\(" -S .
rg -n "gperr\\.NewBuilder\\(" -S .
rg -n "gperr\\.PrependSubject\\(" -S .
rg -n "gperr\\.NewGroup\\(" -S .

Repository: yusing/godoxy

Length of output: 200


🏁 Script executed:

#!/bin/bash
set -euo pipefail

FILE="cmd/cloudflare-cidrs-gen/main.go"
echo "== File =="
ls -la "$FILE"

echo "== fetchCloudflareCIDRs section =="
python3 - <<'PY'
import itertools
path="cmd/cloudflare-cidrs-gen/main.go"
start=40
end=110
with open(path,'r',encoding='utf-8') as f:
    for i,line in enumerate(f, start=1):
        if start <= i <= end:
            print(f"{i:4d} {line.rstrip()}")
        if i > end:
            break
PY

echo "== fetchEndpoint section (nearby) =="
python3 - <<'PY'
import itertools
path="cmd/cloudflare-cidrs-gen/main.go"
# scan for fetchEndpoint and print around it
target="func fetchEndpoint"
lines=open(path,'r',encoding='utf-8').read().splitlines()
for idx,l in enumerate(lines, start=1):
    if target in l:
        start=max(1,idx-20); end=min(len(lines),idx+120)
        for i in range(start,end+1):
            print(f"{i:4d} {lines[i-1]}")
        break
else:
    print("fetchEndpoint not found")
PY

echo "== gperr usage/import in this file =="
rg -n "gperr\\.|goutils/errs|goutils/synk" "$FILE" || true
rg -n "import \\(" "$FILE" -n || true
python3 - <<'PY'
path="cmd/cloudflare-cidrs-gen/main.go"
lines=open(path,'r',encoding='utf-8').read().splitlines()
# print import block
in_import=False
for i,l in enumerate(lines, start=1):
    if l.strip().startswith("import"):
        in_import=True
    if in_import:
        print(f"{i:4d} {l}")
        if l.strip()==")":
            break
PY

echo "== gperr usage patterns across repo (limited) =="
rg -n "gperr\\.Multiline\\(" -S . | head -n 50 || true
rg -n "gperr\\.NewBuilder\\(" -S . | head -n 50 || true
rg -n "gperr\\.PrependSubject\\(" -S . | head -n 50 || true
rg -n "gperr\\.NewGroup\\(" -S . | head -n 50 || true
rg -n "goutils/errs" -S . | head -n 50 || true

Repository: yusing/godoxy

Length of output: 16951


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "== AGENTS.md (gperr guidelines section) =="
rg -n "gperr\\.Multiline\\(\\)|multi attempts|gperr\\.NewBuilder\\(\\)|gperr\\.PrependSubject\\(" AGENTS.md
python3 - <<'PY'
import re
path="AGENTS.md"
lines=open(path,'r',encoding='utf-8').read().splitlines()
# print around gperr guidelines mentions
for i,l in enumerate(lines, start=1):
    if "gperr.Multiline" in l or "gperr.NewBuilder" in l or "gperr.PrependSubject" in l:
        start=max(1,i-8); end=min(len(lines), i+8)
        for j in range(start,end+1):
            print(f"{j:4d} {lines[j-1]}")
        print("-----")
PY

echo "== Example callsite: internal/notif/base.go around line 40 =="
python3 - <<'PY'
path="internal/notif/base.go"
lines=open(path,'r',encoding='utf-8').read().splitlines()
for i,l in enumerate(lines, start=1):
    if i>=25 and i<=70:
        print(f"{i:4d} {l}")
PY

echo "== Example callsite: internal/acl/matcher.go around line 32 =="
python3 - <<'PY'
path="internal/acl/matcher.go"
lines=open(path,'r',encoding='utf-8').read().splitlines()
for i,l in enumerate(lines, start=1):
    if i>=1 and i<=90:
        print(f"{i:4d} {l}")
PY

Repository: yusing/godoxy

Length of output: 5784


Aggregate IPv4+IPv6 fetch errors with gperr.Multiline()

fetchCloudflareCIDRs() returns immediately on the first fetchEndpoint() failure (lines 65-67), so you never see both IPv4 and IPv6 error details. Fetch both endpoints, collect both failures, and return a multiline gperr error (labeling each line by endpoint/protocol).

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@cmd/cloudflare-cidrs-gen/main.go` around lines 61 - 74, fetchCloudflareCIDRs
currently returns on the first fetchEndpoint error; change it to attempt both
endpoints (cfIPv4CIDRsEndpoint and cfIPv6CIDRsEndpoint), collect any errors
keyed/labelled by endpoint (or "ipv4"/"ipv6"), and only return a combined
gperr.Multiline error if one or more fetches failed; if both succeed concatenate
their bodies with a single newline between them (preserving the existing newline
logic) and return the bytes. Use the existing function names
fetchCloudflareCIDRs and fetchEndpoint and construct the multiline error via
gperr.Multiline([]error{...}) or by building per-endpoint error lines and
passing them to gperr.Multiline so callers see both IPv4 and IPv6 failure
details.

Comment thread internal/health/check/docker_test.go
Comment thread internal/health/check/docker_test.go
Comment on lines +28 to +42
dockerHealthInspect = func(ctx context.Context, client *docker.SharedClient, containerID string) (container.State, error) {
// the actual inspect response is intercepted and returned as RequestInterceptedError
_, err := client.ContainerInspect(ctx, containerID)

var interceptedErr *httputils.RequestInterceptedError
if !httputils.AsRequestInterceptedError(err, &interceptedErr) {
return container.State{}, err
}

if interceptedErr == nil || interceptedErr.Data == nil { // should not happen
return container.State{}, errors.New("intercepted error is nil or data is nil")
}

return interceptedErr.Data.(container.State), nil
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Use gperr for error handling per coding guidelines.

The function returns plain errors without using gperr for nested error handling. Lines 34 and 38 should use gperr.PrependSubject() or similar to add context to errors.

♻️ Proposed refactor to use gperr
 dockerHealthInspect = func(ctx context.Context, client *docker.SharedClient, containerID string) (container.State, error) {
 	// the actual inspect response is intercepted and returned as RequestInterceptedError
 	_, err := client.ContainerInspect(ctx, containerID)
 
 	var interceptedErr *httputils.RequestInterceptedError
 	if !httputils.AsRequestInterceptedError(err, &interceptedErr) {
-		return container.State{}, err
+		return container.State{}, gperr.PrependSubject(err, "docker inspect")
 	}
 
 	if interceptedErr == nil || interceptedErr.Data == nil { // should not happen
-		return container.State{}, errors.New("intercepted error is nil or data is nil")
+		return container.State{}, gperr.New("intercepted error is nil or data is nil")
 	}
 
+	state, ok := interceptedErr.Data.(container.State)
+	if !ok {
+		return container.State{}, gperr.New("intercepted data is not container.State")
+	}
-	return interceptedErr.Data.(container.State), nil
+	return state, nil
 }

As per coding guidelines: Use gperr (goutils/errs) for nested error handling: gperr.Multiline() for multi-attempt errors, gperr.NewBuilder() for collecting errors, gperr.NewGroup() + group.Go() for concurrent error collection, and gperr.PrependSubject() to prepend subject identifiers.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
dockerHealthInspect = func(ctx context.Context, client *docker.SharedClient, containerID string) (container.State, error) {
// the actual inspect response is intercepted and returned as RequestInterceptedError
_, err := client.ContainerInspect(ctx, containerID)
var interceptedErr *httputils.RequestInterceptedError
if !httputils.AsRequestInterceptedError(err, &interceptedErr) {
return container.State{}, err
}
if interceptedErr == nil || interceptedErr.Data == nil { // should not happen
return container.State{}, errors.New("intercepted error is nil or data is nil")
}
return interceptedErr.Data.(container.State), nil
}
dockerHealthInspect = func(ctx context.Context, client *docker.SharedClient, containerID string) (container.State, error) {
// the actual inspect response is intercepted and returned as RequestInterceptedError
_, err := client.ContainerInspect(ctx, containerID)
var interceptedErr *httputils.RequestInterceptedError
if !httputils.AsRequestInterceptedError(err, &interceptedErr) {
return container.State{}, gperr.PrependSubject(err, "docker inspect")
}
if interceptedErr == nil || interceptedErr.Data == nil { // should not happen
return container.State{}, gperr.New("intercepted error is nil or data is nil")
}
state, ok := interceptedErr.Data.(container.State)
if !ok {
return container.State{}, gperr.New("intercepted data is not container.State")
}
return state, nil
}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@internal/health/check/docker.go` around lines 28 - 42, The function
dockerHealthInspect should wrap returned errors with gperr to add context:
replace the plain return of err when httputils.AsRequestInterceptedError fails
with a gperr-prepended error (e.g., gperr.PrependSubject(err,
"dockerHealthInspect") or build a new gperr error around err) and likewise
replace errors.New("intercepted error is nil or data is nil") with a gperr error
that includes the same subject (e.g.,
gperr.PrependSubject(gperr.New("intercepted error is nil or data is nil"),
"dockerHealthInspect")); keep the existing checks around
httputils.AsRequestInterceptedError and the interceptedErr.Data cast but return
gperr-wrapped errors instead of plain errors.

return container.State{}, errors.New("intercepted error is nil or data is nil")
}

return interceptedErr.Data.(container.State), nil

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

Add safety check for type assertion to prevent panic.

The type assertion interceptedErr.Data.(container.State) can panic if the intercepted data is not of the expected type. Use the comma-ok idiom to handle this gracefully.

🛡️ Proposed fix to add safe type assertion
-	return interceptedErr.Data.(container.State), nil
+	state, ok := interceptedErr.Data.(container.State)
+	if !ok {
+		return container.State{}, errors.New("intercepted data is not container.State")
+	}
+	return state, nil
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
return interceptedErr.Data.(container.State), nil
state, ok := interceptedErr.Data.(container.State)
if !ok {
return container.State{}, errors.New("intercepted data is not container.State")
}
return state, nil
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@internal/health/check/docker.go` at line 41, The current return uses a direct
type assertion interceptedErr.Data.(container.State) which can panic; update the
code where interceptedErr is used (the return in the function that returns
(container.State, error)) to perform a safe comma-ok type assertion: attempt to
cast interceptedErr.Data to container.State using v, ok :=
interceptedErr.Data.(container.State) and if ok return v, nil, otherwise return
a zero-value container.State and an explanatory error (or wrap interceptedErr)
to avoid panic and surface the mismatch.

Comment thread internal/net/gphttp/middleware/cloudflare_real_ip.go Outdated
Comment on lines +131 to +142
func refreshCloudflareCIDRs() {
if !cfCIDRsRefreshing.CompareAndSwap(false, true) {
return
}
go func() {
defer cfCIDRsRefreshing.Store(false)

if common.IsTest {
cfCIDRs = localCIDRs
cfCIDRsMu.Lock()
defer cfCIDRsMu.Unlock()

refreshCloudflareCIDRsLocked()
}()

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion | 🟠 Major | 🏗️ Heavy lift

Use task.RootTask() (with OnFinished()/OnCancel()) for the Cloudflare CIDR refresh goroutine.

In internal/net/gphttp/middleware/cloudflare_real_ip.go (lines 135-142), refreshCloudflareCIDRs() launches an unmanaged go func(); integrate the refresh worker into the task.RootTask()/parent.Subtask() lifecycle so cancellation and cleanup are handled via OnFinished()/OnCancel() rather than running independently.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@internal/net/gphttp/middleware/cloudflare_real_ip.go` around lines 131 - 142,
The refreshCloudflareCIDRs function currently spawns an unmanaged goroutine;
change it to register a subtask on the root task (use task.RootTask().Subtask or
create a parent := task.RootTask() and parent.Subtask()) and run
refreshCloudflareCIDRsLocked inside that subtask so it participates in the task
lifecycle; attach cleanup with subtask.OnFinished() (or OnCancel()) to call
cfCIDRsRefreshing.Store(false) and ensure cfCIDRsMu.Lock()/Unlock() still
surround refreshCloudflareCIDRsLocked as before, and keep the CompareAndSwap
check on cfCIDRsRefreshing to avoid duplicate runs.

Comment thread internal/net/gphttp/middleware/cloudflare_real_ip.go
Comment thread internal/watcher/docker_watcher.go
yusing added 2 commits May 25, 2026 14:01
Treat a missing intercepted inspect response as an error and expand
docker healthcheck coverage for healthy, stopped, missing-health,
and threshold-exceeded cases.

Also avoid mutating shared Cloudflare real IP config during CIDR
refresh, reject non-200 CIDR fetches, and keep docker watcher retry
errors non-blocking when the error channel is full.
Ignore invalid real IP header values and untrusted remote addrs when
resolving client IPs. Preserve the original remote port when a trusted
header updates RemoteAddr.

Add coverage for invalid header inputs and update Cloudflare real IP
tests to assert port-preserving behavior.
@yusing yusing force-pushed the fix/docker-reconnect-churn branch from b4bc80b to 44c3e53 Compare May 25, 2026 06:44
Replace the generated Go seed with an embedded
cloudflare_real_ip_seed.txt snapshot and remove the CIDR generator
command.

Add a script and scheduled GitHub workflow to refresh the snapshot and
verify the embedded seed still parses before committing updates.
@yusing yusing force-pushed the fix/docker-reconnect-churn branch from 44c3e53 to 29cce78 Compare May 25, 2026 06:45
@yusing

yusing commented May 25, 2026

Copy link
Copy Markdown
Owner Author

@coderabbitai

@coderabbitai

coderabbitai Bot commented May 25, 2026

Copy link
Copy Markdown
Contributor

@yusing Sure, I'll review the changes in this PR!

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5

♻️ Duplicate comments (2)
internal/health/check/docker.go (2)

44-44: ⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

Avoid panic-prone type assertion in inspect result.

Line 44 uses a direct assertion (interceptedErr.Data.(container.State)), which can panic on unexpected interceptor payloads.

🛡️ Proposed fix
-		return interceptedErr.Data.(container.State), nil
+		state, ok := interceptedErr.Data.(container.State)
+		if !ok {
+			return container.State{}, gperr.PrependSubject(
+				gperr.New("intercepted data is not container.State"),
+				"dockerHealthInspect",
+			)
+		}
+		return state, nil
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@internal/health/check/docker.go` at line 44, Replace the panic-prone direct
type assertion on interceptedErr.Data with a safe check: verify interceptedErr
is non-nil, perform a type assertion using the comma-ok form (e.g., v, ok :=
interceptedErr.Data.(container.State)), and if ok return v, nil otherwise return
the zero-value of container.State and a descriptive error; update any callers
expecting an error to propagate this new error. Ensure you reference the
interceptedErr variable, its Data field, and the container.State type when
implementing the fix.

33-41: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Wrap helper errors with gperr subject context.

dockerHealthInspect still returns raw/standard errors on nested paths, which drops subject context and violates project error-handling conventions.

♻️ Proposed fix
 import (
 	"context"
 	"errors"
 	"net/http"
 	"time"
 
 	"github.com/docker/docker/api/types/container"
 	"github.com/yusing/godoxy/internal/docker"
 	"github.com/yusing/godoxy/internal/types"
+	gperr "github.com/yusing/goutils/errs"
 	httputils "github.com/yusing/goutils/http"
 	strutils "github.com/yusing/goutils/strings"
 )
@@
 		var interceptedErr *httputils.RequestInterceptedError
 		if !httputils.AsRequestInterceptedError(err, &interceptedErr) {
 			if err == nil {
-				err = errors.New("inspect response was not intercepted")
+				err = gperr.New("inspect response was not intercepted")
 			}
-			return container.State{}, err
+			return container.State{}, gperr.PrependSubject(err, "dockerHealthInspect")
 		}
 
 		if interceptedErr == nil || interceptedErr.Data == nil { // should not happen
-			return container.State{}, errors.New("intercepted error is nil or data is nil")
+			return container.State{}, gperr.PrependSubject(
+				gperr.New("intercepted error is nil or data is nil"),
+				"dockerHealthInspect",
+			)
 		}

As per coding guidelines, "Use gperr (goutils/errs) for nested error handling ... and gperr.PrependSubject() to prepend subject identifiers."

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@internal/health/check/docker.go` around lines 33 - 41, The branch in
dockerHealthInspect that handles non-intercepted or nil-intercepted responses
returns plain errors and must be converted to use gperr with subject context;
replace the raw errors (the errors.New("inspect response was not intercepted")
and errors.New("intercepted error is nil or data is nil")) and any returned err
from httputils.AsRequestInterceptedError handling with gperr-wrapped errors and
call gperr.PrependSubject(err, "docker") (or an appropriate subject like
"dockerHealthInspect") before returning so the subject context is preserved;
ensure you also wrap the existing err variable when err != nil using gperr.Wrap
or gperr.PrependSubject rather than returning a plain error.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In @.github/workflows/cloudflare-cidrs.yml:
- Around line 8-10: Top-level workflow permissions currently grant contents:
write; change the workflow to keep top-level permissions read-only (e.g.,
permissions: contents: read) and move the more-permissive contents: write to
only the job that needs it (the update job) by adding a permissions block under
that job with contents: write; ensure no other jobs retain write scope so only
the update job has elevated permissions.
- Around line 3-7: Add a top-level concurrency block to the workflow to prevent
overlapping runs when the scheduled cron and manual workflow_dispatch both
trigger; update the YAML that currently contains the on: schedule and
workflow_dispatch keys by adding a concurrency: group: "cloudflare-cidrs" (or
similar unique group name) and cancel-in-progress: true so in-progress runs are
cancelled when a new run starts, preventing commit/push races.
- Around line 16-23: Pin the mutable GitHub Action references to immutable
commit SHAs by replacing uses: actions/checkout@v4 and uses: actions/setup-go@v5
with their corresponding full commit SHAs (e.g., uses:
actions/checkout@<FULL_SHA> and uses: actions/setup-go@<FULL_SHA>); locate the
two occurrences in the workflow where those strings appear and swap the tag for
the repo commit SHA while preserving the existing with: blocks (ref: ${{
github.ref_name }} and go-version-file: go.mod). Ensure you fetch the exact
commit SHAs from the official actions/repos before committing.

In `@internal/watcher/docker_watcher.go`:
- Around line 216-219: Replace the fmt.Errorf wrapping sent to errCh with
gperr.PrependSubject to follow repo error-style: call
gperr.PrependSubject("docker watcher", err) (or the equivalent PrependSubject
usage) instead of fmt.Errorf("docker watcher: %w", err) and add an import for
the gperr package; also update the other occurrence that wraps client reinit
errors (the earlier send to errCh) to use gperr.PrependSubject as well so both
places use the same convention (refer to the errCh send sites and the docker
watcher reinitialization logic to locate the changes).

In `@scripts/update-cloudflare-cidrs.sh`:
- Around line 8-14: The script currently runs curl inside a pipeline (the for
loop calling curl then piping to awk) under /bin/sh with set -eu, which allows
curl failures to be masked by awk; change the shebang to a POSIX-compatible bash
(e.g., #!/usr/bin/env bash) and enable strict pipe failure by adding set -euo
pipefail (or set -eu; set -o pipefail) near the top, so any failed curl in the
loop (the curl in the for ... done | awk 'NF { print }' > "$tmp" pipeline) will
cause the script to fail fast and avoid writing a partial "$tmp".

---

Duplicate comments:
In `@internal/health/check/docker.go`:
- Line 44: Replace the panic-prone direct type assertion on interceptedErr.Data
with a safe check: verify interceptedErr is non-nil, perform a type assertion
using the comma-ok form (e.g., v, ok := interceptedErr.Data.(container.State)),
and if ok return v, nil otherwise return the zero-value of container.State and a
descriptive error; update any callers expecting an error to propagate this new
error. Ensure you reference the interceptedErr variable, its Data field, and the
container.State type when implementing the fix.
- Around line 33-41: The branch in dockerHealthInspect that handles
non-intercepted or nil-intercepted responses returns plain errors and must be
converted to use gperr with subject context; replace the raw errors (the
errors.New("inspect response was not intercepted") and errors.New("intercepted
error is nil or data is nil")) and any returned err from
httputils.AsRequestInterceptedError handling with gperr-wrapped errors and call
gperr.PrependSubject(err, "docker") (or an appropriate subject like
"dockerHealthInspect") before returning so the subject context is preserved;
ensure you also wrap the existing err variable when err != nil using gperr.Wrap
or gperr.PrependSubject rather than returning a plain error.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: fc13e4a5-99a1-4fe1-bdda-21767ba2bc5e

📥 Commits

Reviewing files that changed from the base of the PR and between 905a253 and b4bc80b.

📒 Files selected for processing (11)
  • .github/workflows/cloudflare-cidrs.yml
  • internal/health/check/docker.go
  • internal/health/check/docker_test.go
  • internal/net/gphttp/middleware/cloudflare_real_ip.go
  • internal/net/gphttp/middleware/cloudflare_real_ip_seed.go
  • internal/net/gphttp/middleware/cloudflare_real_ip_seed.txt
  • internal/net/gphttp/middleware/cloudflare_real_ip_test.go
  • internal/net/gphttp/middleware/real_ip.go
  • internal/net/gphttp/middleware/real_ip_test.go
  • internal/watcher/docker_watcher.go
  • scripts/update-cloudflare-cidrs.sh

Comment thread .github/workflows/cloudflare-cidrs.yml
Comment thread .github/workflows/cloudflare-cidrs.yml
Comment thread .github/workflows/cloudflare-cidrs.yml Outdated
Comment thread internal/watcher/docker_watcher.go
Comment thread scripts/update-cloudflare-cidrs.sh Outdated
Add workflow-level concurrency for cloudflare-cidrs updates and pin
checkout/setup-go actions by commit SHA.

Reduce top-level contents permission to read and grant write access
only to the update job.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
internal/net/gphttp/middleware/cloudflare_real_ip.go (1)

57-61: 🧹 Nitpick | 🔵 Trivial | ⚡ Quick win

Use gperr for combined Cloudflare IPv4/IPv6 endpoint errors (instead of errors.Join)

In loadCloudflareCIDRs, the IPv4 and IPv6 fetch errors are merged via errors.Join(ipv4Err, ipv6Err);—switching to the repo’s gperr nested-error patterns (e.g., gperr.NewBuilder() + gperr.PrependSubject() to label the IPv4 vs IPv6 endpoints) would keep error formatting/propagation consistent.

loadCloudflareCIDRs = func() ([]*nettypes.CIDR, error) {
	ipv4CIDRs, ipv4Err := fetchCloudflareCIDRs(cfIPv4CIDRsEndpoint)
	ipv6CIDRs, ipv6Err := fetchCloudflareCIDRs(cfIPv6CIDRsEndpoint)
	if err := errors.Join(ipv4Err, ipv6Err); err != nil {
		return nil, err
	}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@internal/net/gphttp/middleware/cloudflare_real_ip.go` around lines 57 - 61,
Replace the errors.Join usage in loadCloudflareCIDRs with the repository gperr
pattern: when fetchCloudflareCIDRs returns ipv4Err and ipv6Err, build a combined
error using gperr.NewBuilder(), call gperr.PrependSubject(...) (or equivalent)
to label the ipv4Err with cfIPv4CIDRsEndpoint and the ipv6Err with
cfIPv6CIDRsEndpoint, add both errors to the builder, and return the built gperr
error instead of errors.Join; keep the rest of loadCloudflareCIDRs and uses of
fetchCloudflareCIDRs unchanged.
♻️ Duplicate comments (5)
.github/workflows/cloudflare-cidrs.yml (3)

8-13: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Scope write permission to the update job only.

contents: write at workflow scope is broader than needed. Keep workflow-level permissions read-only and grant write only where commit/push happens.

Suggested fix
 permissions:
-  contents: write
+  contents: read

 jobs:
   update:
+    permissions:
+      contents: write
     runs-on: ubuntu-latest
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In @.github/workflows/cloudflare-cidrs.yml around lines 8 - 13, Move the broad
workflow-level write permission to the specific job that needs it: change the
top-level permissions from "contents: write" to "contents: read" (or remove if
implicit read) and add "permissions: contents: write" under the "update" job
block so only the update job has write access; update the YAML entries
referencing the top-level permissions and the "update" job to reflect this
scoping.

16-23: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Pin third-party actions to immutable commit SHAs.

actions/checkout@v4 and actions/setup-go@v5 are mutable tags; pin to full SHAs.

What are the current full commit SHAs for:
1) actions/checkout v4
2) actions/setup-go v5
recommended for GitHub Actions workflow pinning?
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In @.github/workflows/cloudflare-cidrs.yml around lines 16 - 23, The workflow
uses mutable tags actions/checkout@v4 and actions/setup-go@v5; replace those
with the corresponding immutable commit SHAs from each action's GitHub repo (use
the full 40-character commit SHA for actions/checkout and actions/setup-go). To
get the SHAs, open github.com/actions/checkout and github.com/actions/setup-go,
find the v4 and v5 tags/releases (or use the repo’s refs/tags endpoints) and
copy the full commit SHA for each tag, then substitute the tag references in the
workflow with the full SHAs (e.g., actions/checkout@<full-sha> and
actions/setup-go@<full-sha>).

3-7: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Add workflow concurrency to prevent overlapping update runs.

Scheduled and manual runs can overlap and race on commit/push to the same branch.

Suggested fix
 on:
   schedule:
     - cron: "17 3 * * 1"
   workflow_dispatch:

+concurrency:
+  group: cloudflare-cidrs-${{ github.ref }}
+  cancel-in-progress: false
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In @.github/workflows/cloudflare-cidrs.yml around lines 3 - 7, The workflow can
run overlapping scheduled and manual jobs; add a top-level concurrency stanza
(adjacent to the existing on/schedule/workflow_dispatch keys) to serialize runs
and cancel in-progress duplicates—for example use a unique group like
"cloudflare-cidrs-update-${{ github.ref }}" or "${{ github.workflow }}-${{
github.ref }}" and set cancel-in-progress: true so the existing running job is
cancelled when a new run starts; update the YAML to include this concurrency
block to prevent race/overlap with schedule and workflow_dispatch.
internal/net/gphttp/middleware/cloudflare_real_ip.go (1)

138-149: 🛠️ Refactor suggestion | 🟠 Major | 🏗️ Heavy lift

Run CIDR refresh in the task lifecycle, not as an unmanaged goroutine.

The detached goroutine bypasses structured cancellation/cleanup hooks; integrate this worker via task.RootTask()/Subtask() and cleanup via OnFinished()/OnCancel().

As per coding guidelines, **/*.go: "Use task.RootTask() for background operations and parent.Subtask() for nesting, with OnFinished() and OnCancel() for cleanup".

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@internal/net/gphttp/middleware/cloudflare_real_ip.go` around lines 138 - 149,
The function refreshCloudflareCIDRs currently spawns an unmanaged goroutine;
change it to run as a managed background task using task.RootTask() and create a
subtask (parent.Subtask()) for the refresh so cancellation/cleanup are handled;
keep the existing concurrency guards (cfCIDRsRefreshing, cfCIDRsMu) and call
refreshCloudflareCIDRsLocked() from inside the subtask, and register cleanup via
subtask.OnFinished() and subtask.OnCancel() to ensure cfCIDRsRefreshing is reset
(cfCIDRsRefreshing.Store(false)) and the mutex is unlocked if needed.
scripts/update-cloudflare-cidrs.sh (1)

1-14: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Don’t let pipeline semantics hide curl failures.

With /bin/sh and no pipefail, a failed curl in for ... done | awk ... can still produce exit code 0 from awk.

Suggested fix (POSIX-safe, no bash dependency)
 tmp="$(mktemp)"
-trap 'rm -f "$tmp"' EXIT
+tmp_raw="$(mktemp)"
+trap 'rm -f "$tmp" "$tmp_raw"' EXIT

+: > "$tmp_raw"
 for endpoint in \
 	https://www.cloudflare.com/ips-v4 \
 	https://www.cloudflare.com/ips-v6
 do
-	curl -fsSL "$endpoint"
-	printf '\n'
-done | awk 'NF { print }' > "$tmp"
+	curl -fsSL "$endpoint" >> "$tmp_raw"
+	printf '\n' >> "$tmp_raw"
+done
+
+awk 'NF { print }' "$tmp_raw" > "$tmp"
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@scripts/update-cloudflare-cidrs.sh` around lines 1 - 14, The script currently
pipes multiple curl calls into awk so a failing curl can be masked; change the
loop to call curl for each endpoint and append its output (and a separating
newline from printf) into the temporary file "$tmp", immediately check curl's
exit status after each invocation and exit non‑zero (with an error message) if
any curl fails, then run the awk 'NF { print }' filter on "$tmp" and write the
result to "$out"; reference the loop variable "endpoint", the temp file "tmp",
and the output variable "out" when making these changes.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@Makefile`:
- Around line 146-150: Update the Makefile help text so it accurately documents
the test targets: change the description that currently claims `test` runs with
`-race` to reflect that `test` runs without `-race`, and add an entry for the
new `race-test` target; update any lines under the `help` target that reference
`test`/`-race` to mention `race-test` explicitly and clarify which target runs
with the race detector (i.e., `race-test` uses `-race` while `test` does not).

---

Outside diff comments:
In `@internal/net/gphttp/middleware/cloudflare_real_ip.go`:
- Around line 57-61: Replace the errors.Join usage in loadCloudflareCIDRs with
the repository gperr pattern: when fetchCloudflareCIDRs returns ipv4Err and
ipv6Err, build a combined error using gperr.NewBuilder(), call
gperr.PrependSubject(...) (or equivalent) to label the ipv4Err with
cfIPv4CIDRsEndpoint and the ipv6Err with cfIPv6CIDRsEndpoint, add both errors to
the builder, and return the built gperr error instead of errors.Join; keep the
rest of loadCloudflareCIDRs and uses of fetchCloudflareCIDRs unchanged.

---

Duplicate comments:
In @.github/workflows/cloudflare-cidrs.yml:
- Around line 8-13: Move the broad workflow-level write permission to the
specific job that needs it: change the top-level permissions from "contents:
write" to "contents: read" (or remove if implicit read) and add "permissions:
contents: write" under the "update" job block so only the update job has write
access; update the YAML entries referencing the top-level permissions and the
"update" job to reflect this scoping.
- Around line 16-23: The workflow uses mutable tags actions/checkout@v4 and
actions/setup-go@v5; replace those with the corresponding immutable commit SHAs
from each action's GitHub repo (use the full 40-character commit SHA for
actions/checkout and actions/setup-go). To get the SHAs, open
github.com/actions/checkout and github.com/actions/setup-go, find the v4 and v5
tags/releases (or use the repo’s refs/tags endpoints) and copy the full commit
SHA for each tag, then substitute the tag references in the workflow with the
full SHAs (e.g., actions/checkout@<full-sha> and actions/setup-go@<full-sha>).
- Around line 3-7: The workflow can run overlapping scheduled and manual jobs;
add a top-level concurrency stanza (adjacent to the existing
on/schedule/workflow_dispatch keys) to serialize runs and cancel in-progress
duplicates—for example use a unique group like "cloudflare-cidrs-update-${{
github.ref }}" or "${{ github.workflow }}-${{ github.ref }}" and set
cancel-in-progress: true so the existing running job is cancelled when a new run
starts; update the YAML to include this concurrency block to prevent
race/overlap with schedule and workflow_dispatch.

In `@internal/net/gphttp/middleware/cloudflare_real_ip.go`:
- Around line 138-149: The function refreshCloudflareCIDRs currently spawns an
unmanaged goroutine; change it to run as a managed background task using
task.RootTask() and create a subtask (parent.Subtask()) for the refresh so
cancellation/cleanup are handled; keep the existing concurrency guards
(cfCIDRsRefreshing, cfCIDRsMu) and call refreshCloudflareCIDRsLocked() from
inside the subtask, and register cleanup via subtask.OnFinished() and
subtask.OnCancel() to ensure cfCIDRsRefreshing is reset
(cfCIDRsRefreshing.Store(false)) and the mutex is unlocked if needed.

In `@scripts/update-cloudflare-cidrs.sh`:
- Around line 1-14: The script currently pipes multiple curl calls into awk so a
failing curl can be masked; change the loop to call curl for each endpoint and
append its output (and a separating newline from printf) into the temporary file
"$tmp", immediately check curl's exit status after each invocation and exit
non‑zero (with an error message) if any curl fails, then run the awk 'NF { print
}' filter on "$tmp" and write the result to "$out"; reference the loop variable
"endpoint", the temp file "tmp", and the output variable "out" when making these
changes.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 2c833752-6658-44fa-86ce-0a2ef1a3be8f

📥 Commits

Reviewing files that changed from the base of the PR and between b4bc80b and 29cce78.

📒 Files selected for processing (8)
  • .github/workflows/cloudflare-cidrs.yml
  • .gitignore
  • Makefile
  • internal/net/gphttp/middleware/cloudflare_real_ip.go
  • internal/net/gphttp/middleware/cloudflare_real_ip_seed.go
  • internal/net/gphttp/middleware/cloudflare_real_ip_seed.txt
  • internal/net/gphttp/middleware/cloudflare_real_ip_test.go
  • scripts/update-cloudflare-cidrs.sh
💤 Files with no reviewable changes (1)
  • .gitignore

Comment thread Makefile
Comment on lines 146 to 150
test: ensure-webui-dist
go test -v ${BUILD_FLAGS} ./internal/...

race-test: ensure-webui-dist
CGO_ENABLED=1 go test -v -race ${BUILD_FLAGS} ./internal/...

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Update help output to match the new test targets.

test no longer runs with -race, but help still documents it that way and doesn’t mention race-test.

Suggested fix
-		'  test             go test -race internal/...' \
+		'  test             go test internal/...' \
+		'  race-test        go test -race internal/...' \
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@Makefile` around lines 146 - 150, Update the Makefile help text so it
accurately documents the test targets: change the description that currently
claims `test` runs with `-race` to reflect that `test` runs without `-race`, and
add an entry for the new `race-test` target; update any lines under the `help`
target that reference `test`/`-race` to mention `race-test` explicitly and
clarify which target runs with the race detector (i.e., `race-test` uses `-race`
while `test` does not).

Switch the updater to bash with pipefail and stage curl output
through a raw temp file before filtering blank lines.

This keeps cleanup explicit for both temp files and avoids relying
on a pipeline to build the final snapshot.
@yusing yusing closed this May 25, 2026
@yusing yusing reopened this May 25, 2026
@sonarqubecloud

Copy link
Copy Markdown

@yusing yusing merged commit e5f8bce into main May 25, 2026
26 checks passed
@yusing yusing deleted the fix/docker-reconnect-churn branch May 25, 2026 08:00
yusing added a commit that referenced this pull request May 25, 2026
Recreate Docker watcher clients after stream disconnects, factor docker
health inspect handling for tests, and make health monitor jitter delay
cancellation-aware. Add coverage for stopped container states, inspect
failures, monitor teardown, and watcher reconnect behavior.

Seed Cloudflare real IP CIDRs from an embedded snapshot, refresh them
asynchronously with retry backoff, and avoid fetching when the
CF-Connecting-IP header is absent. Harden real IP parsing to ignore
invalid values and preserve the original remote port when rewriting the
client address.

Add the Cloudflare CIDR snapshot seed, update script, and scheduled
workflow to refresh it, verify the embedded data in tests, scope
workflow write permissions, add a non-race internal test target, and
fix repo-root ignore patterns.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant