Fix/docker reconnect churn#234
Conversation
Extract docker inspect behind an overridable helper for health check tests and add coverage for stopped container states, missing health data, and inspect failures.
Keep the post-check random delay cancellation-aware so obsolete health monitors exit promptly during forced reloads. Add coverage for cancelling during the jitter window and assert the monitor stops before the next ticker cycle begins.
Refactor docker watcher reconnects to build a fresh client on each retry and close the previous client before reconnecting. Add coverage for closed stream handling, factory retry errors, and reconnect-triggered reloads after event stream EOF.
Bundle a generated Cloudflare CIDR snapshot for cold starts and use it to serve CF-Connecting-IP requests without blocking on remote fetches. Add a CIDR generator and Makefile target, refresh cached ranges in the background with retry backoff, and cover the new seeding and refresh behavior in middleware tests.
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Organization UI Review profile: ASSERTIVE Plan: Pro Run ID: 📒 Files selected for processing (2)
📝 WalkthroughWalkthroughThis PR seeds and auto-updates Cloudflare CIDRs, refactors real-IP middleware to use a seeded cached CIDR set with async refresh and hardened parsing, extracts Docker health-check inspect handling into a helper, makes health monitor startup jitter cancellation-aware, refactors Docker watcher client/reconnect logic with injectable hooks, and adds tests and CI automation. ChangesCloudflare CIDR Generation and Real-IP Middleware Refactoring
Docker Health Checks and Monitor Jitter
Docker Watcher Client Refactoring
CI Automation for Cloudflare CIDR Updates
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Poem
🚥 Pre-merge checks | ✅ 3 | ❌ 2❌ Failed checks (2 warnings)
✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 11
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In @.gitignore:
- Around line 39-40: The .gitignore currently ignores the generated Cloudflare
seed file causing mustLoadSeedCloudflareCIDRs() to panic when the embedded seed
is missing; either stop ignoring the generated file (remove the
internal/net/gphttp/middleware/cloudflare_real_ip_seed_gen.go entry) so the seed
is committed, or modify mustLoadSeedCloudflareCIDRs() in cloudflare_real_ip.go
to handle a missing/empty embedded seed without panicking (e.g., detect empty
data and return a safe default or a clear error) so builds remain deterministic;
refer to the symbols cloudflare_real_ip_seed_gen.go and
mustLoadSeedCloudflareCIDRs() to locate the code to change.
In `@cmd/cloudflare-cidrs-gen/main.go`:
- Around line 61-74: fetchCloudflareCIDRs currently returns on the first
fetchEndpoint error; change it to attempt both endpoints (cfIPv4CIDRsEndpoint
and cfIPv6CIDRsEndpoint), collect any errors keyed/labelled by endpoint (or
"ipv4"/"ipv6"), and only return a combined gperr.Multiline error if one or more
fetches failed; if both succeed concatenate their bodies with a single newline
between them (preserving the existing newline logic) and return the bytes. Use
the existing function names fetchCloudflareCIDRs and fetchEndpoint and construct
the multiline error via gperr.Multiline([]error{...}) or by building
per-endpoint error lines and passing them to gperr.Multiline so callers see both
IPv4 and IPv6 failure details.
In `@internal/health/check/docker_test.go`:
- Around line 15-88: Add a new unit test named
TestDockerHealthcheckReturnsHealthyWhenContainerHealthy that mirrors existing
tests: create a DockerHealthcheckState with client &docker.SharedClient{} and
containerID "test", save and restore dockerHealthInspect via
oldInspect/t.Cleanup, then set dockerHealthInspect to return
container.State{Status: "running", Health: &container.Health{Status:
container.Healthy, Log: []container.HealthcheckResult{{Start: time.Now(), End:
time.Now().Add(100*time.Millisecond), Output: "healthy"}}}} and nil error; call
Docker(context.Background(), state, time.Second) and assert no error,
result.Healthy is true, result.Detail equals "healthy", and result.Latency is
greater than zero. Ensure imports (time) match existing test file.
- Around line 15-88: Add a test that asserts Docker returns
ErrDockerHealthCheckFailedTooManyTimes when the state's numDockerFailures
exceeds dockerFailuresThreshold: create a
TestDockerHealthcheckReturnsErrorAfterThreshold that constructs a
DockerHealthcheckState with client &docker.SharedClient{}, containerID "test",
and numDockerFailures set to dockerFailuresThreshold + 1, then call
Docker(context.Background(), state, time.Second) and require the error is
ErrDockerHealthCheckFailedTooManyTimes; this exercises the early-fail branch in
Docker that checks numDockerFailures.
- Around line 18-26: The test's states table in docker_test.go only includes
"exited", "paused", and "created" but must also cover the other stopped statuses
handled by the production code ("dead", "restarting", "removing"); update the
states slice in the test to add entries for {name: "dead", status: "dead",
detail: "container is dead"}, {name: "restarting", status: "restarting", detail:
"container is restarting"} and {name: "removing", status: "removing", detail:
"container is removing"} so the test exercising the container status handling
(the states variable used in the test loop) verifies these cases as well.
In `@internal/health/check/docker.go`:
- Around line 28-42: The function dockerHealthInspect should wrap returned
errors with gperr to add context: replace the plain return of err when
httputils.AsRequestInterceptedError fails with a gperr-prepended error (e.g.,
gperr.PrependSubject(err, "dockerHealthInspect") or build a new gperr error
around err) and likewise replace errors.New("intercepted error is nil or data is
nil") with a gperr error that includes the same subject (e.g.,
gperr.PrependSubject(gperr.New("intercepted error is nil or data is nil"),
"dockerHealthInspect")); keep the existing checks around
httputils.AsRequestInterceptedError and the interceptedErr.Data cast but return
gperr-wrapped errors instead of plain errors.
- Line 41: The current return uses a direct type assertion
interceptedErr.Data.(container.State) which can panic; update the code where
interceptedErr is used (the return in the function that returns
(container.State, error)) to perform a safe comma-ok type assertion: attempt to
cast interceptedErr.Data to container.State using v, ok :=
interceptedErr.Data.(container.State) and if ok return v, nil, otherwise return
a zero-value container.State and an explanatory error (or wrap interceptedErr)
to avoid panic and surface the mismatch.
In `@internal/net/gphttp/middleware/cloudflare_real_ip.go`:
- Around line 200-212: fetchCloudflareCIDRs currently reads and parses resp.Body
without checking HTTP status, so non-200 responses become CIDR-parse errors;
update fetchCloudflareCIDRs to validate resp.StatusCode (e.g., require 200 OK)
after http.DefaultClient.Do(req) and before reading resp.Body, and return a
clear error (including status code and status text) when the status is not
success; keep parseCloudflareCIDRs unchanged and only call it after the status
check so parse errors reflect real payload issues.
- Around line 131-142: The refreshCloudflareCIDRs function currently spawns an
unmanaged goroutine; change it to register a subtask on the root task (use
task.RootTask().Subtask or create a parent := task.RootTask() and
parent.Subtask()) and run refreshCloudflareCIDRsLocked inside that subtask so it
participates in the task lifecycle; attach cleanup with subtask.OnFinished() (or
OnCancel()) to call cfCIDRsRefreshing.Store(false) and ensure
cfCIDRsMu.Lock()/Unlock() still surround refreshCloudflareCIDRsLocked as before,
and keep the CompareAndSwap check on cfCIDRsRefreshing to avoid duplicate runs.
- Around line 86-90: The middleware is mutating shared state by assigning cidrs
into cri.realIP.From before calling cri.realIP.before; instead, avoid writing to
the long-lived middleware instance by making a per-request copy of the RealIP
config and passing that to the handler. Concretely, call tryFetchCFCIDR(), and
if non-nil, create a local copy (e.g., local := cri.realIP; local.From = cidrs)
and then invoke local.before(w, r) (or adjust the call to pass the local copy)
so that cri.realIP is never modified on the shared middleware instance.
In `@internal/watcher/docker_watcher.go`:
- Around line 205-224: The reconnectDockerWatcherClient loop can block forever
when sending to errCh (used in EventsWithOptions) if the channel is unbuffered;
change the error send at the client creation failure site so it cannot block the
retry ticker—e.g., perform a non-blocking send or dispatch the error send in a
goroutine instead of a direct blocking write. Update the send at the location
that references errCh inside reconnectDockerWatcherClient (the branch after
newDockerWatcherClient returns an error) to use a select { case errCh <-
fmt.Errorf(...): default: } or go func() { errCh <- fmt.Errorf(...) }() so
retries continue even when errCh is not being drained.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 63edc3d4-3833-4bdd-a3af-332ce8a67699
📒 Files selected for processing (13)
.gitignoreMakefilecmd/cloudflare-cidrs-gen/main.gointernal/health/check/docker.gointernal/health/check/docker_test.gointernal/health/monitor/monitor.gointernal/health/monitor/monitor_test.gointernal/net/gphttp/middleware/cloudflare_real_ip.gointernal/net/gphttp/middleware/cloudflare_real_ip_seed.gointernal/net/gphttp/middleware/cloudflare_real_ip_test.gointernal/net/gphttp/middleware/test_utils_test.gointernal/watcher/docker_watcher.gointernal/watcher/docker_watcher_test.go
| func fetchCloudflareCIDRs() ([]byte, error) { | ||
| var all bytes.Buffer | ||
| for _, endpoint := range []string{cfIPv4CIDRsEndpoint, cfIPv6CIDRsEndpoint} { | ||
| body, err := fetchEndpoint(endpoint) | ||
| if err != nil { | ||
| return nil, err | ||
| } | ||
| all.Write(body) | ||
| if all.Len() > 0 && all.Bytes()[all.Len()-1] != '\n' { | ||
| all.WriteByte('\n') | ||
| } | ||
| } | ||
| return all.Bytes(), nil | ||
| } |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Locate the file and show the relevant section
ls -la cmd/cloudflare-cidrs-gen/main.go
nl -ba cmd/cloudflare-cidrs-gen/main.go | sed -n '1,140p'
# Find the exact function definition (in case line numbers shifted)
rg -n "func fetchCloudflareCIDRs" -S cmd/cloudflare-cidrs-gen/main.go
# Show fetchEndpoint implementation + nearby helpers
rg -n "func fetchEndpoint" -S cmd/cloudflare-cidrs-gen/main.go
nl -ba cmd/cloudflare-cidrs-gen/main.go | sed -n '140,260p'
# Check how gperr is imported/used in this file
rg -n "gperr\\.|goutils/synk|goutils/errs" cmd/cloudflare-cidrs-gen/main.go
nl -ba cmd/cloudflare-cidrs-gen/main.go | sed -n '1,80p'
# Search for gperr.Multiline usage patterns across the repo
rg -n "gperr\\.Multiline\\(" -S .
rg -n "gperr\\.NewBuilder\\(" -S .
rg -n "gperr\\.PrependSubject\\(" -S .
rg -n "gperr\\.NewGroup\\(" -S .Repository: yusing/godoxy
Length of output: 200
🏁 Script executed:
#!/bin/bash
set -euo pipefail
FILE="cmd/cloudflare-cidrs-gen/main.go"
echo "== File =="
ls -la "$FILE"
echo "== fetchCloudflareCIDRs section =="
python3 - <<'PY'
import itertools
path="cmd/cloudflare-cidrs-gen/main.go"
start=40
end=110
with open(path,'r',encoding='utf-8') as f:
for i,line in enumerate(f, start=1):
if start <= i <= end:
print(f"{i:4d} {line.rstrip()}")
if i > end:
break
PY
echo "== fetchEndpoint section (nearby) =="
python3 - <<'PY'
import itertools
path="cmd/cloudflare-cidrs-gen/main.go"
# scan for fetchEndpoint and print around it
target="func fetchEndpoint"
lines=open(path,'r',encoding='utf-8').read().splitlines()
for idx,l in enumerate(lines, start=1):
if target in l:
start=max(1,idx-20); end=min(len(lines),idx+120)
for i in range(start,end+1):
print(f"{i:4d} {lines[i-1]}")
break
else:
print("fetchEndpoint not found")
PY
echo "== gperr usage/import in this file =="
rg -n "gperr\\.|goutils/errs|goutils/synk" "$FILE" || true
rg -n "import \\(" "$FILE" -n || true
python3 - <<'PY'
path="cmd/cloudflare-cidrs-gen/main.go"
lines=open(path,'r',encoding='utf-8').read().splitlines()
# print import block
in_import=False
for i,l in enumerate(lines, start=1):
if l.strip().startswith("import"):
in_import=True
if in_import:
print(f"{i:4d} {l}")
if l.strip()==")":
break
PY
echo "== gperr usage patterns across repo (limited) =="
rg -n "gperr\\.Multiline\\(" -S . | head -n 50 || true
rg -n "gperr\\.NewBuilder\\(" -S . | head -n 50 || true
rg -n "gperr\\.PrependSubject\\(" -S . | head -n 50 || true
rg -n "gperr\\.NewGroup\\(" -S . | head -n 50 || true
rg -n "goutils/errs" -S . | head -n 50 || trueRepository: yusing/godoxy
Length of output: 16951
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "== AGENTS.md (gperr guidelines section) =="
rg -n "gperr\\.Multiline\\(\\)|multi attempts|gperr\\.NewBuilder\\(\\)|gperr\\.PrependSubject\\(" AGENTS.md
python3 - <<'PY'
import re
path="AGENTS.md"
lines=open(path,'r',encoding='utf-8').read().splitlines()
# print around gperr guidelines mentions
for i,l in enumerate(lines, start=1):
if "gperr.Multiline" in l or "gperr.NewBuilder" in l or "gperr.PrependSubject" in l:
start=max(1,i-8); end=min(len(lines), i+8)
for j in range(start,end+1):
print(f"{j:4d} {lines[j-1]}")
print("-----")
PY
echo "== Example callsite: internal/notif/base.go around line 40 =="
python3 - <<'PY'
path="internal/notif/base.go"
lines=open(path,'r',encoding='utf-8').read().splitlines()
for i,l in enumerate(lines, start=1):
if i>=25 and i<=70:
print(f"{i:4d} {l}")
PY
echo "== Example callsite: internal/acl/matcher.go around line 32 =="
python3 - <<'PY'
path="internal/acl/matcher.go"
lines=open(path,'r',encoding='utf-8').read().splitlines()
for i,l in enumerate(lines, start=1):
if i>=1 and i<=90:
print(f"{i:4d} {l}")
PYRepository: yusing/godoxy
Length of output: 5784
Aggregate IPv4+IPv6 fetch errors with gperr.Multiline()
fetchCloudflareCIDRs() returns immediately on the first fetchEndpoint() failure (lines 65-67), so you never see both IPv4 and IPv6 error details. Fetch both endpoints, collect both failures, and return a multiline gperr error (labeling each line by endpoint/protocol).
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@cmd/cloudflare-cidrs-gen/main.go` around lines 61 - 74, fetchCloudflareCIDRs
currently returns on the first fetchEndpoint error; change it to attempt both
endpoints (cfIPv4CIDRsEndpoint and cfIPv6CIDRsEndpoint), collect any errors
keyed/labelled by endpoint (or "ipv4"/"ipv6"), and only return a combined
gperr.Multiline error if one or more fetches failed; if both succeed concatenate
their bodies with a single newline between them (preserving the existing newline
logic) and return the bytes. Use the existing function names
fetchCloudflareCIDRs and fetchEndpoint and construct the multiline error via
gperr.Multiline([]error{...}) or by building per-endpoint error lines and
passing them to gperr.Multiline so callers see both IPv4 and IPv6 failure
details.
| dockerHealthInspect = func(ctx context.Context, client *docker.SharedClient, containerID string) (container.State, error) { | ||
| // the actual inspect response is intercepted and returned as RequestInterceptedError | ||
| _, err := client.ContainerInspect(ctx, containerID) | ||
|
|
||
| var interceptedErr *httputils.RequestInterceptedError | ||
| if !httputils.AsRequestInterceptedError(err, &interceptedErr) { | ||
| return container.State{}, err | ||
| } | ||
|
|
||
| if interceptedErr == nil || interceptedErr.Data == nil { // should not happen | ||
| return container.State{}, errors.New("intercepted error is nil or data is nil") | ||
| } | ||
|
|
||
| return interceptedErr.Data.(container.State), nil | ||
| } |
There was a problem hiding this comment.
Use gperr for error handling per coding guidelines.
The function returns plain errors without using gperr for nested error handling. Lines 34 and 38 should use gperr.PrependSubject() or similar to add context to errors.
♻️ Proposed refactor to use gperr
dockerHealthInspect = func(ctx context.Context, client *docker.SharedClient, containerID string) (container.State, error) {
// the actual inspect response is intercepted and returned as RequestInterceptedError
_, err := client.ContainerInspect(ctx, containerID)
var interceptedErr *httputils.RequestInterceptedError
if !httputils.AsRequestInterceptedError(err, &interceptedErr) {
- return container.State{}, err
+ return container.State{}, gperr.PrependSubject(err, "docker inspect")
}
if interceptedErr == nil || interceptedErr.Data == nil { // should not happen
- return container.State{}, errors.New("intercepted error is nil or data is nil")
+ return container.State{}, gperr.New("intercepted error is nil or data is nil")
}
+ state, ok := interceptedErr.Data.(container.State)
+ if !ok {
+ return container.State{}, gperr.New("intercepted data is not container.State")
+ }
- return interceptedErr.Data.(container.State), nil
+ return state, nil
}As per coding guidelines: Use gperr (goutils/errs) for nested error handling: gperr.Multiline() for multi-attempt errors, gperr.NewBuilder() for collecting errors, gperr.NewGroup() + group.Go() for concurrent error collection, and gperr.PrependSubject() to prepend subject identifiers.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| dockerHealthInspect = func(ctx context.Context, client *docker.SharedClient, containerID string) (container.State, error) { | |
| // the actual inspect response is intercepted and returned as RequestInterceptedError | |
| _, err := client.ContainerInspect(ctx, containerID) | |
| var interceptedErr *httputils.RequestInterceptedError | |
| if !httputils.AsRequestInterceptedError(err, &interceptedErr) { | |
| return container.State{}, err | |
| } | |
| if interceptedErr == nil || interceptedErr.Data == nil { // should not happen | |
| return container.State{}, errors.New("intercepted error is nil or data is nil") | |
| } | |
| return interceptedErr.Data.(container.State), nil | |
| } | |
| dockerHealthInspect = func(ctx context.Context, client *docker.SharedClient, containerID string) (container.State, error) { | |
| // the actual inspect response is intercepted and returned as RequestInterceptedError | |
| _, err := client.ContainerInspect(ctx, containerID) | |
| var interceptedErr *httputils.RequestInterceptedError | |
| if !httputils.AsRequestInterceptedError(err, &interceptedErr) { | |
| return container.State{}, gperr.PrependSubject(err, "docker inspect") | |
| } | |
| if interceptedErr == nil || interceptedErr.Data == nil { // should not happen | |
| return container.State{}, gperr.New("intercepted error is nil or data is nil") | |
| } | |
| state, ok := interceptedErr.Data.(container.State) | |
| if !ok { | |
| return container.State{}, gperr.New("intercepted data is not container.State") | |
| } | |
| return state, nil | |
| } |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@internal/health/check/docker.go` around lines 28 - 42, The function
dockerHealthInspect should wrap returned errors with gperr to add context:
replace the plain return of err when httputils.AsRequestInterceptedError fails
with a gperr-prepended error (e.g., gperr.PrependSubject(err,
"dockerHealthInspect") or build a new gperr error around err) and likewise
replace errors.New("intercepted error is nil or data is nil") with a gperr error
that includes the same subject (e.g.,
gperr.PrependSubject(gperr.New("intercepted error is nil or data is nil"),
"dockerHealthInspect")); keep the existing checks around
httputils.AsRequestInterceptedError and the interceptedErr.Data cast but return
gperr-wrapped errors instead of plain errors.
| return container.State{}, errors.New("intercepted error is nil or data is nil") | ||
| } | ||
|
|
||
| return interceptedErr.Data.(container.State), nil |
There was a problem hiding this comment.
Add safety check for type assertion to prevent panic.
The type assertion interceptedErr.Data.(container.State) can panic if the intercepted data is not of the expected type. Use the comma-ok idiom to handle this gracefully.
🛡️ Proposed fix to add safe type assertion
- return interceptedErr.Data.(container.State), nil
+ state, ok := interceptedErr.Data.(container.State)
+ if !ok {
+ return container.State{}, errors.New("intercepted data is not container.State")
+ }
+ return state, nil📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| return interceptedErr.Data.(container.State), nil | |
| state, ok := interceptedErr.Data.(container.State) | |
| if !ok { | |
| return container.State{}, errors.New("intercepted data is not container.State") | |
| } | |
| return state, nil |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@internal/health/check/docker.go` at line 41, The current return uses a direct
type assertion interceptedErr.Data.(container.State) which can panic; update the
code where interceptedErr is used (the return in the function that returns
(container.State, error)) to perform a safe comma-ok type assertion: attempt to
cast interceptedErr.Data to container.State using v, ok :=
interceptedErr.Data.(container.State) and if ok return v, nil, otherwise return
a zero-value container.State and an explanatory error (or wrap interceptedErr)
to avoid panic and surface the mismatch.
| func refreshCloudflareCIDRs() { | ||
| if !cfCIDRsRefreshing.CompareAndSwap(false, true) { | ||
| return | ||
| } | ||
| go func() { | ||
| defer cfCIDRsRefreshing.Store(false) | ||
|
|
||
| if common.IsTest { | ||
| cfCIDRs = localCIDRs | ||
| cfCIDRsMu.Lock() | ||
| defer cfCIDRsMu.Unlock() | ||
|
|
||
| refreshCloudflareCIDRsLocked() | ||
| }() |
There was a problem hiding this comment.
🛠️ Refactor suggestion | 🟠 Major | 🏗️ Heavy lift
Use task.RootTask() (with OnFinished()/OnCancel()) for the Cloudflare CIDR refresh goroutine.
In internal/net/gphttp/middleware/cloudflare_real_ip.go (lines 135-142), refreshCloudflareCIDRs() launches an unmanaged go func(); integrate the refresh worker into the task.RootTask()/parent.Subtask() lifecycle so cancellation and cleanup are handled via OnFinished()/OnCancel() rather than running independently.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@internal/net/gphttp/middleware/cloudflare_real_ip.go` around lines 131 - 142,
The refreshCloudflareCIDRs function currently spawns an unmanaged goroutine;
change it to register a subtask on the root task (use task.RootTask().Subtask or
create a parent := task.RootTask() and parent.Subtask()) and run
refreshCloudflareCIDRsLocked inside that subtask so it participates in the task
lifecycle; attach cleanup with subtask.OnFinished() (or OnCancel()) to call
cfCIDRsRefreshing.Store(false) and ensure cfCIDRsMu.Lock()/Unlock() still
surround refreshCloudflareCIDRsLocked as before, and keep the CompareAndSwap
check on cfCIDRsRefreshing to avoid duplicate runs.
Treat a missing intercepted inspect response as an error and expand docker healthcheck coverage for healthy, stopped, missing-health, and threshold-exceeded cases. Also avoid mutating shared Cloudflare real IP config during CIDR refresh, reject non-200 CIDR fetches, and keep docker watcher retry errors non-blocking when the error channel is full.
Ignore invalid real IP header values and untrusted remote addrs when resolving client IPs. Preserve the original remote port when a trusted header updates RemoteAddr. Add coverage for invalid header inputs and update Cloudflare real IP tests to assert port-preserving behavior.
b4bc80b to
44c3e53
Compare
Replace the generated Go seed with an embedded cloudflare_real_ip_seed.txt snapshot and remove the CIDR generator command. Add a script and scheduled GitHub workflow to refresh the snapshot and verify the embedded seed still parses before committing updates.
44c3e53 to
29cce78
Compare
|
✅ Actions performedReview triggered.
|
There was a problem hiding this comment.
Actionable comments posted: 5
♻️ Duplicate comments (2)
internal/health/check/docker.go (2)
44-44:⚠️ Potential issue | 🔴 Critical | ⚡ Quick winAvoid panic-prone type assertion in inspect result.
Line 44 uses a direct assertion (
interceptedErr.Data.(container.State)), which can panic on unexpected interceptor payloads.🛡️ Proposed fix
- return interceptedErr.Data.(container.State), nil + state, ok := interceptedErr.Data.(container.State) + if !ok { + return container.State{}, gperr.PrependSubject( + gperr.New("intercepted data is not container.State"), + "dockerHealthInspect", + ) + } + return state, nil🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@internal/health/check/docker.go` at line 44, Replace the panic-prone direct type assertion on interceptedErr.Data with a safe check: verify interceptedErr is non-nil, perform a type assertion using the comma-ok form (e.g., v, ok := interceptedErr.Data.(container.State)), and if ok return v, nil otherwise return the zero-value of container.State and a descriptive error; update any callers expecting an error to propagate this new error. Ensure you reference the interceptedErr variable, its Data field, and the container.State type when implementing the fix.
33-41:⚠️ Potential issue | 🟠 Major | ⚡ Quick winWrap helper errors with
gperrsubject context.
dockerHealthInspectstill returns raw/standard errors on nested paths, which drops subject context and violates project error-handling conventions.♻️ Proposed fix
import ( "context" "errors" "net/http" "time" "github.com/docker/docker/api/types/container" "github.com/yusing/godoxy/internal/docker" "github.com/yusing/godoxy/internal/types" + gperr "github.com/yusing/goutils/errs" httputils "github.com/yusing/goutils/http" strutils "github.com/yusing/goutils/strings" ) @@ var interceptedErr *httputils.RequestInterceptedError if !httputils.AsRequestInterceptedError(err, &interceptedErr) { if err == nil { - err = errors.New("inspect response was not intercepted") + err = gperr.New("inspect response was not intercepted") } - return container.State{}, err + return container.State{}, gperr.PrependSubject(err, "dockerHealthInspect") } if interceptedErr == nil || interceptedErr.Data == nil { // should not happen - return container.State{}, errors.New("intercepted error is nil or data is nil") + return container.State{}, gperr.PrependSubject( + gperr.New("intercepted error is nil or data is nil"), + "dockerHealthInspect", + ) }As per coding guidelines, "Use
gperr(goutils/errs) for nested error handling ... andgperr.PrependSubject()to prepend subject identifiers."🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@internal/health/check/docker.go` around lines 33 - 41, The branch in dockerHealthInspect that handles non-intercepted or nil-intercepted responses returns plain errors and must be converted to use gperr with subject context; replace the raw errors (the errors.New("inspect response was not intercepted") and errors.New("intercepted error is nil or data is nil")) and any returned err from httputils.AsRequestInterceptedError handling with gperr-wrapped errors and call gperr.PrependSubject(err, "docker") (or an appropriate subject like "dockerHealthInspect") before returning so the subject context is preserved; ensure you also wrap the existing err variable when err != nil using gperr.Wrap or gperr.PrependSubject rather than returning a plain error.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In @.github/workflows/cloudflare-cidrs.yml:
- Around line 8-10: Top-level workflow permissions currently grant contents:
write; change the workflow to keep top-level permissions read-only (e.g.,
permissions: contents: read) and move the more-permissive contents: write to
only the job that needs it (the update job) by adding a permissions block under
that job with contents: write; ensure no other jobs retain write scope so only
the update job has elevated permissions.
- Around line 3-7: Add a top-level concurrency block to the workflow to prevent
overlapping runs when the scheduled cron and manual workflow_dispatch both
trigger; update the YAML that currently contains the on: schedule and
workflow_dispatch keys by adding a concurrency: group: "cloudflare-cidrs" (or
similar unique group name) and cancel-in-progress: true so in-progress runs are
cancelled when a new run starts, preventing commit/push races.
- Around line 16-23: Pin the mutable GitHub Action references to immutable
commit SHAs by replacing uses: actions/checkout@v4 and uses: actions/setup-go@v5
with their corresponding full commit SHAs (e.g., uses:
actions/checkout@<FULL_SHA> and uses: actions/setup-go@<FULL_SHA>); locate the
two occurrences in the workflow where those strings appear and swap the tag for
the repo commit SHA while preserving the existing with: blocks (ref: ${{
github.ref_name }} and go-version-file: go.mod). Ensure you fetch the exact
commit SHAs from the official actions/repos before committing.
In `@internal/watcher/docker_watcher.go`:
- Around line 216-219: Replace the fmt.Errorf wrapping sent to errCh with
gperr.PrependSubject to follow repo error-style: call
gperr.PrependSubject("docker watcher", err) (or the equivalent PrependSubject
usage) instead of fmt.Errorf("docker watcher: %w", err) and add an import for
the gperr package; also update the other occurrence that wraps client reinit
errors (the earlier send to errCh) to use gperr.PrependSubject as well so both
places use the same convention (refer to the errCh send sites and the docker
watcher reinitialization logic to locate the changes).
In `@scripts/update-cloudflare-cidrs.sh`:
- Around line 8-14: The script currently runs curl inside a pipeline (the for
loop calling curl then piping to awk) under /bin/sh with set -eu, which allows
curl failures to be masked by awk; change the shebang to a POSIX-compatible bash
(e.g., #!/usr/bin/env bash) and enable strict pipe failure by adding set -euo
pipefail (or set -eu; set -o pipefail) near the top, so any failed curl in the
loop (the curl in the for ... done | awk 'NF { print }' > "$tmp" pipeline) will
cause the script to fail fast and avoid writing a partial "$tmp".
---
Duplicate comments:
In `@internal/health/check/docker.go`:
- Line 44: Replace the panic-prone direct type assertion on interceptedErr.Data
with a safe check: verify interceptedErr is non-nil, perform a type assertion
using the comma-ok form (e.g., v, ok := interceptedErr.Data.(container.State)),
and if ok return v, nil otherwise return the zero-value of container.State and a
descriptive error; update any callers expecting an error to propagate this new
error. Ensure you reference the interceptedErr variable, its Data field, and the
container.State type when implementing the fix.
- Around line 33-41: The branch in dockerHealthInspect that handles
non-intercepted or nil-intercepted responses returns plain errors and must be
converted to use gperr with subject context; replace the raw errors (the
errors.New("inspect response was not intercepted") and errors.New("intercepted
error is nil or data is nil")) and any returned err from
httputils.AsRequestInterceptedError handling with gperr-wrapped errors and call
gperr.PrependSubject(err, "docker") (or an appropriate subject like
"dockerHealthInspect") before returning so the subject context is preserved;
ensure you also wrap the existing err variable when err != nil using gperr.Wrap
or gperr.PrependSubject rather than returning a plain error.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: fc13e4a5-99a1-4fe1-bdda-21767ba2bc5e
📒 Files selected for processing (11)
.github/workflows/cloudflare-cidrs.ymlinternal/health/check/docker.gointernal/health/check/docker_test.gointernal/net/gphttp/middleware/cloudflare_real_ip.gointernal/net/gphttp/middleware/cloudflare_real_ip_seed.gointernal/net/gphttp/middleware/cloudflare_real_ip_seed.txtinternal/net/gphttp/middleware/cloudflare_real_ip_test.gointernal/net/gphttp/middleware/real_ip.gointernal/net/gphttp/middleware/real_ip_test.gointernal/watcher/docker_watcher.goscripts/update-cloudflare-cidrs.sh
Add workflow-level concurrency for cloudflare-cidrs updates and pin checkout/setup-go actions by commit SHA. Reduce top-level contents permission to read and grant write access only to the update job.
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
internal/net/gphttp/middleware/cloudflare_real_ip.go (1)
57-61: 🧹 Nitpick | 🔵 Trivial | ⚡ Quick winUse
gperrfor combined Cloudflare IPv4/IPv6 endpoint errors (instead oferrors.Join)In
loadCloudflareCIDRs, the IPv4 and IPv6 fetch errors are merged viaerrors.Join(ipv4Err, ipv6Err);—switching to the repo’sgperrnested-error patterns (e.g.,gperr.NewBuilder()+gperr.PrependSubject()to label the IPv4 vs IPv6 endpoints) would keep error formatting/propagation consistent.loadCloudflareCIDRs = func() ([]*nettypes.CIDR, error) { ipv4CIDRs, ipv4Err := fetchCloudflareCIDRs(cfIPv4CIDRsEndpoint) ipv6CIDRs, ipv6Err := fetchCloudflareCIDRs(cfIPv6CIDRsEndpoint) if err := errors.Join(ipv4Err, ipv6Err); err != nil { return nil, err }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@internal/net/gphttp/middleware/cloudflare_real_ip.go` around lines 57 - 61, Replace the errors.Join usage in loadCloudflareCIDRs with the repository gperr pattern: when fetchCloudflareCIDRs returns ipv4Err and ipv6Err, build a combined error using gperr.NewBuilder(), call gperr.PrependSubject(...) (or equivalent) to label the ipv4Err with cfIPv4CIDRsEndpoint and the ipv6Err with cfIPv6CIDRsEndpoint, add both errors to the builder, and return the built gperr error instead of errors.Join; keep the rest of loadCloudflareCIDRs and uses of fetchCloudflareCIDRs unchanged.
♻️ Duplicate comments (5)
.github/workflows/cloudflare-cidrs.yml (3)
8-13:⚠️ Potential issue | 🟠 Major | ⚡ Quick winScope write permission to the
updatejob only.
contents: writeat workflow scope is broader than needed. Keep workflow-level permissions read-only and grant write only where commit/push happens.Suggested fix
permissions: - contents: write + contents: read jobs: update: + permissions: + contents: write runs-on: ubuntu-latest🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In @.github/workflows/cloudflare-cidrs.yml around lines 8 - 13, Move the broad workflow-level write permission to the specific job that needs it: change the top-level permissions from "contents: write" to "contents: read" (or remove if implicit read) and add "permissions: contents: write" under the "update" job block so only the update job has write access; update the YAML entries referencing the top-level permissions and the "update" job to reflect this scoping.
16-23:⚠️ Potential issue | 🟠 Major | ⚡ Quick winPin third-party actions to immutable commit SHAs.
actions/checkout@v4andactions/setup-go@v5are mutable tags; pin to full SHAs.What are the current full commit SHAs for: 1) actions/checkout v4 2) actions/setup-go v5 recommended for GitHub Actions workflow pinning?🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In @.github/workflows/cloudflare-cidrs.yml around lines 16 - 23, The workflow uses mutable tags actions/checkout@v4 and actions/setup-go@v5; replace those with the corresponding immutable commit SHAs from each action's GitHub repo (use the full 40-character commit SHA for actions/checkout and actions/setup-go). To get the SHAs, open github.com/actions/checkout and github.com/actions/setup-go, find the v4 and v5 tags/releases (or use the repo’s refs/tags endpoints) and copy the full commit SHA for each tag, then substitute the tag references in the workflow with the full SHAs (e.g., actions/checkout@<full-sha> and actions/setup-go@<full-sha>).
3-7:⚠️ Potential issue | 🟠 Major | ⚡ Quick winAdd workflow concurrency to prevent overlapping update runs.
Scheduled and manual runs can overlap and race on commit/push to the same branch.
Suggested fix
on: schedule: - cron: "17 3 * * 1" workflow_dispatch: +concurrency: + group: cloudflare-cidrs-${{ github.ref }} + cancel-in-progress: false🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In @.github/workflows/cloudflare-cidrs.yml around lines 3 - 7, The workflow can run overlapping scheduled and manual jobs; add a top-level concurrency stanza (adjacent to the existing on/schedule/workflow_dispatch keys) to serialize runs and cancel in-progress duplicates—for example use a unique group like "cloudflare-cidrs-update-${{ github.ref }}" or "${{ github.workflow }}-${{ github.ref }}" and set cancel-in-progress: true so the existing running job is cancelled when a new run starts; update the YAML to include this concurrency block to prevent race/overlap with schedule and workflow_dispatch.internal/net/gphttp/middleware/cloudflare_real_ip.go (1)
138-149: 🛠️ Refactor suggestion | 🟠 Major | 🏗️ Heavy liftRun CIDR refresh in the task lifecycle, not as an unmanaged goroutine.
The detached goroutine bypasses structured cancellation/cleanup hooks; integrate this worker via
task.RootTask()/Subtask()and cleanup viaOnFinished()/OnCancel().As per coding guidelines,
**/*.go: "Usetask.RootTask()for background operations andparent.Subtask()for nesting, withOnFinished()andOnCancel()for cleanup".🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@internal/net/gphttp/middleware/cloudflare_real_ip.go` around lines 138 - 149, The function refreshCloudflareCIDRs currently spawns an unmanaged goroutine; change it to run as a managed background task using task.RootTask() and create a subtask (parent.Subtask()) for the refresh so cancellation/cleanup are handled; keep the existing concurrency guards (cfCIDRsRefreshing, cfCIDRsMu) and call refreshCloudflareCIDRsLocked() from inside the subtask, and register cleanup via subtask.OnFinished() and subtask.OnCancel() to ensure cfCIDRsRefreshing is reset (cfCIDRsRefreshing.Store(false)) and the mutex is unlocked if needed.scripts/update-cloudflare-cidrs.sh (1)
1-14:⚠️ Potential issue | 🟠 Major | ⚡ Quick winDon’t let pipeline semantics hide
curlfailures.With
/bin/shand nopipefail, a failedcurlinfor ... done | awk ...can still produce exit code 0 fromawk.Suggested fix (POSIX-safe, no bash dependency)
tmp="$(mktemp)" -trap 'rm -f "$tmp"' EXIT +tmp_raw="$(mktemp)" +trap 'rm -f "$tmp" "$tmp_raw"' EXIT +: > "$tmp_raw" for endpoint in \ https://www.cloudflare.com/ips-v4 \ https://www.cloudflare.com/ips-v6 do - curl -fsSL "$endpoint" - printf '\n' -done | awk 'NF { print }' > "$tmp" + curl -fsSL "$endpoint" >> "$tmp_raw" + printf '\n' >> "$tmp_raw" +done + +awk 'NF { print }' "$tmp_raw" > "$tmp"🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@scripts/update-cloudflare-cidrs.sh` around lines 1 - 14, The script currently pipes multiple curl calls into awk so a failing curl can be masked; change the loop to call curl for each endpoint and append its output (and a separating newline from printf) into the temporary file "$tmp", immediately check curl's exit status after each invocation and exit non‑zero (with an error message) if any curl fails, then run the awk 'NF { print }' filter on "$tmp" and write the result to "$out"; reference the loop variable "endpoint", the temp file "tmp", and the output variable "out" when making these changes.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@Makefile`:
- Around line 146-150: Update the Makefile help text so it accurately documents
the test targets: change the description that currently claims `test` runs with
`-race` to reflect that `test` runs without `-race`, and add an entry for the
new `race-test` target; update any lines under the `help` target that reference
`test`/`-race` to mention `race-test` explicitly and clarify which target runs
with the race detector (i.e., `race-test` uses `-race` while `test` does not).
---
Outside diff comments:
In `@internal/net/gphttp/middleware/cloudflare_real_ip.go`:
- Around line 57-61: Replace the errors.Join usage in loadCloudflareCIDRs with
the repository gperr pattern: when fetchCloudflareCIDRs returns ipv4Err and
ipv6Err, build a combined error using gperr.NewBuilder(), call
gperr.PrependSubject(...) (or equivalent) to label the ipv4Err with
cfIPv4CIDRsEndpoint and the ipv6Err with cfIPv6CIDRsEndpoint, add both errors to
the builder, and return the built gperr error instead of errors.Join; keep the
rest of loadCloudflareCIDRs and uses of fetchCloudflareCIDRs unchanged.
---
Duplicate comments:
In @.github/workflows/cloudflare-cidrs.yml:
- Around line 8-13: Move the broad workflow-level write permission to the
specific job that needs it: change the top-level permissions from "contents:
write" to "contents: read" (or remove if implicit read) and add "permissions:
contents: write" under the "update" job block so only the update job has write
access; update the YAML entries referencing the top-level permissions and the
"update" job to reflect this scoping.
- Around line 16-23: The workflow uses mutable tags actions/checkout@v4 and
actions/setup-go@v5; replace those with the corresponding immutable commit SHAs
from each action's GitHub repo (use the full 40-character commit SHA for
actions/checkout and actions/setup-go). To get the SHAs, open
github.com/actions/checkout and github.com/actions/setup-go, find the v4 and v5
tags/releases (or use the repo’s refs/tags endpoints) and copy the full commit
SHA for each tag, then substitute the tag references in the workflow with the
full SHAs (e.g., actions/checkout@<full-sha> and actions/setup-go@<full-sha>).
- Around line 3-7: The workflow can run overlapping scheduled and manual jobs;
add a top-level concurrency stanza (adjacent to the existing
on/schedule/workflow_dispatch keys) to serialize runs and cancel in-progress
duplicates—for example use a unique group like "cloudflare-cidrs-update-${{
github.ref }}" or "${{ github.workflow }}-${{ github.ref }}" and set
cancel-in-progress: true so the existing running job is cancelled when a new run
starts; update the YAML to include this concurrency block to prevent
race/overlap with schedule and workflow_dispatch.
In `@internal/net/gphttp/middleware/cloudflare_real_ip.go`:
- Around line 138-149: The function refreshCloudflareCIDRs currently spawns an
unmanaged goroutine; change it to run as a managed background task using
task.RootTask() and create a subtask (parent.Subtask()) for the refresh so
cancellation/cleanup are handled; keep the existing concurrency guards
(cfCIDRsRefreshing, cfCIDRsMu) and call refreshCloudflareCIDRsLocked() from
inside the subtask, and register cleanup via subtask.OnFinished() and
subtask.OnCancel() to ensure cfCIDRsRefreshing is reset
(cfCIDRsRefreshing.Store(false)) and the mutex is unlocked if needed.
In `@scripts/update-cloudflare-cidrs.sh`:
- Around line 1-14: The script currently pipes multiple curl calls into awk so a
failing curl can be masked; change the loop to call curl for each endpoint and
append its output (and a separating newline from printf) into the temporary file
"$tmp", immediately check curl's exit status after each invocation and exit
non‑zero (with an error message) if any curl fails, then run the awk 'NF { print
}' filter on "$tmp" and write the result to "$out"; reference the loop variable
"endpoint", the temp file "tmp", and the output variable "out" when making these
changes.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 2c833752-6658-44fa-86ce-0a2ef1a3be8f
📒 Files selected for processing (8)
.github/workflows/cloudflare-cidrs.yml.gitignoreMakefileinternal/net/gphttp/middleware/cloudflare_real_ip.gointernal/net/gphttp/middleware/cloudflare_real_ip_seed.gointernal/net/gphttp/middleware/cloudflare_real_ip_seed.txtinternal/net/gphttp/middleware/cloudflare_real_ip_test.goscripts/update-cloudflare-cidrs.sh
💤 Files with no reviewable changes (1)
- .gitignore
| test: ensure-webui-dist | ||
| go test -v ${BUILD_FLAGS} ./internal/... | ||
|
|
||
| race-test: ensure-webui-dist | ||
| CGO_ENABLED=1 go test -v -race ${BUILD_FLAGS} ./internal/... |
There was a problem hiding this comment.
Update help output to match the new test targets.
test no longer runs with -race, but help still documents it that way and doesn’t mention race-test.
Suggested fix
- ' test go test -race internal/...' \
+ ' test go test internal/...' \
+ ' race-test go test -race internal/...' \🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@Makefile` around lines 146 - 150, Update the Makefile help text so it
accurately documents the test targets: change the description that currently
claims `test` runs with `-race` to reflect that `test` runs without `-race`, and
add an entry for the new `race-test` target; update any lines under the `help`
target that reference `test`/`-race` to mention `race-test` explicitly and
clarify which target runs with the race detector (i.e., `race-test` uses `-race`
while `test` does not).
Switch the updater to bash with pipefail and stage curl output through a raw temp file before filtering blank lines. This keeps cleanup explicit for both temp files and avoids relying on a pipeline to build the final snapshot.
|
Recreate Docker watcher clients after stream disconnects, factor docker health inspect handling for tests, and make health monitor jitter delay cancellation-aware. Add coverage for stopped container states, inspect failures, monitor teardown, and watcher reconnect behavior. Seed Cloudflare real IP CIDRs from an embedded snapshot, refresh them asynchronously with retry backoff, and avoid fetching when the CF-Connecting-IP header is absent. Harden real IP parsing to ignore invalid values and preserve the original remote port when rewriting the client address. Add the Cloudflare CIDR snapshot seed, update script, and scheduled workflow to refresh it, verify the embedded data in tests, scope workflow write permissions, add a non-race internal test target, and fix repo-root ignore patterns.



Summary by CodeRabbit
New Features
Bug Fixes
Tests
Chores