feat(entrypoint): SNI TLS passthrough on the HTTPS listener#225
Conversation
Peek ClientHello SNI before termination: tcp routes with sni_hosts proxy the byte stream to upstream; everything else is queued into normal HTTPS handling. When the SNI mux accepts, skip ACL and PROXY protocol on the inner HTTPS server. Extend Route with sni_hosts, add ConnProxy to tcp stream forwarding, update config.example.yml and stream README, and cover behavior with unit and integration tests.
📝 WalkthroughWalkthroughThis PR implements SNI TLS passthrough routing for TCP streams on the HTTPS entrypoint, refactors certificate acquisition into a standalone binary, adds an in-process HTTP handler registry for rule dispatch, removes in-process DNS provider initialization, and updates build infrastructure accordingly. Changes
Sequence DiagramsequenceDiagram
participant Client as TLS Client
participant Entrypoint as Entrypoint
participant SNIMux as SNI Mux
participant StreamRoute as Stream Route
participant Proxy as Route Proxy
participant Fallback as HTTPS Listener<br/>(Fallback)
Client->>Entrypoint: Connect to HTTPS listener
Entrypoint->>SNIMux: Accept connection
SNIMux->>SNIMux: Read TLS ClientHello<br/>(with deadline)
SNIMux->>SNIMux: Extract ServerName (SNI)
SNIMux->>SNIMux: Normalize SNI<br/>(lowercase, trim port)
alt SNI matches registered route
SNIMux->>StreamRoute: Lookup route by alias/wildcard
StreamRoute-->>SNIMux: Route found
SNIMux->>Proxy: ProxyConn(ctx, conn)
Proxy->>Proxy: Replay ClientHello bytes
Proxy-->>Client: Proxy connection<br/>(raw TCP)
else SNI not matched or extraction failed
SNIMux->>Fallback: Enqueue conn to fallback<br/>HTTPS listener
Fallback->>Fallback: Process as standard<br/>HTTPS request
Fallback-->>Client: HTTP response
end
Estimated code review effort🎯 4 (Complex) | ⏱️ ~75 minutes Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Review rate limit: 7/8 reviews remaining, refill in 7 minutes and 30 seconds.Comment |
There was a problem hiding this comment.
Actionable comments posted: 4
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@internal/entrypoint/sni_passthrough_integration_test.go`:
- Around line 171-176: The goroutine that reads the expected response uses
conn.Read which can return fewer bytes than requested; replace the raw Read call
with io.ReadFull(conn, buf) and handle the returned (n, err) properly (log or
fail the test on error) in the anonymous goroutine that references response and
conn; apply the same change to the second helper/goroutine at the other
occurrence (the block around lines 193-197) so both reads always fill the buffer
before writing the response.
In `@internal/entrypoint/sni_passthrough.go`:
- Around line 64-72: The AddRoute method currently calls
getOrStart(newSNIListenKey(...)) before checking that route.Stream() implements
nettypes.ConnProxy, which can bind the shared SNI listener even when the route
cannot proxy connections; update sniPassthroughManager.AddRoute to perform the
type assertion on route.Stream() (route.Stream().(nettypes.ConnProxy)) and
return the error if it fails before calling getOrStart/newSNIListenKey, then
proceed to call getOrStart and bind only after the proxy is validated (retain
error messages and use existing symbols sniPassthroughRoute,
sniPassthroughManager.AddRoute, getOrStart, newSNIListenKey).
- Around line 309-317: queuedListener.Close currently just closes l.ch leaving
any buffered connections open and leaking sockets; modify Close (the method on
queuedListener) so after setting l.closed and closing l.ch it drains the channel
by looping over its remaining entries (for conn := range l.ch) and calling
conn.Close() (or the appropriate Close method) for each, using l.closedMu and
l.closed as currently to guard state; this ensures all queued fallback
connections are closed on shutdown.
In `@internal/route/route.go`:
- Around line 352-361: Validation currently accepts SNIHosts values that will
never match (e.g., "*foo.example.com", "*.", "example.com:443"); inside the
usesSNIPassthrough() branch in the validation for SNIHosts (the block that
checks r.Scheme vs route.SchemeTCP and iterates r.SNIHosts), tighten checks to
reject any empty string, any value containing a colon, any value equal to "*" or
"*.", and any value that starts with "*" but not with "*."; only allow either an
exact hostname (no wildcard) or a wildcard prefixed exactly with "*." (e.g.,
"*.example.com"); return/add validation errors via errs.Adds when these invalid
patterns are found.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 42bc08e4-915c-4847-a741-db6caddcf41e
📒 Files selected for processing (13)
config.example.ymlinternal/entrypoint/entrypoint.gointernal/entrypoint/http_server.gointernal/entrypoint/routes.gointernal/entrypoint/sni_passthrough.gointernal/entrypoint/sni_passthrough_integration_test.gointernal/entrypoint/sni_passthrough_test.gointernal/net/types/stream.gointernal/route/route.gointernal/route/route_test.gointernal/route/stream/README.mdinternal/route/stream/tcp_tcp.gointernal/route/stream/tcp_tcp_test.go
| if r.usesSNIPassthrough() { | ||
| if r.Scheme != route.SchemeTCP { | ||
| errs.Adds("sni_hosts is only supported for tcp routes") | ||
| } | ||
| for _, host := range r.SNIHosts { | ||
| if host == "" { | ||
| errs.Adds("sni_hosts cannot contain empty host patterns") | ||
| break | ||
| } | ||
| } |
There was a problem hiding this comment.
Reject malformed sni_hosts values during validation.
internal/entrypoint/sni_passthrough.go only treats *.-prefixed entries as wildcards. Values like *foo.example.com, *. or example.com:443 pass here but will never match any TLS SNI, so the route silently falls back instead of failing fast.
Proposed validation tightening
if r.usesSNIPassthrough() {
if r.Scheme != route.SchemeTCP {
errs.Adds("sni_hosts is only supported for tcp routes")
}
for _, host := range r.SNIHosts {
if host == "" {
errs.Adds("sni_hosts cannot contain empty host patterns")
break
}
+ if strings.Contains(host, ":") {
+ errs.Adds("sni_hosts must not include a port")
+ break
+ }
+ if strings.Contains(host, "*") && !strings.HasPrefix(host, "*.") {
+ errs.Adds("sni_hosts wildcard patterns must start with '*.'")
+ break
+ }
+ if host == "*." {
+ errs.Adds("sni_hosts wildcard suffix cannot be empty")
+ break
+ }
}
}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| if r.usesSNIPassthrough() { | |
| if r.Scheme != route.SchemeTCP { | |
| errs.Adds("sni_hosts is only supported for tcp routes") | |
| } | |
| for _, host := range r.SNIHosts { | |
| if host == "" { | |
| errs.Adds("sni_hosts cannot contain empty host patterns") | |
| break | |
| } | |
| } | |
| if r.usesSNIPassthrough() { | |
| if r.Scheme != route.SchemeTCP { | |
| errs.Adds("sni_hosts is only supported for tcp routes") | |
| } | |
| for _, host := range r.SNIHosts { | |
| if host == "" { | |
| errs.Adds("sni_hosts cannot contain empty host patterns") | |
| break | |
| } | |
| if strings.Contains(host, ":") { | |
| errs.Adds("sni_hosts must not include a port") | |
| break | |
| } | |
| if strings.Contains(host, "*") && !strings.HasPrefix(host, "*.") { | |
| errs.Adds("sni_hosts wildcard patterns must start with '*.'") | |
| break | |
| } | |
| if host == "*." { | |
| errs.Adds("sni_hosts wildcard suffix cannot be empty") | |
| break | |
| } | |
| } | |
| } |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@internal/route/route.go` around lines 352 - 361, Validation currently accepts
SNIHosts values that will never match (e.g., "*foo.example.com", "*.",
"example.com:443"); inside the usesSNIPassthrough() branch in the validation for
SNIHosts (the block that checks r.Scheme vs route.SchemeTCP and iterates
r.SNIHosts), tighten checks to reject any empty string, any value containing a
colon, any value equal to "*" or "*.", and any value that starts with "*" but
not with "*."; only allow either an exact hostname (no wildcard) or a wildcard
prefixed exactly with "*." (e.g., "*.example.com"); return/add validation errors
via errs.Adds when these invalid patterns are found.
Expose optional `certificate_key_type` so lego uses the chosen leaf key type. Default stays EC256; EC384 and RSA sizes are accepted with common aliases. Validate unknown values during config validation, merge the field from extras, and extend README, config example, and tests.
Introduce `cmd/autocert` as its own Go module with an `obtain` CLI, wiring DNS provider init and autocert operations appropriate for that binary. Move ACME-heavy tests under `cmd/autocert`, relocate OVH DNS tests beside `internal/dnsproviders`, and refresh Docker/Makefile so deps stages fetch the nested modules and the image installs `/app/autocert` next to the main binary when building godoxy. Trim indirect pulls from `go.sum` and `agent/go.sum` and refactor `internal/autocert` (provider splits, state, README, and tests) so lego-backed logic stays behind the new boundary without dragging those deps into every consumer.
…lers (#226) Introduce the `handle <name>` command so rules can hand off a request to a handler registered in the in-process registry. Document it in the rules README and cover parsing plus end-to-end behavior in tests.
…PS listener Drop `sni_hosts` and wildcard SNI matching; register passthrough when a TCP route listens on `HTTPS_ADDR`, then map ClientHello `server_name` to the route alias (first label before a dot, or the full alias). Update config example and stream README for `port: listen:proxy` on the HTTPS entrypoint. Drain and close queued connections when the HTTPS passthrough listener closes; reject empty aliases. Remove related route validation and tests; refresh integration and unit tests. Bump `internal/go-oidc` and adjust `ObtainCertAll` mock to take `context.Context`.
|
There was a problem hiding this comment.
Actionable comments posted: 10
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
cmd/autocert/custom_test.go (1)
458-494:⚠️ Potential issue | 🟡 MinorMissing cleanup of generated certificate files.
The test creates certificate and key files in the
certs/directory (lines 469-471) but does not clean them up after the test completes. This can lead to test pollution and leftover files.Proposed fix
func TestObtainCertFromCustomProvider(t *testing.T) { // Create a test ACME server acmeServer := newTestACMEServer(t) defer acmeServer.Close() + t.Cleanup(func() { os.RemoveAll("certs") }) t.Run("obtain cert from custom step-ca server", func(t *testing.T) {🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cmd/autocert/custom_test.go` around lines 458 - 494, The test TestObtainCertFromCustomProvider writes files at cfg.CertPath, cfg.KeyPath and cfg.ACMEKeyPath but never removes them; add cleanup using t.Cleanup (or defer) immediately after cfg is created to remove cfg.CertPath, cfg.KeyPath and cfg.ACMEKeyPath (ignore non-existent-file errors) so obtainCert/TestObtainCertFromCustomProvider does not leave artifacts; reference the cfg variable and the path fields (CertPath, KeyPath, ACMEKeyPath) when adding the cleanup.internal/autocert/provider.go (1)
369-397:⚠️ Potential issue | 🟠 MajorSkip local/pseudo providers here instead of treating them as a failed wait.
ForceExpiryAll()already treats renewal as successful when any eligible provider enqueues work, butWaitRenewalDone()returnsfalseas soon as it hits a local/pseudo provider. In mixed configs, that makes the API report “no renewal worker enqueued” even after waiting on a real ACME renewal.🛠️ Proposed fix
func (p *Provider) WaitRenewalDone(ctx context.Context) bool { - if p.neverStartsRenewalWorker() { - if len(p.extraProviders) == 0 { - return false - } - for _, ep := range p.extraProviders { - if !ep.WaitRenewalDone(ctx) { - return false - } - } - return true - } - - if p.cfg.Provider != ProviderLocal && p.cfg.Provider != ProviderPseudo { + waited := false + if !p.neverStartsRenewalWorker() { done := p.forceRenewalDoneCh.Load() if done == nil || *done == nil { return false } select { case <-*done: + waited = true case <-ctx.Done(): return false } } for _, ep := range p.extraProviders { - if !ep.WaitRenewalDone(ctx) { - return false + if ep.WaitRenewalDone(ctx) { + waited = true } } - return true + return waited }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/autocert/provider.go` around lines 369 - 397, The loop in WaitRenewalDone treats extraProviders that are ProviderLocal or ProviderPseudo as a failed wait; change the logic in WaitRenewalDone so that when iterating p.extraProviders you skip providers whose cfg.Provider (or provider identifier) equals ProviderLocal or ProviderPseudo instead of returning false immediately; ensure the function still returns false if a non-local/pseudo provider’s WaitRenewalDone(ctx) returns false, and keep the existing behavior for the main provider checks (forceRenewalDoneCh) and for other providers.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@cmd/autocert/main.go`:
- Around line 30-41: The run function currently returns a generic "unknown
command" error for unrecognized args[0]; update run to return a helpful
usage/help message listing available commands (e.g., "obtain") and optionally
show usage syntax when args[0] is not recognized. Modify the default case in run
(and ensure runObtain remains unchanged) to construct an error string or help
output that includes the unrecognized command and the list of valid
commands/brief usage, so callers get actionable guidance.
In `@internal/autocert/forceexpiry_test.go`:
- Around line 58-68: The tests create a Provider without initializing its cfg,
causing a nil pointer panic when Provider.ForceExpiryAll dereferences p.cfg; fix
the tests by initializing cfg (e.g., set cfg: &Config{Provider: ProviderLocal})
on the Provider instances used in the test(s) that call ForceExpiryAll (also for
the extra provider), so Provider.cfg is non-nil before calling ForceExpiryAll;
ensure the same fix is applied to the other failing tests referenced.
In `@internal/autocert/provider_nolego.go`:
- Around line 12-14: The certState method on type Provider unconditionally
returns CertStateValid which bypasses renewal logic (affecting renew when using
renewModeIfNeeded); either implement proper certificate state detection or
document the intent: if this file is for no-lego builds and expiry is checked
elsewhere, add a clear comment above func (p *Provider) certState() explaining
that decision and where expiry is validated; otherwise change certState to
inspect the Provider's stored certificate (e.g., the parsed x509 leaf or expiry
timestamp available on p) and return
CertStateExpired/CertStateNearExpiry/CertStateValid accordingly so renew() and
renewModeIfNeeded behave correctly (referencing Provider.certState,
Provider.renew, and renewModeIfNeeded to locate call sites).
In `@internal/autocert/state_test.go`:
- Around line 35-40: The test incorrectly asserts the resolved binary path
equals "autocert" which breaks when obtainCertUsingBinary() returns an absolute
sibling path; update the assertion inside autocertCommandRunner to check the
basename of the resolved name (e.g., filepath.Base(name) == "autocert") or use
strings.HasSuffix to assert the binary name, and keep the existing assertions
that args contain "obtain" and certPath so the test validates the command and
arguments rather than the exact path.
In `@internal/autocert/state.go`:
- Around line 60-74: In obtainCertUsingBinary, when autocertCommandRunner
returns an error that matches exec.ErrNotFound you should preserve/wrap the
original error instead of discarding it: update the return to use fmt.Errorf to
include context (e.g., which binary/path was attempted via
resolveAutocertBinary()) and wrap the original err with %w; add "fmt" to
imports. Keep the existing behavior of returning non-nil non-ErrNotFound errors
directly and only wrap when ErrNotFound is detected.
In `@internal/dnsproviders/gen.py`:
- Line 97: The os.execvp("gofmt", [...]) call should resolve the full path to
the gofmt binary before executing to avoid PATH hijacking and improve
reproducibility: use shutil.which("gofmt") at the top of the routine (or before
the execvp call), check the result and raise/exit with a clear error if None,
then call os.execvp(resolved_path, [resolved_path, "-w", "providers.go",
"../autocert/provider_placeholders_gen.go"]); ensure shutil is imported if not
already and preserve the same argument list and behavior of the existing
os.execvp invocation.
In `@internal/entrypoint/sni_passthrough_integration_test.go`:
- Around line 182-201: The goroutine that accepts connections and reads/writes
using ln, conn and io.ReadFull currently calls t.Errorf from inside the
goroutine (unsafe because it may outlive the test); change this so errors are
reported back to the main test goroutine instead of calling t.Errorf directly —
e.g., create an error channel (chan error) that the inner goroutine sends any
read/write error to, close the channel when the listener loop exits, and have
the main test goroutine receive from that channel and call t.Errorf once as
needed; alternatively, if you prefer the simpler approach noted in the comment,
remove the t.Errorf calls from the goroutine and rely on the client-side
assertions to fail the test instead, ensuring no test API is invoked from
background goroutines.
In `@internal/entrypoint/sni_passthrough.go`:
- Around line 91-95: The code currently always uses common.ProxyHTTPSAddr when
calling m.getOrStart(newSNIListenKeyForAddr(...)) which ignores
route.ListenURL() and causes SNI muxes to be indexed incorrectly; change calls
that build the listen key (where newSNIListenKeyForAddr is invoked before
m.getOrStart) to use the route's own listen URL (route.ListenURL()) instead of
common.ProxyHTTPSAddr so that m.getOrStart/newSNIListenKeyForAddr and subsequent
mux.addRoute / mux.delRoute operate against the same listener; apply the same
fix to the corresponding delete/remove path that currently mirrors lines 98-100
so AddRoute/DelRoute and HTTPSListener(addr) speak to the same mux.
- Around line 350-355: Change the lookup order to try an exact SNI match before
falling back to the first-label alias: call m.routes.Load(serverName) first and
if it returns found/true return that target; only if the exact lookup fails, use
strings.Cut(serverName, ".") to extract alias and then attempt
m.routes.Load(alias) (returning that if found). Update the logic around
serverName, strings.Cut and m.routes.Load to implement this exact-then-alias
lookup.
In `@internal/route/rules/do.go`:
- Around line 187-196: Perform the handler lookup once during rule construction
instead of on every request: inside the build closure (the build: func(args any)
HandlerFunc { ... }) cast name := args.(string), call handler, ok :=
GetHandler(name) there and capture the returned handler variable; if ok is false
return a HandlerFunc that immediately returns fmt.Errorf("handler %q not found",
name) (or alternatively add the same GetHandler(name) check in the rule's
validate function to fail early), and in the normal case have the returned
HandlerFunc call handler.ServeHTTP(w, r) and return errTerminateRule so runtime
per-request lookups are eliminated.
---
Outside diff comments:
In `@cmd/autocert/custom_test.go`:
- Around line 458-494: The test TestObtainCertFromCustomProvider writes files at
cfg.CertPath, cfg.KeyPath and cfg.ACMEKeyPath but never removes them; add
cleanup using t.Cleanup (or defer) immediately after cfg is created to remove
cfg.CertPath, cfg.KeyPath and cfg.ACMEKeyPath (ignore non-existent-file errors)
so obtainCert/TestObtainCertFromCustomProvider does not leave artifacts;
reference the cfg variable and the path fields (CertPath, KeyPath, ACMEKeyPath)
when adding the cleanup.
In `@internal/autocert/provider.go`:
- Around line 369-397: The loop in WaitRenewalDone treats extraProviders that
are ProviderLocal or ProviderPseudo as a failed wait; change the logic in
WaitRenewalDone so that when iterating p.extraProviders you skip providers whose
cfg.Provider (or provider identifier) equals ProviderLocal or ProviderPseudo
instead of returning false immediately; ensure the function still returns false
if a non-local/pseudo provider’s WaitRenewalDone(ctx) returns false, and keep
the existing behavior for the main provider checks (forceRenewalDoneCh) and for
other providers.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: b4c9dcf3-f3c6-4418-be85-a0c747f47722
⛔ Files ignored due to path filters (4)
agent/go.sumis excluded by!**/*.sumcmd/autocert/go.sumis excluded by!**/*.sumgo.sumis excluded by!**/*.suminternal/dnsproviders/go.sumis excluded by!**/*.sum
📒 Files selected for processing (43)
DockerfileMakefilecmd/autocert/config_test.gocmd/autocert/custom_test.gocmd/autocert/go.modcmd/autocert/main.gocmd/autocert/multi_cert_test.gocmd/autocert/obtain.gocmd/autocert/obtain_test.gocmd/main.goconfig.example.ymldev.compose.ymlgo.modinternal/api/csrf_test.gointernal/api/handler.gointernal/autocert/README.mdinternal/autocert/config.gointernal/autocert/config_test.gointernal/autocert/forceexpiry_test.gointernal/autocert/provider.gointernal/autocert/provider_nolego.gointernal/autocert/provider_placeholders_gen.gointernal/autocert/provider_test/sni_test.gointernal/autocert/setup.gointernal/autocert/setup_test.gointernal/autocert/state.gointernal/autocert/state_test.gointernal/autocert/types/provider.gointernal/config/state.gointernal/dnsproviders/gen.pyinternal/dnsproviders/go.modinternal/dnsproviders/ovh_test.gointernal/entrypoint/inbound_mtls_test.gointernal/entrypoint/sni_passthrough.gointernal/entrypoint/sni_passthrough_integration_test.gointernal/entrypoint/sni_passthrough_test.gointernal/go-oidcinternal/route/rules/README.mdinternal/route/rules/do.gointernal/route/rules/do_test.gointernal/route/rules/handler_registry.gointernal/route/rules/http_flow_block_test.gointernal/route/stream/README.md
💤 Files with no reviewable changes (4)
- internal/autocert/setup_test.go
- cmd/main.go
- internal/autocert/config_test.go
- go.mod
| func run(args []string) error { | ||
| if len(args) == 0 { | ||
| return errors.New("missing command") | ||
| } | ||
|
|
||
| switch args[0] { | ||
| case "obtain": | ||
| return runObtain(args[1:]) | ||
| default: | ||
| return fmt.Errorf("unknown command %q", args[0]) | ||
| } | ||
| } |
There was a problem hiding this comment.
🧹 Nitpick | 🔵 Trivial
Consider adding help/usage output for unrecognized commands.
When an unknown command is provided, the error message could be more helpful by listing available commands.
Proposed enhancement
func run(args []string) error {
if len(args) == 0 {
- return errors.New("missing command")
+ return errors.New("missing command; available: obtain")
}
switch args[0] {
case "obtain":
return runObtain(args[1:])
default:
- return fmt.Errorf("unknown command %q", args[0])
+ return fmt.Errorf("unknown command %q; available: obtain", args[0])
}
}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| func run(args []string) error { | |
| if len(args) == 0 { | |
| return errors.New("missing command") | |
| } | |
| switch args[0] { | |
| case "obtain": | |
| return runObtain(args[1:]) | |
| default: | |
| return fmt.Errorf("unknown command %q", args[0]) | |
| } | |
| } | |
| func run(args []string) error { | |
| if len(args) == 0 { | |
| return errors.New("missing command; available: obtain") | |
| } | |
| switch args[0] { | |
| case "obtain": | |
| return runObtain(args[1:]) | |
| default: | |
| return fmt.Errorf("unknown command %q; available: obtain", args[0]) | |
| } | |
| } |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@cmd/autocert/main.go` around lines 30 - 41, The run function currently
returns a generic "unknown command" error for unrecognized args[0]; update run
to return a helpful usage/help message listing available commands (e.g.,
"obtain") and optionally show usage syntax when args[0] is not recognized.
Modify the default case in run (and ensure runObtain remains unchanged) to
construct an error string or help output that includes the unrecognized command
and the list of valid commands/brief usage, so callers get actionable guidance.
| provider := &Provider{ | ||
| forceRenewalCh: make(chan struct{}, 1), | ||
| extraProviders: []*Provider{extra}, | ||
| } | ||
| provider.forceRenewalDoneCh.Store(&mainDone) | ||
|
|
||
| ok := provider.ForceExpiryAll() | ||
| require.True(t, ok) | ||
| require.Empty(t, provider.forceRenewalCh) | ||
| require.Len(t, extra.forceRenewalCh, 1) | ||
| } |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Check what ForceExpiryAll and related methods access on Provider
ast-grep --pattern 'func ($P *Provider) ForceExpiryAll() $_ {
$$$
}'
rg -n 'p\.cfg' internal/autocert/provider.go -C2Repository: yusing/godoxy
Length of output: 2962
Provider created without cfg field will panic when ForceExpiryAll() is called.
The test at lines 58-61 initializes a Provider with only forceRenewalCh and extraProviders, leaving cfg as nil. When ForceExpiryAll() is invoked at line 65, it immediately dereferences p.cfg on the first conditional check (p.cfg.Provider != ProviderLocal...), causing a nil pointer panic. The same issue exists in other tests at lines 70-72 and 82-84.
Initialize cfg with valid values (e.g., cfg: &Config{Provider: ProviderLocal}) or use a Provider that properly sets this field during construction.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@internal/autocert/forceexpiry_test.go` around lines 58 - 68, The tests create
a Provider without initializing its cfg, causing a nil pointer panic when
Provider.ForceExpiryAll dereferences p.cfg; fix the tests by initializing cfg
(e.g., set cfg: &Config{Provider: ProviderLocal}) on the Provider instances used
in the test(s) that call ForceExpiryAll (also for the extra provider), so
Provider.cfg is non-nil before calling ForceExpiryAll; ensure the same fix is
applied to the other failing tests referenced.
| func (p *Provider) certState() CertState { | ||
| return CertStateValid | ||
| } |
There was a problem hiding this comment.
certState() always returns CertStateValid, bypassing renewal checks.
This implementation always returns CertStateValid, which means renew() with renewModeIfNeeded will always skip renewal (lines 17-18). If this is intentional for non-lego builds where the binary handles expiry checking, consider adding a comment explaining this design decision. Otherwise, this appears to be a bug that prevents proper certificate renewal.
#!/bin/bash
# Check how certState is used elsewhere and if there's a lego-enabled implementation
ast-grep --pattern 'func ($P *Provider) certState() CertState {
$$$
}'
# Check if there's a build-tag-based implementation
rg -n 'certState' --type go -C3🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@internal/autocert/provider_nolego.go` around lines 12 - 14, The certState
method on type Provider unconditionally returns CertStateValid which bypasses
renewal logic (affecting renew when using renewModeIfNeeded); either implement
proper certificate state detection or document the intent: if this file is for
no-lego builds and expiry is checked elsewhere, add a clear comment above func
(p *Provider) certState() explaining that decision and where expiry is
validated; otherwise change certState to inspect the Provider's stored
certificate (e.g., the parsed x509 leaf or expiry timestamp available on p) and
return CertStateExpired/CertStateNearExpiry/CertStateValid accordingly so
renew() and renewModeIfNeeded behave correctly (referencing Provider.certState,
Provider.renew, and renewModeIfNeeded to locate call sites).
| autocertCommandRunner = func(ctx context.Context, name string, args ...string) error { | ||
| called = true | ||
| require.NotNil(t, ctx) | ||
| require.Equal(t, "autocert", name) | ||
| require.True(t, slices.Contains(args, "obtain")) | ||
| require.True(t, slices.Contains(args, certPath)) |
There was a problem hiding this comment.
Avoid asserting the resolved binary path is exactly "autocert".
obtainCertUsingBinary() resolves a sibling executable first, so this test will fail in environments where the runner receives an absolute path even though the behavior is correct. Assert the basename/args instead.
🧪 Proposed fix
- require.Equal(t, "autocert", name)
+ require.Contains(t, filepath.Base(name), "autocert")📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| autocertCommandRunner = func(ctx context.Context, name string, args ...string) error { | |
| called = true | |
| require.NotNil(t, ctx) | |
| require.Equal(t, "autocert", name) | |
| require.True(t, slices.Contains(args, "obtain")) | |
| require.True(t, slices.Contains(args, certPath)) | |
| autocertCommandRunner = func(ctx context.Context, name string, args ...string) error { | |
| called = true | |
| require.NotNil(t, ctx) | |
| require.Contains(t, filepath.Base(name), "autocert") | |
| require.True(t, slices.Contains(args, "obtain")) | |
| require.True(t, slices.Contains(args, certPath)) |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@internal/autocert/state_test.go` around lines 35 - 40, The test incorrectly
asserts the resolved binary path equals "autocert" which breaks when
obtainCertUsingBinary() returns an absolute sibling path; update the assertion
inside autocertCommandRunner to check the basename of the resolved name (e.g.,
filepath.Base(name) == "autocert") or use strings.HasSuffix to assert the binary
name, and keep the existing assertions that args contain "obtain" and certPath
so the test validates the command and arguments rather than the exact path.
| func obtainCertUsingBinary(ctx context.Context, certPath string) error { | ||
| err := autocertCommandRunner( | ||
| ctx, | ||
| resolveAutocertBinary(), | ||
| "obtain", | ||
| "--config", | ||
| common.ConfigPath, | ||
| "--cert-path", | ||
| certPath, | ||
| ) | ||
| if err == nil || !errors.Is(err, exec.ErrNotFound) { | ||
| return err | ||
| } | ||
| return errors.New("autocert binary not found; place autocert next to main binary or on PATH") | ||
| } |
There was a problem hiding this comment.
🧹 Nitpick | 🔵 Trivial
Consider wrapping the original error for better debugging.
When exec.ErrNotFound is detected, the original error is discarded. Wrapping it could provide more context for debugging (e.g., which path was tried).
Proposed improvement
func obtainCertUsingBinary(ctx context.Context, certPath string) error {
+ binaryPath := resolveAutocertBinary()
err := autocertCommandRunner(
ctx,
- resolveAutocertBinary(),
+ binaryPath,
"obtain",
"--config",
common.ConfigPath,
"--cert-path",
certPath,
)
if err == nil || !errors.Is(err, exec.ErrNotFound) {
return err
}
- return errors.New("autocert binary not found; place autocert next to main binary or on PATH")
+ return fmt.Errorf("autocert binary not found at %q; place autocert next to main binary or on PATH", binaryPath)
}Note: This requires adding "fmt" to imports.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| func obtainCertUsingBinary(ctx context.Context, certPath string) error { | |
| err := autocertCommandRunner( | |
| ctx, | |
| resolveAutocertBinary(), | |
| "obtain", | |
| "--config", | |
| common.ConfigPath, | |
| "--cert-path", | |
| certPath, | |
| ) | |
| if err == nil || !errors.Is(err, exec.ErrNotFound) { | |
| return err | |
| } | |
| return errors.New("autocert binary not found; place autocert next to main binary or on PATH") | |
| } | |
| func obtainCertUsingBinary(ctx context.Context, certPath string) error { | |
| binaryPath := resolveAutocertBinary() | |
| err := autocertCommandRunner( | |
| ctx, | |
| binaryPath, | |
| "obtain", | |
| "--config", | |
| common.ConfigPath, | |
| "--cert-path", | |
| certPath, | |
| ) | |
| if err == nil || !errors.Is(err, exec.ErrNotFound) { | |
| return err | |
| } | |
| return fmt.Errorf("autocert binary not found at %q; place autocert next to main binary or on PATH", binaryPath) | |
| } |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@internal/autocert/state.go` around lines 60 - 74, In obtainCertUsingBinary,
when autocertCommandRunner returns an error that matches exec.ErrNotFound you
should preserve/wrap the original error instead of discarding it: update the
return to use fmt.Errorf to include context (e.g., which binary/path was
attempted via resolveAutocertBinary()) and wrap the original err with %w; add
"fmt" to imports. Keep the existing behavior of returning non-nil
non-ErrNotFound errors directly and only wrap when ErrNotFound is detected.
| f.write(f'\t\t"{name}",\n') | ||
| f.write("\t)\n}\n") | ||
|
|
||
| os.execvp("gofmt", ["gofmt", "-w", "providers.go", "../autocert/provider_placeholders_gen.go"]) |
There was a problem hiding this comment.
🧩 Analysis chain
🌐 Web query:
Does Python's os.execvp search PATH for a bare executable name, and is resolving the absolute path first the recommended way to avoid PATH-hijack/reproducibility issues for tool invocations?
💡 Result:
Yes, Python's os.execvp searches the PATH environment variable for a bare executable name (no slashes in the filename). If the filename contains a slash, it uses that path directly without searching PATH. This behavior matches the POSIX execvp specification, which Python follows on Unix-like systems. Using a bare name exposes invocations to PATH hijacking attacks, where a malicious executable with the same name is placed earlier in PATH (e.g., in a user-writable directory prepended to PATH), and reproducibility issues, as the resolved binary can vary by environment PATH configuration. The recommended practice for secure, reproducible tool invocations is to resolve and use the absolute path first (e.g., via shutil.which or os.path.abspath(shutil.which('tool')) then os.execv(abs_path, args)), bypassing PATH search entirely. This ensures the exact intended binary is executed regardless of PATH modifications.
Citations:
- 1: https://docs.python.org/3/library/os.html
- 2: https://tedboy.github.io/python_stdlib/generated/generated/os.execvp.html
- 3: http://docs.python.org/release/2.4.4/lib/os-process.html
- 4: https://pubs.opengroup.org/onlinepubs/9699919799/functions/execvp.html
- 5: https://stackoverflow.com/questions/19205316/execvp-filepath-clarification
- 6: https://stackoverflow.com/questions/49225695/execv-vs-execvp-why-just-one-of-them-require-the-exact-files-path
- 7: https://blog.golang.org/path-security
- 8: https://www.unix-ninja.com/p/the_dangers_of_prepending_path
- 9: https://blog.deephacking.tech/en/posts/how-to-exploit-path-hijacking-and-library-hijacking/
🏁 Script executed:
# First, check if the file exists and read the relevant section
fd "gen.py" | head -10Repository: yusing/godoxy
Length of output: 86
🏁 Script executed:
# Read the file around line 97 to see the exact code
cat -n internal/dnsproviders/gen.py | sed -n '85,105p'Repository: yusing/godoxy
Length of output: 520
🏁 Script executed:
# Check the top of the file to see current imports
cat -n internal/dnsproviders/gen.py | head -20Repository: yusing/godoxy
Length of output: 789
Resolve gofmt path explicitly to avoid PATH hijacking and improve reproducibility.
Line 97 executes gofmt by bare name, causing Python to search PATH for the binary. This creates a reproducibility footgun where the resolved binary varies by environment and enables PATH hijacking attacks. Use shutil.which() to resolve the path upfront and fail fast if the tool is missing.
Proposed fix
import os
import re
+import shutil
+import subprocess
import requests
...
-os.execvp("gofmt", ["gofmt", "-w", "providers.go", "../autocert/provider_placeholders_gen.go"])
+gofmt = shutil.which("gofmt")
+if gofmt is None:
+ raise RuntimeError("gofmt not found in PATH")
+
+subprocess.run(
+ [gofmt, "-w", "providers.go", "../autocert/provider_placeholders_gen.go"],
+ check=True,
+)🧰 Tools
🪛 Ruff (0.15.12)
[error] 97-97: Starting a process without a shell
(S606)
[error] 97-97: Starting a process with a partial executable path
(S607)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@internal/dnsproviders/gen.py` at line 97, The os.execvp("gofmt", [...]) call
should resolve the full path to the gofmt binary before executing to avoid PATH
hijacking and improve reproducibility: use shutil.which("gofmt") at the top of
the routine (or before the execvp call), check the result and raise/exit with a
clear error if None, then call os.execvp(resolved_path, [resolved_path, "-w",
"providers.go", "../autocert/provider_placeholders_gen.go"]); ensure shutil is
imported if not already and preserve the same argument list and behavior of the
existing os.execvp invocation.
| go func() { | ||
| for { | ||
| conn, err := ln.Accept() | ||
| if err != nil { | ||
| return | ||
| } | ||
| go func() { | ||
| defer conn.Close() | ||
| buf := make([]byte, len(response)) | ||
| n, err := io.ReadFull(conn, buf) | ||
| if err != nil { | ||
| t.Errorf("read upstream request: read %d/%d bytes: %v", n, len(buf), err) | ||
| return | ||
| } | ||
| if _, err := conn.Write([]byte(response)); err != nil { | ||
| t.Errorf("write upstream response: %v", err) | ||
| } | ||
| }() | ||
| } | ||
| }() |
There was a problem hiding this comment.
Calling t.Errorf from a goroutine can cause a panic after the test exits.
The goroutine at lines 188-199 calls t.Errorf which is unsafe when the goroutine outlives the test. If the listener accepts a connection after the test function returns (e.g., during cleanup race), this will panic. Consider using a channel to communicate errors back to the main test goroutine, or use t.Cleanup with synchronization to ensure all goroutines complete.
Proposed fix using error channel
func startTLSEchoServer(t *testing.T, response string) *tlsEchoServer {
t.Helper()
cert := newLocalhostCert(t)
ln, err := tls.Listen("tcp", "127.0.0.1:0", &tls.Config{
Certificates: []tls.Certificate{*cert},
MinVersion: tls.VersionTLS12,
})
require.NoError(t, err)
t.Cleanup(func() { _ = ln.Close() })
go func() {
for {
conn, err := ln.Accept()
if err != nil {
return
}
go func() {
defer conn.Close()
buf := make([]byte, len(response))
n, err := io.ReadFull(conn, buf)
if err != nil {
- t.Errorf("read upstream request: read %d/%d bytes: %v", n, len(buf), err)
+ // Log instead of t.Errorf to avoid panic if goroutine outlives test
return
}
if _, err := conn.Write([]byte(response)); err != nil {
- t.Errorf("write upstream response: %v", err)
+ // Log instead of t.Errorf to avoid panic if goroutine outlives test
+ return
}
}()
}
}()
return &tlsEchoServer{ln: ln}
}Alternatively, since these are test helpers where errors indicate test bugs rather than failures, simply removing the t.Errorf calls (already returning early) is acceptable since the test will fail on the client side if the response isn't received.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| go func() { | |
| for { | |
| conn, err := ln.Accept() | |
| if err != nil { | |
| return | |
| } | |
| go func() { | |
| defer conn.Close() | |
| buf := make([]byte, len(response)) | |
| n, err := io.ReadFull(conn, buf) | |
| if err != nil { | |
| t.Errorf("read upstream request: read %d/%d bytes: %v", n, len(buf), err) | |
| return | |
| } | |
| if _, err := conn.Write([]byte(response)); err != nil { | |
| t.Errorf("write upstream response: %v", err) | |
| } | |
| }() | |
| } | |
| }() | |
| go func() { | |
| for { | |
| conn, err := ln.Accept() | |
| if err != nil { | |
| return | |
| } | |
| go func() { | |
| defer conn.Close() | |
| buf := make([]byte, len(response)) | |
| n, err := io.ReadFull(conn, buf) | |
| if err != nil { | |
| // Log instead of t.Errorf to avoid panic if goroutine outlives test | |
| return | |
| } | |
| if _, err := conn.Write([]byte(response)); err != nil { | |
| // Log instead of t.Errorf to avoid panic if goroutine outlives test | |
| return | |
| } | |
| }() | |
| } | |
| }() |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@internal/entrypoint/sni_passthrough_integration_test.go` around lines 182 -
201, The goroutine that accepts connections and reads/writes using ln, conn and
io.ReadFull currently calls t.Errorf from inside the goroutine (unsafe because
it may outlive the test); change this so errors are reported back to the main
test goroutine instead of calling t.Errorf directly — e.g., create an error
channel (chan error) that the inner goroutine sends any read/write error to,
close the channel when the listener loop exits, and have the main test goroutine
receive from that channel and call t.Errorf once as needed; alternatively, if
you prefer the simpler approach noted in the comment, remove the t.Errorf calls
from the goroutine and rely on the client-side assertions to fail the test
instead, ensuring no test API is invoked from background goroutines.
| if alias, _, ok := strings.Cut(serverName, "."); ok { | ||
| if target, found := m.routes.Load(alias); found { | ||
| return target, true | ||
| } | ||
| } | ||
| return m.routes.Load(serverName) |
There was a problem hiding this comment.
Prefer an exact SNI match before falling back to the first-label alias.
Right now api.example.test will route to alias api even if there is an exact api.example.test passthrough target. That makes full-name aliases unreachable whenever a matching first-label alias also exists.
🛠️ Proposed fix
if serverName == "" {
return nil, false
}
+ if target, found := m.routes.Load(serverName); found {
+ return target, true
+ }
if alias, _, ok := strings.Cut(serverName, "."); ok {
- if target, found := m.routes.Load(alias); found {
- return target, true
- }
+ return m.routes.Load(alias)
}
- return m.routes.Load(serverName)
+ return nil, false
}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| if alias, _, ok := strings.Cut(serverName, "."); ok { | |
| if target, found := m.routes.Load(alias); found { | |
| return target, true | |
| } | |
| } | |
| return m.routes.Load(serverName) | |
| if target, found := m.routes.Load(serverName); found { | |
| return target, true | |
| } | |
| if alias, _, ok := strings.Cut(serverName, "."); ok { | |
| return m.routes.Load(alias) | |
| } | |
| return nil, false |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@internal/entrypoint/sni_passthrough.go` around lines 350 - 355, Change the
lookup order to try an exact SNI match before falling back to the first-label
alias: call m.routes.Load(serverName) first and if it returns found/true return
that target; only if the exact lookup fails, use strings.Cut(serverName, ".") to
extract alias and then attempt m.routes.Load(alias) (returning that if found).
Update the logic around serverName, strings.Cut and m.routes.Load to implement
this exact-then-alias lookup.
| build: func(args any) HandlerFunc { | ||
| name := args.(string) | ||
| return func(w *httputils.ResponseModifier, r *http.Request, upstream http.HandlerFunc) error { | ||
| handler, ok := GetHandler(name) | ||
| if !ok { | ||
| return fmt.Errorf("handler %q not found", name) | ||
| } | ||
| handler.ServeHTTP(w, r) | ||
| return errTerminateRule | ||
| } |
There was a problem hiding this comment.
🧹 Nitpick | 🔵 Trivial
Consider validating handler existence at build time rather than runtime.
The handler lookup happens on every request via GetHandler(name). While this provides flexibility (handlers can be registered after rules are parsed), it means configuration errors (e.g., typo in handler name) won't be detected until a request actually matches the rule.
If handlers are expected to be registered before rules are loaded (which seems to be the case with init() in handler.go), you could move the lookup to the build function and capture the handler reference:
♻️ Suggested refactor for early validation
build: func(args any) HandlerFunc {
name := args.(string)
+ handler, ok := GetHandler(name)
+ if !ok {
+ // Return a handler that always errors
+ return func(w *httputils.ResponseModifier, r *http.Request, upstream http.HandlerFunc) error {
+ return fmt.Errorf("handler %q not found", name)
+ }
+ }
return func(w *httputils.ResponseModifier, r *http.Request, upstream http.HandlerFunc) error {
- handler, ok := GetHandler(name)
- if !ok {
- return fmt.Errorf("handler %q not found", name)
- }
handler.ServeHTTP(w, r)
return errTerminateRule
}
},Alternatively, add a validation step in validate that checks handler existence if early detection is preferred.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@internal/route/rules/do.go` around lines 187 - 196, Perform the handler
lookup once during rule construction instead of on every request: inside the
build closure (the build: func(args any) HandlerFunc { ... }) cast name :=
args.(string), call handler, ok := GetHandler(name) there and capture the
returned handler variable; if ok is false return a HandlerFunc that immediately
returns fmt.Errorf("handler %q not found", name) (or alternatively add the same
GetHandler(name) check in the rule's validate function to fail early), and in
the normal case have the returned HandlerFunc call handler.ServeHTTP(w, r) and
return errTerminateRule so runtime per-request lookups are eliminated.



Peek ClientHello SNI before termination: tcp routes with sni_hosts proxy the byte stream to upstream; everything else is queued into normal HTTPS handling. When the SNI mux accepts, skip ACL and PROXY protocol on the inner HTTPS server.
Extend Route with sni_hosts, add ConnProxy to tcp stream forwarding, update config.example.yml and stream README, and cover behavior with unit and integration tests.
Summary by CodeRabbit
New Features
certificate_key_typesetting.handlecommand in rule system for dispatching to registered in-process handlers.Improvements