Add GCP OAuth authentication support for GKE clusters #4218
[APPROVALNOTIFIER] This PR is NOT APPROVED.
This pull-request has been approved by: drduker
The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing
Welcome @drduker!
32982e2 to fc40cb0
Pull request overview
This PR adds Google Cloud Platform OAuth 2.0 authentication support to Headlamp, providing a modern replacement for the deprecated GKE Identity Service. The implementation enables users to authenticate with their Google Cloud accounts when accessing Headlamp deployed on GKE clusters.
Key Changes:
- Implements RFC 7636-compliant PKCE OAuth 2.0 flow with Google as the identity provider
- Adds automatic GKE cluster detection based on server URL patterns
- Integrates OAuth flow into the authentication chooser UI with conditional rendering
Reviewed changes
Copilot reviewed 14 out of 15 changed files in this pull request and generated 9 comments.
| File | Description |
|---|---|
| frontend/src/lib/k8s/gke.ts | Implements GKE cluster detection and OAuth initiation utilities |
| frontend/src/lib/k8s/gke.test.ts | Comprehensive unit tests for GKE utilities |
| frontend/src/lib/k8s/cluster.ts | Adds optional server property to Cluster interface |
| frontend/src/components/cluster/GCPLoginButton.tsx | React component for Google sign-in button with conditional rendering |
| frontend/src/components/cluster/GCPLoginButton.test.tsx | Component tests for GCPLoginButton |
| frontend/src/components/authchooser/index.tsx | Integrates GCP OAuth option and disables auto-redirect to token page |
| backend/pkg/gcp/auth.go | Core OAuth 2.0 authenticator with PKCE, token refresh, and caching |
| backend/pkg/gcp/auth_test.go | Unit tests for GCP authenticator functions |
| backend/pkg/auth/gcp.go | HTTP handlers for OAuth login, callback, and token refresh flows |
| backend/pkg/config/config.go | Adds GCP OAuth configuration fields with validation |
| backend/cmd/server.go | Populates GCP OAuth configuration from config |
| backend/cmd/headlamp.go | Registers OAuth routes and clears in-cluster auth when GCP OAuth enabled |
| backend/go.mod | Adds cloud.google.com/go/compute/metadata dependency |
| backend/go.sum | Updates dependency checksums including golang.org/x/sys version bump |
| docs/GCP_OAUTH_GKE_SETUP.md | Comprehensive deployment and configuration guide |
skoeva left a comment
Hi, thanks for looking into this.
I see the PR was generated with Claude, please make sure to thoroughly go through the Copilot comments, test your changes, and ensure tests pass before marking this PR ready for review.
> Hi, thanks for looking into this. I see the PR was generated with Claude, please make sure to thoroughly go through the Copilot comments, test your changes, and ensure tests pass before marking this PR ready for review.

Yes, was just trying to get it working, which it is. Not sure how much more time I want to put into this yet.
No worries! That's good to know. Feel free to ping if you would like someone else to take this over.
Ok, no worries. Thanks for sharing anyway. Let’s leave this open for a while longer. If someone else wants to pick this up they can :) Otherwise we can close this and it’s archived for anyone searching who may be interested.
fc40cb0 to 9f382d6
Just signed the CLA thing.
799c0b4 to df28d25
Pull request overview
Copilot reviewed 28 out of 31 changed files in this pull request and generated 3 comments.
Files not reviewed (1)
- frontend/package-lock.json: Language not supported
    if cluster == "" {
    	http.Error(w, "cluster parameter required", http.StatusBadRequest)
    	return
    }
Copilot AI, Dec 6, 2025
The cluster parameter should be validated against validClusterNamePattern before storing it in a cookie. Currently, validation only happens in the callback handler, which means an attacker could inject malicious cluster names that get stored in cookies during the login flow. Add validation after line 50:
    if !validClusterNamePattern.MatchString(cluster) {
    	http.Error(w, "invalid cluster name format", http.StatusBadRequest)
    	return
    }
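For illustration, a check like the one suggested above might be backed by an allow-list pattern along these lines. The definition of `validClusterNamePattern` is not shown in this excerpt, so the regular expression and the helper name `isValidClusterName` are assumptions, not the PR's actual values.

```go
package main

import (
	"fmt"
	"regexp"
)

// clusterNamePattern is a hypothetical allow-list: the name must start with
// an alphanumeric character and may continue with alphanumerics, dots,
// underscores, or hyphens, up to 253 characters in total.
var clusterNamePattern = regexp.MustCompile(`^[a-zA-Z0-9][a-zA-Z0-9._-]{0,252}$`)

// isValidClusterName reports whether name is safe to store in a cookie and
// to interpolate into a redirect path.
func isValidClusterName(name string) bool {
	return clusterNamePattern.MatchString(name)
}

func main() {
	for _, name := range []string{"my-gke-cluster", "", "bad;name", "../etc"} {
		fmt.Printf("%q valid=%v\n", name, isValidClusterName(name))
	}
}
```

Running the same check in both the login and the callback handler means the value never reaches a cookie or a redirect URL unvalidated.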
docs/GCP_OAUTH_GKE_SETUP.md
Outdated
    spec:
      containers:
        - name: headlamp
          image: lucaspick/headlamp-gcp-oauth:v6 # Use your custom image
Copilot AI, Dec 6, 2025
The documentation references a personal Docker image lucaspick/headlamp-gcp-oauth:v6. This should be updated to reference the official Headlamp image once this feature is merged, or provide instructions for users to build their own image. Consider using a placeholder like headlamp/headlamp:latest or add a note that users need to build from this branch.
    // HandleGCPAuthLogin initiates the GCP OAuth login flow for GKE clusters.
    func HandleGCPAuthLogin(gcpAuth *gcp.GCPAuthenticator, baseURL string) http.HandlerFunc {
    	return func(w http.ResponseWriter, r *http.Request) {
    		cluster := r.URL.Query().Get("cluster")
    		if cluster == "" {
    			http.Error(w, "cluster parameter required", http.StatusBadRequest)
    			return
    		}

    		// Generate state token for CSRF protection
    		state, err := gcp.GenerateRandomState()
    		if err != nil {
    			logger.Log(logger.LevelError, nil, err, "failed to generate state")
    			http.Error(w, "failed to generate state", http.StatusInternalServerError)

    			return
    		}

    		// Generate PKCE code verifier and challenge for enhanced security
    		codeVerifier, err := gcp.GenerateCodeVerifier()
    		if err != nil {
    			logger.Log(logger.LevelError, nil, err, "failed to generate code verifier")
    			http.Error(w, "failed to generate code verifier", http.StatusInternalServerError)

    			return
    		}

    		codeChallenge := gcp.GenerateCodeChallenge(codeVerifier)

    		secure := IsSecureContext(r)

    		// Store state, cluster, and PKCE verifier in cookies for validation in callback
    		setOAuthCookie(w, gcpOAuthStateCookie, state, secure)
    		setOAuthCookie(w, gcpOAuthClusterCookie, cluster, secure)
    		setOAuthCookie(w, gcpOAuthVerifierCookie, codeVerifier, secure)

    		// Redirect to Google OAuth
    		authURL := gcpAuth.GetAuthCodeURL(state, codeChallenge)

    		logger.Log(logger.LevelInfo, map[string]string{
    			"cluster": cluster,
    		}, nil, "initiating GCP OAuth flow")

    		http.Redirect(w, r, authURL, http.StatusFound)
    	}
    }

    // gcpCallbackData holds validated data from the OAuth callback.
    type gcpCallbackData struct {
    	cluster      string
    	codeVerifier string
    	code         string
    }

    // validateGCPCallback validates the OAuth callback request and returns extracted data.
    func validateGCPCallback(r *http.Request) (*gcpCallbackData, error) {
    	// Validate state token (CSRF protection)
    	stateCookie, err := r.Cookie(gcpOAuthStateCookie)
    	if err != nil {
    		return nil, fmt.Errorf("state cookie not found: %w", err)
    	}

    	stateParam := r.URL.Query().Get("state")
    	if stateCookie.Value != stateParam {
    		return nil, fmt.Errorf("state mismatch: cookie=%s, param=%s", stateCookie.Value, stateParam)
    	}

    	// Get cluster from cookie
    	clusterCookie, err := r.Cookie(gcpOAuthClusterCookie)
    	if err != nil {
    		return nil, fmt.Errorf("cluster cookie not found: %w", err)
    	}

    	cluster := clusterCookie.Value
    	if !validClusterNamePattern.MatchString(cluster) {
    		return nil, fmt.Errorf("invalid cluster name format: %s", cluster)
    	}

    	// Check for OAuth errors
    	if errParam := r.URL.Query().Get("error"); errParam != "" {
    		errDesc := r.URL.Query().Get("error_description")
    		return nil, fmt.Errorf("OAuth error: %s - %s", errParam, errDesc)
    	}

    	code := r.URL.Query().Get("code")
    	if code == "" {
    		return nil, fmt.Errorf("no code in request")
    	}

    	// Get PKCE code verifier (optional)
    	codeVerifier := ""
    	if verifierCookie, err := r.Cookie(gcpOAuthVerifierCookie); err == nil {
    		codeVerifier = verifierCookie.Value
    	}

    	return &gcpCallbackData{
    		cluster:      cluster,
    		codeVerifier: codeVerifier,
    		code:         code,
    	}, nil
    }

    // HandleGCPAuthCallback handles the OAuth callback from Google.
    func HandleGCPAuthCallback(gcpAuth *gcp.GCPAuthenticator, baseURL string) http.HandlerFunc {
    	return func(w http.ResponseWriter, r *http.Request) {
    		ctx := r.Context()

    		data, err := validateGCPCallback(r)
    		if err != nil {
    			logger.Log(logger.LevelError, nil, err, "OAuth callback validation failed")
    			http.Error(w, err.Error(), http.StatusBadRequest)

    			return
    		}

    		token, err := gcpAuth.Exchange(ctx, data.code, data.codeVerifier)
    		if err != nil {
    			logger.Log(logger.LevelError, map[string]string{"cluster": data.cluster}, err, "failed to exchange code")
    			http.Error(w, "failed to exchange token", http.StatusInternalServerError)

    			return
    		}

    		gkeToken, err := gcpAuth.GetGKEAccessToken(ctx, token)
    		if err != nil {
    			logger.Log(logger.LevelError, map[string]string{"cluster": data.cluster}, err, "failed to get GKE token")
    			http.Error(w, "failed to get GKE token", http.StatusInternalServerError)

    			return
    		}

    		// Cache the refresh token (non-fatal if it fails)
    		if token.RefreshToken != "" {
    			if cacheErr := gcpAuth.CacheRefreshToken(ctx, data.cluster, gkeToken, token.RefreshToken); cacheErr != nil {
    				logger.Log(logger.LevelError, map[string]string{"cluster": data.cluster}, cacheErr, "failed to cache refresh token")
    			}
    		}

    		SetTokenCookie(w, r, data.cluster, gkeToken, baseURL)

    		secure := IsSecureContext(r)
    		clearOAuthCookie(w, gcpOAuthStateCookie, secure)
    		clearOAuthCookie(w, gcpOAuthClusterCookie, secure)
    		clearOAuthCookie(w, gcpOAuthVerifierCookie, secure)

    		logger.Log(logger.LevelInfo, map[string]string{"cluster": data.cluster}, nil, "GCP OAuth flow completed")

    		redirectURL := fmt.Sprintf("/#/c/%s", data.cluster)
    		if baseURL != "" {
    			redirectURL = "/" + baseURL + redirectURL
    		}

    		http.Redirect(w, r, redirectURL, http.StatusFound)
    	}
    }

    // HandleGCPTokenRefresh handles token refresh requests for GKE clusters.
    func HandleGCPTokenRefresh(gcpAuth *gcp.GCPAuthenticator, baseURL string) http.HandlerFunc {
    	return func(w http.ResponseWriter, r *http.Request) {
    		ctx := r.Context()

    		cluster, token := ParseClusterAndToken(r)
    		if cluster == "" || token == "" {
    			http.Error(w, "cluster and token required", http.StatusBadRequest)
    			return
    		}

    		// Get cached refresh token
    		refreshToken, err := gcpAuth.GetCachedRefreshToken(ctx, cluster, token)
    		if err != nil {
    			logger.Log(logger.LevelError, map[string]string{
    				"cluster": cluster,
    			}, err, "failed to get cached refresh token")
    			http.Error(w, "no refresh token available", http.StatusUnauthorized)

    			return
    		}

    		// Refresh the token
    		newToken, err := gcpAuth.RefreshToken(ctx, refreshToken)
    		if err != nil {
    			logger.Log(logger.LevelError, map[string]string{
    				"cluster": cluster,
    			}, err, "failed to refresh token")
    			http.Error(w, "failed to refresh token", http.StatusInternalServerError)

    			return
    		}

    		// Get new GKE access token
    		newGKEToken, err := gcpAuth.GetGKEAccessToken(ctx, newToken)
    		if err != nil {
    			logger.Log(logger.LevelError, map[string]string{
    				"cluster": cluster,
    			}, err, "failed to get new GKE access token")
    			http.Error(w, "failed to get new GKE token", http.StatusInternalServerError)

    			return
    		}

    		// Cache the new refresh token if we got one (non-fatal if it fails)
    		if newToken.RefreshToken != "" {
    			if cacheErr := gcpAuth.CacheRefreshToken(ctx, cluster, newGKEToken, newToken.RefreshToken); cacheErr != nil {
    				logger.Log(logger.LevelError, map[string]string{"cluster": cluster}, cacheErr, "failed to cache new refresh token")
    			}
    		}

    		// Set new token in cookie
    		SetTokenCookie(w, r, cluster, newGKEToken, baseURL)

    		logger.Log(logger.LevelInfo, map[string]string{
    			"cluster": cluster,
    		}, nil, "token refreshed successfully")

    		w.WriteHeader(http.StatusOK)
    		_, _ = w.Write([]byte("token refreshed"))
    	}
    }

    // setOAuthCookie sets a temporary cookie for OAuth flow state.
    func setOAuthCookie(w http.ResponseWriter, name, value string, secure bool) {
    	http.SetCookie(w, &http.Cookie{
    		Name:     name,
    		Value:    value,
    		Path:     "/",
    		MaxAge:   int(oauthFlowTimeout.Seconds()),
    		HttpOnly: true,
    		Secure:   secure,
    		SameSite: http.SameSiteLaxMode,
    	})
    }

    // clearOAuthCookie clears a cookie by setting it to expire immediately.
    func clearOAuthCookie(w http.ResponseWriter, name string, secure bool) {
    	http.SetCookie(w, &http.Cookie{
    		Name:     name,
    		Value:    "",
    		Path:     "/",
    		MaxAge:   -1,
    		HttpOnly: true,
    		Secure:   secure,
    		SameSite: http.SameSiteLaxMode,
    	})
    }
Copilot AI, Dec 6, 2025
The GCP OAuth HTTP handlers (HandleGCPAuthLogin, HandleGCPAuthCallback, HandleGCPTokenRefresh) lack unit test coverage. While the underlying GCPAuthenticator has tests in backend/pkg/gcp/auth_test.go, the HTTP handler functions should also have tests to verify request validation, error handling, cookie management, and redirect logic. Consider adding tests in a new file backend/pkg/auth/gcp_test.go.
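As a rough sketch of what such handler tests could exercise, the example below drives a simplified stand-in for the login handler through `net/http/httptest`. Everything named here (`loginHandler`, `probe`, the cookie name, the pattern, and the `idp.example` URL) is hypothetical; the real handlers depend on `gcp.GCPAuthenticator`, which would need a fake or a test server. In an actual `gcp_test.go` the same recorder pattern would sit inside `func TestXxx(t *testing.T)` with `t.Errorf` in place of printing.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"regexp"
)

// Hypothetical stand-ins; the PR's real pattern and cookie name are not shown here.
var clusterNamePattern = regexp.MustCompile(`^[a-zA-Z0-9][a-zA-Z0-9._-]*$`)

const clusterCookieName = "example_oauth_cluster"

// loginHandler is a simplified stand-in for HandleGCPAuthLogin: it validates
// the cluster parameter, stores it in a cookie, and redirects to authURL.
func loginHandler(authURL string) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		cluster := r.URL.Query().Get("cluster")
		if cluster == "" {
			http.Error(w, "cluster parameter required", http.StatusBadRequest)
			return
		}

		if !clusterNamePattern.MatchString(cluster) {
			http.Error(w, "invalid cluster name format", http.StatusBadRequest)
			return
		}

		http.SetCookie(w, &http.Cookie{Name: clusterCookieName, Value: cluster, Path: "/", HttpOnly: true})
		http.Redirect(w, r, authURL, http.StatusFound)
	}
}

// probe sends one GET request through the handler and reports the status
// code, the Location header, and the number of cookies that were set.
func probe(target string) (int, string, int) {
	rec := httptest.NewRecorder()
	req := httptest.NewRequest(http.MethodGet, target, nil)
	loginHandler("https://idp.example/authorize")(rec, req)

	res := rec.Result()
	defer res.Body.Close()

	return res.StatusCode, res.Header.Get("Location"), len(res.Cookies())
}

func main() {
	for _, target := range []string{
		"/login",
		"/login?cluster=bad%3Bname",
		"/login?cluster=my-gke-cluster",
	} {
		status, location, cookies := probe(target)
		fmt.Printf("%-32s status=%d location=%q cookies=%d\n", target, status, location, cookies)
	}
}
```

The three cases cover the missing parameter, a rejected name, and the redirect path; callback and refresh handlers could be probed the same way by attaching cookies to the request with `req.AddCookie`.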
df28d25 to 72ef722
This implementation adds GCP OAuth 2.0 authentication to Headlamp, replacing the deprecated Identity Service for GKE. Users can authenticate with their Google Cloud account, and the authentication tokens are used to access Kubernetes resources with proper RBAC.

Backend changes:
- New GCP authenticator package with RFC 7636-compliant PKCE support
- OAuth HTTP handlers for login, callback, and token refresh
- Configuration via environment variables
- Token caching and automatic refresh mechanisms
- Input validation to prevent injection attacks

Frontend changes:
- GCPLoginButton component for Google sign-in
- GKE cluster detection based on server URL patterns
- Integration into existing authentication chooser UI
- Comprehensive test coverage

Documentation:
- Complete setup guide for GKE deployments
- RBAC configuration examples
- Troubleshooting guide

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
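The token caching mentioned among the backend changes is not shown in the diff excerpt. One way it could work, sketched here purely as an assumption, is an in-memory map keyed by a hash of cluster name and access token, which matches the argument order of the `CacheRefreshToken` and `GetCachedRefreshToken` calls in the handlers. The type and method names below are hypothetical.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"errors"
	"fmt"
	"sync"
	"time"
)

var errNotCached = errors.New("no refresh token cached")

type cacheEntry struct {
	refreshToken string
	expiresAt    time.Time
}

// refreshTokenCache is a hypothetical in-memory cache mapping
// (cluster, access token) to a refresh token with a fixed TTL.
type refreshTokenCache struct {
	mu      sync.Mutex
	ttl     time.Duration
	entries map[string]cacheEntry
}

func newRefreshTokenCache(ttl time.Duration) *refreshTokenCache {
	return &refreshTokenCache{ttl: ttl, entries: make(map[string]cacheEntry)}
}

// cacheKey hashes the cluster and access token so that raw access tokens
// are not kept around as map keys.
func cacheKey(cluster, accessToken string) string {
	sum := sha256.Sum256([]byte(cluster + "\x00" + accessToken))
	return hex.EncodeToString(sum[:])
}

// Put stores refreshToken under the given cluster and access token.
func (c *refreshTokenCache) Put(cluster, accessToken, refreshToken string) {
	c.mu.Lock()
	defer c.mu.Unlock()

	c.entries[cacheKey(cluster, accessToken)] = cacheEntry{
		refreshToken: refreshToken,
		expiresAt:    time.Now().Add(c.ttl),
	}
}

// Get returns the cached refresh token, or errNotCached if the entry is
// missing or has expired. Expired entries are removed lazily.
func (c *refreshTokenCache) Get(cluster, accessToken string) (string, error) {
	c.mu.Lock()
	defer c.mu.Unlock()

	k := cacheKey(cluster, accessToken)

	e, ok := c.entries[k]
	if !ok || time.Now().After(e.expiresAt) {
		delete(c.entries, k)
		return "", errNotCached
	}

	return e.refreshToken, nil
}

func main() {
	cache := newRefreshTokenCache(time.Hour)
	cache.Put("my-gke-cluster", "access-token-1", "refresh-token-1")

	token, err := cache.Get("my-gke-cluster", "access-token-1")
	fmt.Println(token, err)
}
```

Keying on the current access token means a refresh request must present the token it wants to replace, and the entry has to be rewritten under the new access token after each refresh, which is what the refresh handler's second `CacheRefreshToken` call appears to do.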
72ef722 to 83732cd
- Fix golangci-lint wsl errors in gcp_test.go by adding blank lines before assignments that were cuddled with non-assignments
- Restore accidentally removed redirect logic in AuthChooser that automatically redirects to the token page when a cluster requires token authentication without OIDC configured

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
When GCP OAuth is configured, show the auth chooser dialog instead of automatically redirecting to the token page. This allows users to choose between Google Sign In and token authentication. The redirect to token page is now conditional on GCP OAuth being disabled, which preserves backward compatibility with e2e tests and non-GCP deployments. Also updated GCPLoginButton to only show when GCP OAuth is explicitly enabled via environment variable, not based on cluster type detection.
e6be69d to 7fd4a1b
Tests now mock isGCPOAuthEnabled to return true by default, and use waitFor for async state updates since the component checks OAuth status on mount.
This implementation adds Google Cloud Platform OAuth 2.0 authentication to Headlamp, providing a replacement for the deprecated Identity Service for GKE. Users can now authenticate with their Google Cloud accounts when accessing Headlamp deployed on GKE clusters.
Backend Changes
Frontend Changes
Key Features
Documentation
Testing
This implementation has been tested and verified to work with GKE clusters, including successful OAuth flow initiation and PKCE code challenge generation.