Add flex node added into / remove from Private AKS cluster#55
Add flex node added into / remove from Private AKS cluster#55weiliu2dev wants to merge 5 commits intoAzure:mainfrom
Conversation
- Add private-join command to join Private AKS cluster via Gateway - Add private-leave command with --mode=local|full cleanup options - Add private-install.sh and private-uninstall.sh scripts - Add pkg/privatecluster package with embedded scripts - Add documentation for creating and configuring Private AKS cluster
d4d925c to
69604af
Compare
| @@ -0,0 +1,796 @@ | |||
| #!/bin/bash | |||
There was a problem hiding this comment.
We don't do shell script except for the agent installation/uninstallation step, can you change everything to golang?
There was a problem hiding this comment.
agree with you. Ack: This is a POC to show the process for the node added into a private cluster - I'll create a tracking issue to convert shell to Go before GA. The only shell that will remain is the Arc agent installation (official script).
| NC='\033[0m' # No Color | ||
|
|
||
| # Configuration | ||
| GATEWAY_NAME="wg-gateway" |
There was a problem hiding this comment.
Please define a new struct for any new configuration needed in pkg/config/structs.go
There was a problem hiding this comment.
yes, will address both together. Since this is a POC, I'll convert shell to Go and define the GatewayConfig struct in a follow-up PR.
commands.go
Outdated
| // NewPrivateJoinCommand creates a new private-join command | ||
| func NewPrivateJoinCommand() *cobra.Command { | ||
| cmd := &cobra.Command{ | ||
| Use: "private-join", |
There was a problem hiding this comment.
Please do not introduce new commands. The aks-flex-node agent --config command handles both node bootstrapping and self-healing and is also the command used by the systemd service. For private clusters, continue using this same command and add a new configuration property as needed.
Similarly, for unbootstrapping, continue using the existing aks-flex-node unbootstrap command, with the appropriate configuration for private clusters.
There was a problem hiding this comment.
ok , I will remove the commands. Note that without the commands, the inline help will no longer be
available
There was a problem hiding this comment.
commands are already removed and merged into agent command.
| fi | ||
|
|
||
| # Remove Arc Agent and Azure resource | ||
| log_info "Removing Arc Agent..." |
There was a problem hiding this comment.
Have many things duplicated with scripts/uninstall.sh, just add a new uninstaller in /pkg/bootstrapper/bootstrapper.go to take of setting/cleaning up for anything related to private cluster.
There was a problem hiding this comment.
gosec found more than 20 potential problems in the proposed changes. Check the Files changed tab for more details.
ffdc990 to
bd3daff
Compare
ad84d14 to
81f6179
Compare
81f6179 to
204ad26
Compare
|
@microsoft-github-policy-service agree company="Microsoft" |
|
I suggest spending a bit more time reviewing our codebase before starting to contribute. Thanks! |
|
|
||
| // Step 1: Clean up local agent state | ||
| i.logger.Info("Cleaning up local agent state...") | ||
| disconnectCmd := exec.CommandContext(ctx, "azcmagent", "disconnect", "--force-local-only") |
There was a problem hiding this comment.
use utils.RunCommandWithOutput() so it automatically add sudo if needed. Same with every place use exec.CommandContext below.
| } | ||
|
|
||
| // InstallJQ installs jq locally | ||
| func InstallJQ(ctx context.Context, logger *Logger) error { |
There was a problem hiding this comment.
| type TargetClusterConfig struct { | ||
| ResourceID string `json:"resourceId"` // Full resource ID of the target AKS cluster | ||
| Location string `json:"location"` // Azure region of the cluster (e.g., "eastus", "westus2") | ||
| Private bool `json:"private"` // Whether this is a private AKS cluster (requires Gateway/VPN setup) |
There was a problem hiding this comment.
rename to IsPrivateCluster
|
|
||
| // CheckInstalled verifies Azure CLI is installed | ||
| func (az *AzureCLI) CheckInstalled() error { | ||
| if !CommandExists("az") { |
There was a problem hiding this comment.
The installation of az is enforced in install.sh, but CLI credential isn't the only credential we support, customer can also use SP or MSI, so we shouldn't always require az login
|
|
||
| // AKSClusterExists checks if an AKS cluster exists | ||
| func (az *AzureCLI) AKSClusterExists(ctx context.Context, resourceGroup, clusterName string) bool { | ||
| return RunCommandSilent(ctx, "az", "aks", "show", |
There was a problem hiding this comment.
Do not use any az command for resource management. Use track2 SDK client. Since we support multiple auth methods. read https://github.com/wenxuan0923/AKSFlexNode/blob/main/pkg/auth/auth.go
Please convert all the commands in this file to use SDK client. FYI something like this https://github.com/wenxuan0923/AKSFlexNode/blob/6bdb4d86237cacb426e363cc008f3211441b563b/pkg/components/arc/arc_base.go#L42
| } | ||
|
|
||
| // IsRoot checks if the current process is running as root | ||
| func IsRoot() bool { |
There was a problem hiding this comment.
No it should not run as root.
| } | ||
|
|
||
| // RemoveHostsEntries removes entries matching a pattern from /etc/hosts | ||
| func RemoveHostsEntries(pattern string) error { |
| ) | ||
|
|
||
| // ScriptRunner provides backward compatibility (Deprecated: use Installer/Uninstaller directly) | ||
| type ScriptRunner struct { |
There was a problem hiding this comment.
What is this for? we only need installer and uninsaller in bootstrapper.go
https://github.com/wenxuan0923/AKSFlexNode/blob/main/pkg/bootstrapper/bootstrapper.go
| @@ -0,0 +1,96 @@ | |||
| package privatecluster | |||
There was a problem hiding this comment.
All the config should be put inside https://github.com/wenxuan0923/AKSFlexNode/blob/main/pkg/config/structs.go
| ) | ||
|
|
||
| // Logger provides colored logging for the private cluster operations | ||
| type Logger struct { |
There was a problem hiding this comment.
Why need this? we already have logger set up at agent level?
| i.clusterInfo.PrivateFQDN = clusterInfo.PrivateFQDN | ||
| i.logger.Success("AKS cluster: %s (AAD/RBAC enabled)", i.clusterInfo.ClusterName) | ||
|
|
||
| vnetName, vnetRG, err := i.azure.GetVNetInfo(ctx, i.clusterInfo.NodeResourceGroup) |
There was a problem hiding this comment.
You are assuming VNet will always be in Node reource group which is not true for BYO VNet
| if err := InstallJQ(ctx, i.logger); err != nil { | ||
| return fmt.Errorf("failed to install jq: %w", err) | ||
| } | ||
| if !CommandExists("kubectl") || !CommandExists("kubelogin") { |
There was a problem hiding this comment.
Why install kubectl and kubelogin here?? kubectl already being installed by kube_binaries_installer.go
| u.removeNodeFromCluster(ctx, hostname) | ||
|
|
||
| // Stop any running aks-flex-node agent process | ||
| u.stopFlexNodeAgent(ctx) |
There was a problem hiding this comment.
Why stopFlexNodeAgent in privatecluster uninstaller? you should only clean up everything related to your own component.
| } | ||
|
|
||
| // removeArcAgent removes Azure Arc agent | ||
| func (u *Uninstaller) removeArcAgent(ctx context.Context, nodeName string) { |
There was a problem hiding this comment.
This is done by arc uninstaller already.
Usage