This pull request introduces support for provisioning Azure Linux nodes in the GPU Provisioner and updates related functionality across documentation, implementation, and testing. Key changes include adding configuration options for Azure Linux in KAITO workloads, modifying the logic for determining the OS SKU, and expanding test coverage to ensure compatibility.
Documentation Updates:
Added a new document, docs/azure-linux-support.md, detailing the configuration, supported image families, migration steps, and benefits of using Azure Linux with the GPU Provisioner. This includes examples for both NodeClaim and Workspace specifications.
Implementation Enhancements:
Updated func newAgentPoolObject in pkg/providers/instance/instance.go to determine the OS SKU based on NodeClaim labels or annotations, defaulting to Ubuntu if unspecified or invalid. Labels take precedence over annotations.
I considered doing this with the AKSNodeClass but opted for labels and annotations instead. Happy to switch to reading the ImageFamily from the AKSNodeClass reference if that is preferred.
Testing Improvements:
Added a new test, TestNewAgentPoolObjectWithImageFamily, in pkg/providers/instance/instance_test.go to validate the OS SKU determination logic for various scenarios, including label precedence, unknown image families, and default behavior.
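The scenarios listed for this test (label precedence, annotation fallback, unknown image families, default behavior) fit naturally into a table. The following is only a minimal sketch, not the PR's actual test: `resolveOSSKU` and `scenario` are hypothetical names, and plain strings stand in for the `armcontainerservice.OSSKU` values so the example needs nothing beyond the standard library.

```go
package main

import (
	"fmt"
	"strings"
)

const imageFamilyKey = "kaito.sh/node-image-family"

// resolveOSSKU is a hypothetical, simplified stand-in for the PR's logic:
// the label is checked first, then the annotation, and anything missing
// or unrecognized resolves to Ubuntu.
func resolveOSSKU(labels, annotations map[string]string) string {
	family, ok := labels[imageFamilyKey]
	if !ok {
		// Indexing a nil map is safe and yields the empty string.
		family = annotations[imageFamilyKey]
	}
	if strings.ToLower(family) == "azurelinux" {
		return "AzureLinux"
	}
	return "Ubuntu"
}

// scenario is one row of the table (hypothetical type).
type scenario struct {
	name        string
	labels      map[string]string
	annotations map[string]string
	want        string
}

func scenarios() []scenario {
	key := imageFamilyKey
	return []scenario{
		{"default when unset", nil, nil, "Ubuntu"},
		{"label selects Azure Linux", map[string]string{key: "AzureLinux"}, nil, "AzureLinux"},
		{"annotation is the fallback", nil, map[string]string{key: "azurelinux"}, "AzureLinux"},
		{"label beats annotation", map[string]string{key: "Ubuntu"}, map[string]string{key: "AzureLinux"}, "Ubuntu"},
		{"unknown family defaults", map[string]string{key: "Windows"}, nil, "Ubuntu"},
	}
}

func main() {
	for _, s := range scenarios() {
		got := resolveOSSKU(s.labels, s.annotations)
		fmt.Printf("%-28s got=%s ok=%t\n", s.name, got, got == s.want)
	}
}
```

Keeping each case as a row means a new image family needs one new line rather than a new test function.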
The determineOSSKU function does not handle the case where nodeClaim is nil. This could lead to a panic if nodeClaim is unexpectedly nil.
func determineOSSKU(nodeClaim *karpenterv1.NodeClaim) *armcontainerservice.OSSKU {
	// Helper function to convert image family to OSSKU
	convertImageFamilyToOSSKU := func(imageFamily, source string) *armcontainerservice.OSSKU {
		switch strings.ToLower(imageFamily) {
		case "azurelinux":
			return to.Ptr(armcontainerservice.OSSKUAzureLinux)
		case "ubuntu", "ubuntu2204":
			return to.Ptr(armcontainerservice.OSSKUUbuntu)
		default:
			klog.Warningf("Unknown imageFamily %s in NodeClaim %s, defaulting to Ubuntu", imageFamily, source)
			return to.Ptr(armcontainerservice.OSSKUUbuntu)
		}
	}

	// First check for a direct label on the NodeClaim
	if imageFamily, ok := nodeClaim.Labels["kaito.sh/node-image-family"]; ok {
		return convertImageFamilyToOSSKU(imageFamily, "label")
	}

	// Check annotations as fallback
	if imageFamily, ok := nodeClaim.Annotations["kaito.sh/node-image-family"]; ok {
		return convertImageFamilyToOSSKU(imageFamily, "annotation")
	}

	// Default to Ubuntu if no image family is specified
	return to.Ptr(armcontainerservice.OSSKUUbuntu)
}
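One way to address the nil concern raised for this function is an early guard that falls back to the Ubuntu default. The sketch below is an illustration under stated assumptions, not the change adopted in the PR: `OSSKU` and `NodeClaim` are hypothetical local stand-ins for `armcontainerservice.OSSKU` and `karpenterv1.NodeClaim`, the function returns a value rather than a pointer, and the warning log is omitted to keep it standard-library only.

```go
package main

import (
	"fmt"
	"strings"
)

// OSSKU and NodeClaim are hypothetical stand-ins for
// armcontainerservice.OSSKU and karpenterv1.NodeClaim.
type OSSKU string

const (
	OSSKUUbuntu     OSSKU = "Ubuntu"
	OSSKUAzureLinux OSSKU = "AzureLinux"
)

type NodeClaim struct {
	Labels      map[string]string
	Annotations map[string]string
}

const imageFamilyKey = "kaito.sh/node-image-family"

// determineOSSKU returns the OS SKU for a NodeClaim. A nil NodeClaim
// resolves to Ubuntu instead of causing a nil pointer dereference.
func determineOSSKU(nodeClaim *NodeClaim) OSSKU {
	if nodeClaim == nil {
		return OSSKUUbuntu
	}
	// Labels take precedence over annotations. Reading from a nil map
	// is safe in Go, so the maps themselves need no extra guard.
	if family, ok := nodeClaim.Labels[imageFamilyKey]; ok {
		return convertImageFamily(family)
	}
	if family, ok := nodeClaim.Annotations[imageFamilyKey]; ok {
		return convertImageFamily(family)
	}
	return OSSKUUbuntu
}

// convertImageFamily maps an image family name to an OS SKU,
// case-insensitively.
func convertImageFamily(family string) OSSKU {
	switch strings.ToLower(family) {
	case "azurelinux":
		return OSSKUAzureLinux
	default:
		// "ubuntu", "ubuntu2204", and unknown values all resolve to Ubuntu.
		return OSSKUUbuntu
	}
}

func main() {
	fmt.Println(determineOSSKU(nil))
	fmt.Println(determineOSSKU(&NodeClaim{
		Labels: map[string]string{imageFamilyKey: "AzureLinux"},
	}))
}
```

Whether a nil NodeClaim should silently default or be reported as an error is a design choice for the maintainers; the guard only shows the smallest change that avoids the panic.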
There are multiple tests that check the deletion of nodes (It("should terminate node when delete triggered"), It("should terminate node when delete triggered (Azure Linux)"), It("should terminate node when delete triggered (Azure Linux - annotation)")). These tests seem redundant and could be consolidated to reduce duplication.
## Migration Guide
+### Update Existing Deployments
+
+1. **Edit your NodeClaim or Workspace**:
+   ```yaml
+   # Add this label to your NodeClaim or Workspace labelSelector
+   kaito.sh/node-image-family: "AzureLinux"
+   ```
+
+2. **Roll out the changes**:
+   ```bash
+   kubectl apply -f <your-nodeclaim-or-workspace-file>.yaml
+   ```
+
+3. **Verify the change**:
+   ```bash
+   # Check that new nodes use Azure Linux
+   kubectl get nodes -o custom-columns=NAME:.metadata.name,OS-IMAGE:.status.nodeInfo.osImage
+   ```
+
+4. **Test your workloads**:
+   - Ensure your containerized workloads work correctly on Azure Linux
+   - Most workloads should work without changes
+
+### From Ubuntu to Azure Linux
+
Suggestion importance [1-10]: 5 (Low)
Why: Including migration steps assists users in updating existing deployments to use Azure Linux.
suhuruli changed the title from "Azure Linux support in gpu provisioner" to "feat: Azure Linux support in gpu provisioner" on Jul 2, 2025