
Conversation

@ciroclover

Fixes #2407

The diff looks much larger than it actually is because the "dynamic" block changes the indentation of the whole node_pool section.

The changes are:

  • Ensure that we define initial_node_count when it is not defined in the node pool [ref].
  • Make the node-pool block dynamic, and add a conditional so the block is not generated when node_pools = [] (sketched below) [ref].
  • Fix the cluster_name variables in main.tf. I honestly believe we could just use cluster_name_computed = var.name, but that would need more testing. [ref]
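
A minimal sketch of the shape of the change, using the conditions from the diff (illustrative only; the real block carries the full default node-pool configuration):

  initial_node_count = var.remove_default_node_pool || length(var.node_pools) > 0 ? null : max(1, var.initial_node_count)

  dynamic "node_pool" {
    # Skipped entirely when node_pools = [] or the default pool is removed.
    for_each = var.remove_default_node_pool || length(var.node_pools) == 0 ? [] : [1]
    content {
      name               = "default-pool"
      initial_node_count = max(1, var.initial_node_count)
      # ... remaining default node-pool settings ...
    }
  }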

ciroclover requested review from a team, apeabody and ericyz as code owners on August 15, 2025 19:26
@google-cla

google-cla bot commented Aug 15, 2025

Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

View this failed invocation of the CLA check for more information.

For the most up to date status, view the checks section at the bottom of the pull request.

@m0ps
Contributor

m0ps commented Aug 18, 2025

@ciroclover, I don't think it would be accepted in its current form. First of all, you have to use autogen to generate all the submodules automatically. Also, don't forget to generate the docs and run proper linting... I would recommend familiarizing yourself with the contribution guide.

ciroclover force-pushed the private-cluster-node-pool branch from 390dcc1 to d985e40 on August 20, 2025 13:46
@m0ps
Contributor

m0ps commented Aug 22, 2025

@ciroclover

Note: The correct sequence to update the repo using autogen functionality is to run make build. This will create the various Terraform files, and then generate the Terraform documentation using terraform-docs.

ref: https://github.com/terraform-google-modules/terraform-google-kubernetes-engine/blob/main/CONTRIBUTING.md#templating

ciroclover force-pushed the private-cluster-node-pool branch from cbab4aa to 3800ac7 on September 8, 2025 13:50
ciroclover closed this on Sep 8, 2025
ciroclover force-pushed the private-cluster-node-pool branch from 3800ac7 to eeaf95d on September 8, 2025 13:55
ciroclover reopened this on Sep 8, 2025
@ciroclover
Author

Rebased onto the main branch, fixed the template conflict, and ran make build. Is there anything else needed before this gets reviewed?

@apeabody
Collaborator

apeabody commented Sep 8, 2025

/gcbrun

@mzglinski

Hi, I would also benefit from this PR. Is there anything blocking further review?

@apeabody
Collaborator

/gemini review

Contributor

gemini-code-assist bot left a comment

Code Review

This pull request introduces support for creating GKE clusters without a default node pool by making the node_pool block dynamic and adjusting related logic. The changes correctly handle the case where no node pools are defined. However, I've identified a critical pre-existing issue in how the inline default node pool is configured, which this PR touches. It incorrectly uses configuration from the first user-defined node pool (var.node_pools[0]), which can lead to misconfigurations. I've provided a detailed comment on this with recommendations for a safer approach. Additionally, I've suggested a simplification for computing the cluster name, which you also hinted at in your description.

@apeabody
Collaborator

/gcbrun

@sergeyshevch

@apeabody Do you plan to merge this one? We are also trying to migrate onto compute classes, and right now it's not possible to create an empty GKE public cluster.

@apeabody
Collaborator

/gemini review

@apeabody
Collaborator

/gcbrun

Contributor

gemini-code-assist bot left a comment

Code Review

This pull request aims to add support for GKE private clusters without a default node pool. The changes involve making the node_pool block within the google_container_cluster resource dynamic, adjusting initial_node_count logic, and simplifying how the cluster name is derived for outputs. While the changes are mostly correct and improve the module's flexibility, I've found a critical issue in the condition for the dynamic node_pool block. This issue would cause Terraform to fail in scenarios where remove_default_node_pool is set to true while node_pools is not empty. My review includes suggestions to fix this logic across all affected files.

@ciroclover
Author

/gemini review

Contributor

gemini-code-assist bot left a comment

Code Review

The pull request successfully implements support for GKE private clusters without a default node pool. The changes correctly make the initial_node_count and the default node_pool block conditional based on whether node pools are explicitly defined or if the default node pool is to be removed. Additionally, the cluster_name_computed logic has been simplified to directly use var.name, which is a more robust approach when the existence of a default node pool is not guaranteed. The changes align well with the stated objective of the pull request.
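
In other words, the computed name no longer depends on any node pool resource. A minimal sketch of the simplified local (assuming the surrounding main.tf layout):

  locals {
    # Taken directly from the input variable, so it resolves even when the
    # cluster is created without a default node pool.
    cluster_name_computed = var.name
  }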

@ciroclover
Author

@apeabody I think it's good to go! 🚀

@apeabody
Collaborator

/gcbrun

@apeabody
Collaborator

> @apeabody I think it's good to go! 🚀

Thanks @ciroclover - From the test:

TestSimpleRegionalClusterAutoscaling 2025-11-18T18:48:37Z retry.go:99: Returning due to fatal error: FatalError{Underlying: error while running command: exit status 1; 
Error: googleapi: Error 400: Cluster.initial_node_count must be greater than zero.
Details:
[
  {
    "@type": "type.googleapis.com/google.rpc.RequestInfo",
    "requestId": "0xc34958235c4e21d9"
  }
]
, badRequest

  with module.example.module.gke.google_container_cluster.primary,
  on ../../../cluster.tf line 22, in resource "google_container_cluster" "primary":
  22: resource "google_container_cluster" "primary" {
}
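
The message is consistent with the conditional count logic in the diff: GKE rejects any create request in which a default node pool is still being created but no positive node count reaches the API. The expression in question, with an illustrative comment (a hedged reading, not a definitive root-cause analysis):

  initial_node_count = var.remove_default_node_pool || length(var.node_pools) > 0 ? null : max(1, var.initial_node_count)
  # Whenever this resolves to null while GKE still has to create the default
  # pool (for example, so that it can be removed afterwards), the API answers
  # "Cluster.initial_node_count must be greater than zero".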

@ciroclover
Author

/gemini review

Contributor

gemini-code-assist bot left a comment

Code Review

This pull request introduces support for GKE clusters without a default node pool by making the node_pool block dynamic and adjusting the initial_node_count logic. It also includes a good fix for how cluster_name_computed is derived, making it more robust. The changes are well-aligned with the goal of the PR. I've added a couple of suggestions to improve the readability of the new conditional logic, which is a bit complex.

@ciroclover
Author

/gemini review

Contributor

gemini-code-assist bot left a comment

Code Review

This pull request aims to add support for GKE clusters without a default node pool. The simplification of the cluster_name_computed local is a good improvement. However, I've identified two significant issues with the implementation. First, there's a critical logic flaw in how initial_node_count is calculated, which can lead to invalid API requests when remove_default_node_pool is true. Second, the configuration of the default node pool is incorrectly tied to the first user-defined node pool, which is a major design flaw that can cause unexpected behavior. I've provided detailed comments and suggestions to address these issues.

@ciroclover
Author

/gemini review

Contributor

gemini-code-assist bot left a comment

Code Review

This pull request introduces support for GKE clusters without a default node pool, which is a valuable enhancement for flexibility. The changes are well-implemented across the various module variants. The core logic involves making the default node_pool block dynamic and conditionally setting initial_node_count. Additionally, the simplification of cluster_name_computed to remove its dependency on node pools is a solid improvement that increases the module's robustness. My review includes a few suggestions to improve the readability of some conditional expressions, but overall, the changes are correct and well-executed.

@ciroclover
Author

/gemini review

Contributor

gemini-code-assist bot left a comment

Code Review

The pull request introduces support for GKE private clusters without a default node pool, dynamically managing the default node pool block, and simplifying the cluster_name_computed variable. The changes aim to provide more flexibility in cluster configuration, particularly for scenarios where a default node pool is not desired or custom node pools are exclusively used. However, there is a logical issue in the conditional rendering of the dynamic "node_pool" block that could lead to unintended resource creation.

}

{% if autopilot_cluster != true %}
initial_node_count = var.remove_default_node_pool || length(var.node_pools) > 0 ? null : max(1, var.initial_node_count)
Contributor

high

The condition for initial_node_count is currently var.remove_default_node_pool || length(var.node_pools) > 0 ? null : max(1, var.initial_node_count). This correctly sets initial_node_count to null if the default node pool is to be removed or if custom node pools are defined. However, if the dynamic "node_pool" block (lines 649-767) is intended to explicitly create the default node pool when remove_default_node_pool is false and length(var.node_pools) == 0, then this initial_node_count should also be null in that specific case to avoid conflicts or double creation. Otherwise, the implicit default node pool will still be created by GKE, potentially conflicting with the explicit dynamic "node_pool" block if its condition is also met.

To ensure the dynamic "node_pool" block is the sole mechanism for creating a default node pool (when not explicitly removed and no custom node pools are defined), this line should be adjusted to always set initial_node_count to null if a default node pool is not desired implicitly, or if the explicit dynamic "node_pool" block is meant to handle it.

  initial_node_count  = var.remove_default_node_pool || length(var.node_pools) > 0 || length(var.node_pools) == 0 ? null : max(1, var.initial_node_count)

Comment on lines 649 to 650
dynamic "node_pool" {
for_each = var.remove_default_node_pool || length(var.node_pools) == 0 ? [] : [1]
Contributor

high

The for_each condition for the dynamic "node_pool" block is currently var.remove_default_node_pool || length(var.node_pools) == 0 ? [] : [1]. This logic is problematic when var.remove_default_node_pool is false and length(var.node_pools) > 0.

In this scenario, the condition evaluates to false || false ? [] : [1], which results in [1]. This means an explicit default node pool is created even when custom node pools are defined in var.node_pools. This is likely unintended, as typically, if custom node pools are defined, an additional default node pool should not be created unless one of the custom node pools is explicitly named "default-pool".

To align with the objective of supporting clusters without a default node pool and to prevent unintended resource creation, the for_each condition should be adjusted to only create the explicit default node pool if var.remove_default_node_pool is false AND length(var.node_pools) is 0.

This change, combined with the suggested adjustment to initial_node_count on line 48, ensures that the dynamic "node_pool" block correctly handles the explicit creation of a default node pool only when necessary, and avoids conflicts with implicit default node pool creation or unintended additional node pools when custom ones are defined.

  dynamic "node_pool" {
    for_each = var.remove_default_node_pool ? [] : (length(var.node_pools) == 0 ? [1] : [])
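
For reference, the suggested expression evaluates as follows (a sketch; inline_default_pool is a hypothetical name used only for illustration):

  locals {
    inline_default_pool = var.remove_default_node_pool ? [] : (length(var.node_pools) == 0 ? [1] : [])
    # remove_default_node_pool = true,  any node_pools       -> []   no inline default pool
    # remove_default_node_pool = false, node_pools = []      -> [1]  one inline default pool
    # remove_default_node_pool = false, node_pools non-empty -> []   custom pools only
  }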

}
}

initial_node_count = var.remove_default_node_pool || length(var.node_pools) > 0 ? null : max(1, var.initial_node_count)
Contributor

high

The condition for initial_node_count is currently var.remove_default_node_pool || length(var.node_pools) > 0 ? null : max(1, var.initial_node_count). This correctly sets initial_node_count to null if the default node pool is to be removed or if custom node pools are defined. However, if the dynamic "node_pool" block (lines 506-615) is intended to explicitly create the default node pool when remove_default_node_pool is false and length(var.node_pools) == 0, then this initial_node_count should also be null in that specific case to avoid conflicts or double creation. Otherwise, the implicit default node pool will still be created by GKE, potentially conflicting with the explicit dynamic "node_pool" block if its condition is also met.

To ensure the dynamic "node_pool" block is the sole mechanism for creating a default node pool (when not explicitly removed and no custom node pools are defined), this line should be adjusted to always set initial_node_count to null if a default node pool is not desired implicitly, or if the explicit dynamic "node_pool" block is meant to handle it.

  initial_node_count = var.remove_default_node_pool || length(var.node_pools) > 0 || length(var.node_pools) == 0 ? null : max(1, var.initial_node_count)

cluster.tf Outdated
Comment on lines 506 to 507
dynamic "node_pool" {
for_each = var.remove_default_node_pool || length(var.node_pools) == 0 ? [] : [1]
Contributor

high

The for_each condition for the dynamic "node_pool" block is currently var.remove_default_node_pool || length(var.node_pools) == 0 ? [] : [1]. This logic is problematic when var.remove_default_node_pool is false and length(var.node_pools) > 0.

In this scenario, the condition evaluates to false || false ? [] : [1], which results in [1]. This means an explicit default node pool is created even when custom node pools are defined in var.node_pools. This is likely unintended, as typically, if custom node pools are defined, an additional default node pool should not be created unless one of the custom node pools is explicitly named "default-pool".

To align with the objective of supporting clusters without a default node pool and to prevent unintended resource creation, the for_each condition should be adjusted to only create the explicit default node pool if var.remove_default_node_pool is false AND length(var.node_pools) is 0.

This change, combined with the suggested adjustment to initial_node_count on line 43, ensures that the dynamic "node_pool" block correctly handles the explicit creation of a default node pool only when necessary, and avoids conflicts with implicit default node pool creation or unintended additional node pools when custom ones are defined.

  dynamic "node_pool" {
    for_each = var.remove_default_node_pool ? [] : (length(var.node_pools) == 0 ? [1] : [])

}
}

initial_node_count = var.remove_default_node_pool || length(var.node_pools) > 0 ? null : max(1, var.initial_node_count)
Contributor

high

The condition for initial_node_count is currently var.remove_default_node_pool || length(var.node_pools) > 0 ? null : max(1, var.initial_node_count). This correctly sets initial_node_count to null if the default node pool is to be removed or if custom node pools are defined. However, if the dynamic "node_pool" block (lines 543-660) is intended to explicitly create the default node pool when remove_default_node_pool is false and length(var.node_pools) == 0, then this initial_node_count should also be null in that specific case to avoid conflicts or double creation. Otherwise, the implicit default node pool will still be created by GKE, potentially conflicting with the explicit dynamic "node_pool" block if its condition is also met.

To ensure the dynamic "node_pool" block is the sole mechanism for creating a default node pool (when not explicitly removed and no custom node pools are defined), this line should be adjusted to always set initial_node_count to null if a default node pool is not desired implicitly, or if the explicit dynamic "node_pool" block is meant to handle it.

  initial_node_count = var.remove_default_node_pool || length(var.node_pools) > 0 || length(var.node_pools) == 0 ? null : max(1, var.initial_node_count)

Comment on lines 543 to 544
dynamic "node_pool" {
for_each = var.remove_default_node_pool || length(var.node_pools) == 0 ? [] : [1]
Contributor

high

The for_each condition for the dynamic "node_pool" block is currently var.remove_default_node_pool || length(var.node_pools) == 0 ? [] : [1]. This logic is problematic when var.remove_default_node_pool is false and length(var.node_pools) > 0.

In this scenario, the condition evaluates to false || false ? [] : [1], which results in [1]. This means an explicit default node pool is created even when custom node pools are defined in var.node_pools. This is likely unintended, as typically, if custom node pools are defined, an additional default node pool should not be created unless one of the custom node pools is explicitly named "default-pool".

To align with the objective of supporting clusters without a default node pool and to prevent unintended resource creation, the for_each condition should be adjusted to only create the explicit default node pool if var.remove_default_node_pool is false AND length(var.node_pools) is 0.

This change, combined with the suggested adjustment to initial_node_count on line 43, ensures that the dynamic "node_pool" block correctly handles the explicit creation of a default node pool only when necessary, and avoids conflicts with implicit default node pool creation or unintended additional node pools when custom ones are defined.

  dynamic "node_pool" {
    for_each = var.remove_default_node_pool ? [] : (length(var.node_pools) == 0 ? [1] : [])

}
}

initial_node_count = var.remove_default_node_pool || length(var.node_pools) > 0 ? null : max(1, var.initial_node_count)
Contributor

high

The condition for initial_node_count is currently var.remove_default_node_pool || length(var.node_pools) > 0 ? null : max(1, var.initial_node_count). This correctly sets initial_node_count to null if the default node pool is to be removed or if custom node pools are defined. However, if the dynamic "node_pool" block (lines 506-615) is intended to explicitly create the default node pool when remove_default_node_pool is false and length(var.node_pools) == 0, then this initial_node_count should also be null in that specific case to avoid conflicts or double creation. Otherwise, the implicit default node pool will still be created by GKE, potentially conflicting with the explicit dynamic "node_pool" block if its condition is also met.

To ensure the dynamic "node_pool" block is the sole mechanism for creating a default node pool (when not explicitly removed and no custom node pools are defined), this line should be adjusted to always set initial_node_count to null if a default node pool is not desired implicitly, or if the explicit dynamic "node_pool" block is meant to handle it.

  initial_node_count = var.remove_default_node_pool || length(var.node_pools) > 0 || length(var.node_pools) == 0 ? null : max(1, var.initial_node_count)

Comment on lines 506 to 507
dynamic "node_pool" {
for_each = var.remove_default_node_pool || length(var.node_pools) == 0 ? [] : [1]
Contributor

high

The for_each condition for the dynamic "node_pool" block is currently var.remove_default_node_pool || length(var.node_pools) == 0 ? [] : [1]. This logic is problematic when var.remove_default_node_pool is false and length(var.node_pools) > 0.

In this scenario, the condition evaluates to false || false ? [] : [1], which results in [1]. This means an explicit default node pool is created even when custom node pools are defined in var.node_pools. This is likely unintended, as typically, if custom node pools are defined, an additional default node pool should not be created unless one of the custom node pools is explicitly named "default-pool".

To align with the objective of supporting clusters without a default node pool and to prevent unintended resource creation, the for_each condition should be adjusted to only create the explicit default node pool if var.remove_default_node_pool is false AND length(var.node_pools) is 0.

This change, combined with the suggested adjustment to initial_node_count on line 43, ensures that the dynamic "node_pool" block correctly handles the explicit creation of a default node pool only when necessary, and avoids conflicts with implicit default node pool creation or unintended additional node pools when custom ones are defined.

  dynamic "node_pool" {
    for_each = var.remove_default_node_pool ? [] : (length(var.node_pools) == 0 ? [1] : [])

}
}

initial_node_count = var.remove_default_node_pool || length(var.node_pools) > 0 ? null : max(1, var.initial_node_count)
Contributor

high

The condition for initial_node_count is currently var.remove_default_node_pool || length(var.node_pools) > 0 ? null : max(1, var.initial_node_count). This correctly sets initial_node_count to null if the default node pool is to be removed or if custom node pools are defined. However, if the dynamic "node_pool" block (lines 506-615) is intended to explicitly create the default node pool when remove_default_node_pool is false and length(var.node_pools) == 0, then this initial_node_count should also be null in that specific case to avoid conflicts or double creation. Otherwise, the implicit default node pool will still be created by GKE, potentially conflicting with the explicit dynamic "node_pool" block if its condition is also met.

To ensure the dynamic "node_pool" block is the sole mechanism for creating a default node pool (when not explicitly removed and no custom node pools are defined), this line should be adjusted to always set initial_node_count to null if a default node pool is not desired implicitly, or if the explicit dynamic "node_pool" block is meant to handle it.

  initial_node_count = var.remove_default_node_pool || length(var.node_pools) > 0 || length(var.node_pools) == 0 ? null : max(1, var.initial_node_count)

Comment on lines 506 to 507
dynamic "node_pool" {
for_each = var.remove_default_node_pool || length(var.node_pools) == 0 ? [] : [1]
Contributor

high

The for_each condition for the dynamic "node_pool" block is currently var.remove_default_node_pool || length(var.node_pools) == 0 ? [] : [1]. This logic is problematic when var.remove_default_node_pool is false and length(var.node_pools) > 0.

In this scenario, the condition evaluates to false || false ? [] : [1], which results in [1]. This means an explicit default node pool is created even when custom node pools are defined in var.node_pools. This is likely unintended, as typically, if custom node pools are defined, an additional default node pool should not be created unless one of the custom node pools is explicitly named "default-pool".

To align with the objective of supporting clusters without a default node pool and to prevent unintended resource creation, the for_each condition should be adjusted to only create the explicit default node pool if var.remove_default_node_pool is false AND length(var.node_pools) is 0.

This change, combined with the suggested adjustment to initial_node_count on line 43, ensures that the dynamic "node_pool" block correctly handles the explicit creation of a default node pool only when necessary, and avoids conflicts with implicit default node pool creation or unintended additional node pools when custom ones are defined.

  dynamic "node_pool" {
    for_each = var.remove_default_node_pool ? [] : (length(var.node_pools) == 0 ? [1] : [])

@ciroclover
Author

/gemini review

Contributor

gemini-code-assist bot left a comment

Code Review

This pull request introduces support for creating GKE clusters without a default node pool by making the node_pool block dynamic. It also simplifies how the cluster_name_computed local is derived. The changes are generally in the right direction.

However, I've identified a critical issue in the implementation of the dynamic default node pool. The configuration for this pool incorrectly attempts to access var.node_pools[0], which is guaranteed to be empty when this block is created, and will cause Terraform to fail. Additionally, the initial_node_count for this default pool is not correctly handled and could lead to an invalid configuration.

I've added detailed comments on all affected cluster.tf files with suggestions for fixing these issues. Please address these points to ensure the module works as expected.
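
To illustrate the failure mode outside the module (a minimal standalone sketch, not the module code):

  variable "node_pools" {
    type    = list(map(string))
    default = []
  }

  output "default_pool_machine_type" {
    # With node_pools = [], var.node_pools[0] fails at plan time with an
    # invalid-index error, which is the same class of failure described above.
    value = lookup(var.node_pools[0], "machine_type", "e2-medium")
  }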

@ciroclover
Author

/gemini review

Contributor

gemini-code-assist bot left a comment

Code Review

This pull request introduces support for GKE clusters without a default node pool by making the node_pool block dynamic and adjusting related logic. This is a great feature enhancement. However, I've found a critical issue in the implementation of the dynamic default node pool that will cause Terraform to fail when no node pools are specified. The configuration for the default node pool incorrectly attempts to access var.node_pools[0], which is not available in that context. I've provided a detailed comment with a code suggestion to fix this issue by using default values and also to address an inconsistency in the naming of the default node pool. Please review the suggested changes carefully.

Comment on lines +649 to 768
dynamic "node_pool" {
for_each = var.remove_default_node_pool ? [] : (length(var.node_pools) == 0 ? [1] : [])
content {
name = "default-pool"
initial_node_count = max(1, var.initial_node_count)

management {
auto_repair = lookup(var.cluster_autoscaling, "auto_repair", true)
auto_upgrade = lookup(var.cluster_autoscaling, "auto_upgrade", true)
}

dynamic "gvnic" {
for_each = lookup(var.node_pools[0], "enable_gvnic", false) ? [true] : []
content {
enabled = gvnic.value
node_config {
image_type = lookup(var.node_pools[0], "image_type", "COS_CONTAINERD")
machine_type = lookup(var.node_pools[0], "machine_type", "e2-medium")
min_cpu_platform = lookup(var.node_pools[0], "min_cpu_platform", "")
enable_confidential_storage = lookup(var.node_pools[0], "enable_confidential_storage", false)
disk_type = lookup(var.node_pools[0], "disk_type", null)
dynamic "gcfs_config" {
for_each = lookup(var.node_pools[0], "enable_gcfs", null) != null ? [var.node_pools[0].enable_gcfs] : []
content {
enabled = gcfs_config.value
}
}
}

dynamic "fast_socket" {
for_each = lookup(var.node_pools[0], "enable_fast_socket", null) != null ? [var.node_pools[0].enable_fast_socket] : []
content {
enabled = fast_socket.value
dynamic "gvnic" {
for_each = lookup(var.node_pools[0], "enable_gvnic", false) ? [true] : []
content {
enabled = gvnic.value
}
}
}

dynamic "kubelet_config" {
for_each = length(setintersection(
keys(var.node_pools[0]),
["cpu_manager_policy", "cpu_cfs_quota", "cpu_cfs_quota_period", "insecure_kubelet_readonly_port_enabled", "pod_pids_limit", "container_log_max_size", "container_log_max_files", "image_gc_low_threshold_percent", "image_gc_high_threshold_percent", "image_minimum_gc_age", "image_maximum_gc_age", "allowed_unsafe_sysctls"]
)) != 0 || var.insecure_kubelet_readonly_port_enabled != null ? [1] : []
dynamic "fast_socket" {
for_each = lookup(var.node_pools[0], "enable_fast_socket", null) != null ? [var.node_pools[0].enable_fast_socket] : []
content {
enabled = fast_socket.value
}
}

content {
cpu_manager_policy = lookup(var.node_pools[0], "cpu_manager_policy", "static")
cpu_cfs_quota = lookup(var.node_pools[0], "cpu_cfs_quota", null)
cpu_cfs_quota_period = lookup(var.node_pools[0], "cpu_cfs_quota_period", null)
insecure_kubelet_readonly_port_enabled = lookup(var.node_pools[0], "insecure_kubelet_readonly_port_enabled", var.insecure_kubelet_readonly_port_enabled) != null ? upper(tostring(lookup(var.node_pools[0], "insecure_kubelet_readonly_port_enabled", var.insecure_kubelet_readonly_port_enabled))) : null
pod_pids_limit = lookup(var.node_pools[0], "pod_pids_limit", null)
container_log_max_size = lookup(var.node_pools[0], "container_log_max_size", null)
container_log_max_files = lookup(var.node_pools[0], "container_log_max_files", null)
image_gc_low_threshold_percent = lookup(var.node_pools[0], "image_gc_low_threshold_percent", null)
image_gc_high_threshold_percent = lookup(var.node_pools[0], "image_gc_high_threshold_percent", null)
image_minimum_gc_age = lookup(var.node_pools[0], "image_minimum_gc_age", null)
image_maximum_gc_age = lookup(var.node_pools[0], "image_maximum_gc_age", null)
allowed_unsafe_sysctls = lookup(var.node_pools[0], "allowed_unsafe_sysctls", null) == null ? null : [for s in split(",", lookup(var.node_pools[0], "allowed_unsafe_sysctls", null)) : trimspace(s)]
dynamic "kubelet_config" {
for_each = length(setintersection(
keys(var.node_pools[0]),
["cpu_manager_policy", "cpu_cfs_quota", "cpu_cfs_quota_period", "insecure_kubelet_readonly_port_enabled", "pod_pids_limit", "container_log_max_size", "container_log_max_files", "image_gc_low_threshold_percent", "image_gc_high_threshold_percent", "image_minimum_gc_age", "image_maximum_gc_age", "allowed_unsafe_sysctls"]
)) != 0 || var.insecure_kubelet_readonly_port_enabled != null ? [1] : []

content {
cpu_manager_policy = lookup(var.node_pools[0], "cpu_manager_policy", "static")
cpu_cfs_quota = lookup(var.node_pools[0], "cpu_cfs_quota", null)
cpu_cfs_quota_period = lookup(var.node_pools[0], "cpu_cfs_quota_period", null)
insecure_kubelet_readonly_port_enabled = lookup(var.node_pools[0], "insecure_kubelet_readonly_port_enabled", var.insecure_kubelet_readonly_port_enabled) != null ? upper(tostring(lookup(var.node_pools[0], "insecure_kubelet_readonly_port_enabled", var.insecure_kubelet_readonly_port_enabled))) : null
pod_pids_limit = lookup(var.node_pools[0], "pod_pids_limit", null)
container_log_max_size = lookup(var.node_pools[0], "container_log_max_size", null)
container_log_max_files = lookup(var.node_pools[0], "container_log_max_files", null)
image_gc_low_threshold_percent = lookup(var.node_pools[0], "image_gc_low_threshold_percent", null)
image_gc_high_threshold_percent = lookup(var.node_pools[0], "image_gc_high_threshold_percent", null)
image_minimum_gc_age = lookup(var.node_pools[0], "image_minimum_gc_age", null)
image_maximum_gc_age = lookup(var.node_pools[0], "image_maximum_gc_age", null)
allowed_unsafe_sysctls = lookup(var.node_pools[0], "allowed_unsafe_sysctls", null) == null ? null : [for s in split(",", lookup(var.node_pools[0], "allowed_unsafe_sysctls", null)) : trimspace(s)]
}
}
}

dynamic "sole_tenant_config" {
# node_affinity is currently the only member of sole_tenant_config
for_each = lookup(var.node_pools[0], "node_affinity", null) != null ? [true] : []
content {
dynamic "node_affinity" {
for_each = lookup(var.node_pools[0], "node_affinity", null) != null ? [lookup(var.node_pools[0], "node_affinity", null)] : []
content {
key = lookup(jsondecode(node_affinity.value), "key", null)
operator = lookup(jsondecode(node_affinity.value), "operator", null)
values = lookup(jsondecode(node_affinity.value), "values", [])
dynamic "sole_tenant_config" {
# node_affinity is currently the only member of sole_tenant_config
for_each = lookup(var.node_pools[0], "node_affinity", null) != null ? [true] : []
content {
dynamic "node_affinity" {
for_each = lookup(var.node_pools[0], "node_affinity", null) != null ? [lookup(var.node_pools[0], "node_affinity", null)] : []
content {
key = lookup(jsondecode(node_affinity.value), "key", null)
operator = lookup(jsondecode(node_affinity.value), "operator", null)
values = lookup(jsondecode(node_affinity.value), "values", [])
}
}
}
}
}

service_account = lookup(var.node_pools[0], "service_account", local.service_account)
service_account = lookup(var.node_pools[0], "service_account", local.service_account)

tags = concat(
lookup(local.node_pools_tags, "default_values", [true, true])[0] ? [local.cluster_network_tag] : [],
lookup(local.node_pools_tags, "default_values", [true, true])[1] ? ["${local.cluster_network_tag}-default-pool"] : [],
lookup(local.node_pools_tags, "all", []),
lookup(local.node_pools_tags, var.node_pools[0].name, []),
)
tags = concat(
lookup(local.node_pools_tags, "default_values", [true, true])[0] ? [local.cluster_network_tag] : [],
lookup(local.node_pools_tags, "default_values", [true, true])[1] ? ["${local.cluster_network_tag}-default-pool"] : [],
lookup(local.node_pools_tags, "all", []),
lookup(local.node_pools_tags, var.node_pools[0].name, []),
)

logging_variant = lookup(var.node_pools[0], "logging_variant", "DEFAULT")
logging_variant = lookup(var.node_pools[0], "logging_variant", "DEFAULT")

dynamic "workload_metadata_config" {
for_each = local.cluster_node_metadata_config
dynamic "workload_metadata_config" {
for_each = local.cluster_node_metadata_config

content {
mode = workload_metadata_config.value.mode
content {
mode = workload_metadata_config.value.mode
}
}
}

metadata = local.node_pools_metadata["all"]
metadata = local.node_pools_metadata["all"]

{% if beta_cluster %}
dynamic "sandbox_config" {
for_each = tobool((lookup(var.node_pools[0], "sandbox_enabled", var.sandbox_enabled))) ? ["gvisor"] : []
content {
sandbox_type = sandbox_config.value
{% if beta_cluster %}
dynamic "sandbox_config" {
for_each = tobool((lookup(var.node_pools[0], "sandbox_enabled", var.sandbox_enabled))) ? ["gvisor"] : []
content {
sandbox_type = sandbox_config.value
}
}
}

{% endif %}
boot_disk_kms_key = lookup(var.node_pools[0], "boot_disk_kms_key", var.boot_disk_kms_key)
{% endif %}
boot_disk_kms_key = lookup(var.node_pools[0], "boot_disk_kms_key", var.boot_disk_kms_key)

storage_pools = lookup(var.node_pools[0], "storage_pools", null) != null ? [var.node_pools[0].storage_pools] : []
storage_pools = lookup(var.node_pools[0], "storage_pools", null) != null ? [var.node_pools[0].storage_pools] : []

shielded_instance_config {
enable_secure_boot = lookup(var.node_pools[0], "enable_secure_boot", false)
enable_integrity_monitoring = lookup(var.node_pools[0], "enable_integrity_monitoring", true)
}
shielded_instance_config {
enable_secure_boot = lookup(var.node_pools[0], "enable_secure_boot", false)
enable_integrity_monitoring = lookup(var.node_pools[0], "enable_integrity_monitoring", true)
}

local_ssd_encryption_mode = lookup(var.node_pools[0], "local_ssd_encryption_mode", null)
max_run_duration = lookup(var.node_pools[0], "max_run_duration", null)
flex_start = lookup(var.node_pools[0], "flex_start", null)
local_ssd_encryption_mode = lookup(var.node_pools[0], "local_ssd_encryption_mode", null)
max_run_duration = lookup(var.node_pools[0], "max_run_duration", null)
flex_start = lookup(var.node_pools[0], "flex_start", null)
}
}
}
Contributor

critical

This dynamic node_pool block has two issues:

  1. Critical Bug: The configuration for this default node pool is created only when length(var.node_pools) == 0, but it incorrectly tries to access var.node_pools[0] for many of its attributes. This will cause a fatal error during terraform plan or apply. The configuration for this default node pool should use default values instead of trying to look them up from an empty list.
  2. Inconsistency: The name of this default node pool is hardcoded to default-pool, while other parts of the module (like local.node_pools_tags) expect default-node-pool. This will lead to incorrect configurations (e.g., tags not being applied).

I've provided a suggestion that fixes both issues by correcting the name and removing the invalid references to var.node_pools[0].

  dynamic "node_pool" {
    for_each = var.remove_default_node_pool ? [] : (length(var.node_pools) == 0 ? [1] : [])
    content {
      name               = "default-node-pool"
      initial_node_count = max(1, var.initial_node_count)

      management {
        auto_repair  = lookup(var.cluster_autoscaling, "auto_repair", true)
        auto_upgrade = lookup(var.cluster_autoscaling, "auto_upgrade", true)
      }

      node_config {
        image_type                  = "COS_CONTAINERD"
        machine_type                = "e2-medium"
        min_cpu_platform            = ""
        enable_confidential_storage = false
        disk_type                   = null
        dynamic "gcfs_config" {
          for_each = []
          content {
            enabled = gcfs_config.value
          }
        }

        dynamic "gvnic" {
          for_each = []
          content {
            enabled = gvnic.value
          }
        }

        dynamic "fast_socket" {
          for_each = []
          content {
            enabled = fast_socket.value
          }
        }

        dynamic "kubelet_config" {
          for_each = var.insecure_kubelet_readonly_port_enabled != null ? [1] : []

          content {
            cpu_manager_policy                     = "static"
            cpu_cfs_quota                          = null
            cpu_cfs_quota_period                   = null
            insecure_kubelet_readonly_port_enabled = var.insecure_kubelet_readonly_port_enabled != null ? upper(tostring(var.insecure_kubelet_readonly_port_enabled)) : null
            pod_pids_limit                         = null
            container_log_max_size                 = null
            container_log_max_files                = null
            image_gc_low_threshold_percent         = null
            image_gc_high_threshold_percent        = null
            image_minimum_gc_age                   = null
            image_maximum_gc_age                   = null
            allowed_unsafe_sysctls                 = null
          }
        }

        dynamic "sole_tenant_config" {
          # node_affinity is currently the only member of sole_tenant_config
          for_each = []
          content {
            dynamic "node_affinity" {
              for_each = []
              content {
                key      = null
                operator = null
                values   = []
              }
            }
          }
        }

        service_account = local.service_account

        tags = concat(
          lookup(local.node_pools_tags, "default_values", [true, true])[0] ? [local.cluster_network_tag] : [],
          lookup(local.node_pools_tags, "default_values", [true, true])[1] ? ["${local.cluster_network_tag}-default-node-pool"] : [],
          lookup(local.node_pools_tags, "all", []),
          lookup(local.node_pools_tags, "default-node-pool", []),
        )

        logging_variant = "DEFAULT"

        dynamic "workload_metadata_config" {
          for_each = local.cluster_node_metadata_config

          content {
            mode = workload_metadata_config.value.mode
          }
        }

        metadata = local.node_pools_metadata["all"]

        {% if beta_cluster %}
        dynamic "sandbox_config" {
          for_each = tobool(var.sandbox_enabled) ? ["gvisor"] : []
          content {
            sandbox_type = sandbox_config.value
          }
        }

        {% endif %}
        boot_disk_kms_key = var.boot_disk_kms_key

        storage_pools = null

        shielded_instance_config {
          enable_secure_boot          = false
          enable_integrity_monitoring = true
        }

        local_ssd_encryption_mode = null
        max_run_duration          = null
        flex_start                = null
      }
    }
  }



Development

Successfully merging this pull request may close these issues.

Support for GKE clusters with no default node pools
