
Conversation

@ciroclover

Fixes #2407

The diff looks much larger than it actually is because the "dynamic" block changes the indentation of the whole node_pool section.

The changes are:

  • Ensure that we define initial_node_count when it is not defined in the node pool [ref].
  • Make the node-pool block dynamic, and add a conditional so the block is not generated when node_pools = [] (sketched below) [ref].
  • Fix the cluster_name variables in main.tf. I honestly believe we could just use cluster_name_computed = var.name, but that would need more testing. [ref]
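
A minimal sketch of the shape of the change, using the conditions from the diff (illustrative only; the real block carries the full default node-pool configuration):

  initial_node_count = var.remove_default_node_pool || length(var.node_pools) > 0 ? null : max(1, var.initial_node_count)

  dynamic "node_pool" {
    # Skipped entirely when node_pools = [] or the default pool is removed.
    for_each = var.remove_default_node_pool || length(var.node_pools) == 0 ? [] : [1]
    content {
      name               = "default-pool"
      initial_node_count = max(1, var.initial_node_count)
      # ... remaining default node-pool settings ...
    }
  }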

ciroclover requested review from a team, apeabody and ericyz as code owners on August 15, 2025 19:26
@google-cla

google-cla bot commented Aug 15, 2025

Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

View this failed invocation of the CLA check for more information.

For the most up to date status, view the checks section at the bottom of the pull request.

@m0ps
Contributor

m0ps commented Aug 18, 2025

@ciroclover, I don't think it would be accepted in its current form. First of all, you have to use autogen to generate all the submodules automatically. Also, don't forget to generate the docs and run proper linting... I would recommend familiarizing yourself with the contribution guide.

ciroclover force-pushed the private-cluster-node-pool branch from 390dcc1 to d985e40 on August 20, 2025 13:46
@m0ps
Contributor

m0ps commented Aug 22, 2025

@ciroclover

Note: The correct sequence to update the repo using autogen functionality is to run make build. This will create the various Terraform files, and then generate the Terraform documentation using terraform-docs.

ref: https://github.com/terraform-google-modules/terraform-google-kubernetes-engine/blob/main/CONTRIBUTING.md#templating

ciroclover force-pushed the private-cluster-node-pool branch from cbab4aa to 3800ac7 on September 8, 2025 13:50
ciroclover closed this on Sep 8, 2025
ciroclover force-pushed the private-cluster-node-pool branch from 3800ac7 to eeaf95d on September 8, 2025 13:55
ciroclover reopened this on Sep 8, 2025
@ciroclover
Author

Rebased onto the main branch, fixed the template conflict, and ran make build. Is there anything else needed before this gets reviewed?

@apeabody
Collaborator

apeabody commented Sep 8, 2025

/gcbrun

@mzglinski

Hi, I would also benefit from this PR. Is there anything blocking further review?

@apeabody
Collaborator

/gemini review

Contributor

gemini-code-assist bot left a comment

Code Review

This pull request introduces support for creating GKE clusters without a default node pool by making the node_pool block dynamic and adjusting related logic. The changes correctly handle the case where no node pools are defined. However, I've identified a critical pre-existing issue in how the inline default node pool is configured, which this PR touches. It incorrectly uses configuration from the first user-defined node pool (var.node_pools[0]), which can lead to misconfigurations. I've provided a detailed comment on this with recommendations for a safer approach. Additionally, I've suggested a simplification for computing the cluster name, which you also hinted at in your description.

@apeabody
Collaborator

/gcbrun

@sergeyshevch

@apeabody Do you plan to merge this one? We are also trying to migrate onto compute classes, and right now it's not possible to create an empty GKE public cluster.

@apeabody
Collaborator

/gemini review

@apeabody
Collaborator

/gcbrun

Contributor

gemini-code-assist bot left a comment

Code Review

This pull request aims to add support for GKE private clusters without a default node pool. The changes involve making the node_pool block within the google_container_cluster resource dynamic, adjusting initial_node_count logic, and simplifying how the cluster name is derived for outputs. While the changes are mostly correct and improve the module's flexibility, I've found a critical issue in the condition for the dynamic node_pool block. This issue would cause Terraform to fail in scenarios where remove_default_node_pool is set to true while node_pools is not empty. My review includes suggestions to fix this logic across all affected files.

@ciroclover
Author

/gemini review

Contributor

gemini-code-assist bot left a comment

Code Review

The pull request successfully implements support for GKE private clusters without a default node pool. The changes correctly make the initial_node_count and the default node_pool block conditional based on whether node pools are explicitly defined or if the default node pool is to be removed. Additionally, the cluster_name_computed logic has been simplified to directly use var.name, which is a more robust approach when the existence of a default node pool is not guaranteed. The changes align well with the stated objective of the pull request.
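
In other words, the computed name no longer depends on any node pool resource. A minimal sketch of the simplified local (assuming the surrounding main.tf layout):

  locals {
    # Taken directly from the input variable, so it resolves even when the
    # cluster is created without a default node pool.
    cluster_name_computed = var.name
  }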

@ciroclover
Author

@apeabody I think it's good to go! 🚀

@apeabody
Collaborator

/gcbrun

@apeabody
Collaborator

> @apeabody I think it's good to go! 🚀

Thanks @ciroclover - From the test:

TestSimpleRegionalClusterAutoscaling 2025-11-18T18:48:37Z retry.go:99: Returning due to fatal error: FatalError{Underlying: error while running command: exit status 1; 
Error: googleapi: Error 400: Cluster.initial_node_count must be greater than zero.
Details:
[
  {
    "@type": "type.googleapis.com/google.rpc.RequestInfo",
    "requestId": "0xc34958235c4e21d9"
  }
]
, badRequest

  with module.example.module.gke.google_container_cluster.primary,
  on ../../../cluster.tf line 22, in resource "google_container_cluster" "primary":
  22: resource "google_container_cluster" "primary" {
}
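
The message is consistent with the conditional count logic in the diff: GKE rejects any create request in which a default node pool is still being created but no positive node count reaches the API. The expression in question, with an illustrative comment (a hedged reading, not a definitive root-cause analysis):

  initial_node_count = var.remove_default_node_pool || length(var.node_pools) > 0 ? null : max(1, var.initial_node_count)
  # Whenever this resolves to null while GKE still has to create the default
  # pool (for example, so that it can be removed afterwards), the API answers
  # "Cluster.initial_node_count must be greater than zero".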

@ciroclover
Author

/gemini review

Contributor

gemini-code-assist bot left a comment

Code Review

This pull request introduces support for GKE clusters without a default node pool by making the node_pool block dynamic and adjusting the initial_node_count logic. It also includes a good fix for how cluster_name_computed is derived, making it more robust. The changes are well-aligned with the goal of the PR. I've added a couple of suggestions to improve the readability of the new conditional logic, which is a bit complex.

@ciroclover
Author

/gemini review

Contributor

gemini-code-assist bot left a comment

Code Review

This pull request aims to add support for GKE clusters without a default node pool. The simplification of the cluster_name_computed local is a good improvement. However, I've identified two significant issues with the implementation. First, there's a critical logic flaw in how initial_node_count is calculated, which can lead to invalid API requests when remove_default_node_pool is true. Second, the configuration of the default node pool is incorrectly tied to the first user-defined node pool, which is a major design flaw that can cause unexpected behavior. I've provided detailed comments and suggestions to address these issues.

@ciroclover
Author

/gemini review

Contributor

gemini-code-assist bot left a comment

Code Review

This pull request introduces support for GKE clusters without a default node pool, which is a valuable enhancement for flexibility. The changes are well-implemented across the various module variants. The core logic involves making the default node_pool block dynamic and conditionally setting initial_node_count. Additionally, the simplification of cluster_name_computed to remove its dependency on node pools is a solid improvement that increases the module's robustness. My review includes a few suggestions to improve the readability of some conditional expressions, but overall, the changes are correct and well-executed.

@ciroclover
Author

/gemini review

Contributor

gemini-code-assist bot left a comment

Code Review

The pull request introduces support for GKE private clusters without a default node pool, dynamically managing the default node pool block, and simplifying the cluster_name_computed variable. The changes aim to provide more flexibility in cluster configuration, particularly for scenarios where a default node pool is not desired or custom node pools are exclusively used. However, there is a logical issue in the conditional rendering of the dynamic "node_pool" block that could lead to unintended resource creation.

}

{% if autopilot_cluster != true %}
initial_node_count = var.remove_default_node_pool || length(var.node_pools) > 0 ? null : max(1, var.initial_node_count)
Contributor

high

The condition for initial_node_count is currently var.remove_default_node_pool || length(var.node_pools) > 0 ? null : max(1, var.initial_node_count). This correctly sets initial_node_count to null if the default node pool is to be removed or if custom node pools are defined. However, if the dynamic "node_pool" block (lines 649-767) is intended to explicitly create the default node pool when remove_default_node_pool is false and length(var.node_pools) == 0, then this initial_node_count should also be null in that specific case to avoid conflicts or double creation. Otherwise, the implicit default node pool will still be created by GKE, potentially conflicting with the explicit dynamic "node_pool" block if its condition is also met.

To ensure the dynamic "node_pool" block is the sole mechanism for creating a default node pool (when not explicitly removed and no custom node pools are defined), this line should be adjusted to always set initial_node_count to null if a default node pool is not desired implicitly, or if the explicit dynamic "node_pool" block is meant to handle it.

  initial_node_count  = var.remove_default_node_pool || length(var.node_pools) > 0 || length(var.node_pools) == 0 ? null : max(1, var.initial_node_count)

Comment on lines 649 to 650
dynamic "node_pool" {
for_each = var.remove_default_node_pool || length(var.node_pools) == 0 ? [] : [1]
Contributor

high

The for_each condition for the dynamic "node_pool" block is currently var.remove_default_node_pool || length(var.node_pools) == 0 ? [] : [1]. This logic is problematic when var.remove_default_node_pool is false and length(var.node_pools) > 0.

In this scenario, the condition evaluates to false || false ? [] : [1], which results in [1]. This means an explicit default node pool is created even when custom node pools are defined in var.node_pools. This is likely unintended, as typically, if custom node pools are defined, an additional default node pool should not be created unless one of the custom node pools is explicitly named "default-pool".

To align with the objective of supporting clusters without a default node pool and to prevent unintended resource creation, the for_each condition should be adjusted to only create the explicit default node pool if var.remove_default_node_pool is false AND length(var.node_pools) is 0.

This change, combined with the suggested adjustment to initial_node_count on line 48, ensures that the dynamic "node_pool" block correctly handles the explicit creation of a default node pool only when necessary, and avoids conflicts with implicit default node pool creation or unintended additional node pools when custom ones are defined.

  dynamic "node_pool" {
    for_each = var.remove_default_node_pool ? [] : (length(var.node_pools) == 0 ? [1] : [])
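
For reference, the suggested expression evaluates as follows (a sketch; inline_default_pool is a hypothetical name used only for illustration):

  locals {
    inline_default_pool = var.remove_default_node_pool ? [] : (length(var.node_pools) == 0 ? [1] : [])
    # remove_default_node_pool = true,  any node_pools       -> []   no inline default pool
    # remove_default_node_pool = false, node_pools = []      -> [1]  one inline default pool
    # remove_default_node_pool = false, node_pools non-empty -> []   custom pools only
  }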

}
}

initial_node_count = var.remove_default_node_pool || length(var.node_pools) > 0 ? null : max(1, var.initial_node_count)
Contributor

high

The condition for initial_node_count is currently var.remove_default_node_pool || length(var.node_pools) > 0 ? null : max(1, var.initial_node_count). This correctly sets initial_node_count to null if the default node pool is to be removed or if custom node pools are defined. However, if the dynamic "node_pool" block (lines 506-615) is intended to explicitly create the default node pool when remove_default_node_pool is false and length(var.node_pools) == 0, then this initial_node_count should also be null in that specific case to avoid conflicts or double creation. Otherwise, the implicit default node pool will still be created by GKE, potentially conflicting with the explicit dynamic "node_pool" block if its condition is also met.

To ensure the dynamic "node_pool" block is the sole mechanism for creating a default node pool (when not explicitly removed and no custom node pools are defined), this line should be adjusted to always set initial_node_count to null if a default node pool is not desired implicitly, or if the explicit dynamic "node_pool" block is meant to handle it.

  initial_node_count = var.remove_default_node_pool || length(var.node_pools) > 0 || length(var.node_pools) == 0 ? null : max(1, var.initial_node_count)

cluster.tf Outdated
Comment on lines 506 to 507
dynamic "node_pool" {
for_each = var.remove_default_node_pool || length(var.node_pools) == 0 ? [] : [1]
Contributor

high

The for_each condition for the dynamic "node_pool" block is currently var.remove_default_node_pool || length(var.node_pools) == 0 ? [] : [1]. This logic is problematic when var.remove_default_node_pool is false and length(var.node_pools) > 0.

In this scenario, the condition evaluates to false || false ? [] : [1], which results in [1]. This means an explicit default node pool is created even when custom node pools are defined in var.node_pools. This is likely unintended, as typically, if custom node pools are defined, an additional default node pool should not be created unless one of the custom node pools is explicitly named "default-pool".

To align with the objective of supporting clusters without a default node pool and to prevent unintended resource creation, the for_each condition should be adjusted to only create the explicit default node pool if var.remove_default_node_pool is false AND length(var.node_pools) is 0.

This change, combined with the suggested adjustment to initial_node_count on line 43, ensures that the dynamic "node_pool" block correctly handles the explicit creation of a default node pool only when necessary, and avoids conflicts with implicit default node pool creation or unintended additional node pools when custom ones are defined.

  dynamic "node_pool" {
    for_each = var.remove_default_node_pool ? [] : (length(var.node_pools) == 0 ? [1] : [])

}
}

initial_node_count = var.remove_default_node_pool || length(var.node_pools) > 0 ? null : max(1, var.initial_node_count)
Contributor

high

The condition for initial_node_count is currently var.remove_default_node_pool || length(var.node_pools) > 0 ? null : max(1, var.initial_node_count). This correctly sets initial_node_count to null if the default node pool is to be removed or if custom node pools are defined. However, if the dynamic "node_pool" block (lines 543-660) is intended to explicitly create the default node pool when remove_default_node_pool is false and length(var.node_pools) == 0, then this initial_node_count should also be null in that specific case to avoid conflicts or double creation. Otherwise, the implicit default node pool will still be created by GKE, potentially conflicting with the explicit dynamic "node_pool" block if its condition is also met.

To ensure the dynamic "node_pool" block is the sole mechanism for creating a default node pool (when not explicitly removed and no custom node pools are defined), this line should be adjusted to always set initial_node_count to null if a default node pool is not desired implicitly, or if the explicit dynamic "node_pool" block is meant to handle it.

  initial_node_count = var.remove_default_node_pool || length(var.node_pools) > 0 || length(var.node_pools) == 0 ? null : max(1, var.initial_node_count)

Comment on lines 543 to 544
dynamic "node_pool" {
for_each = var.remove_default_node_pool || length(var.node_pools) == 0 ? [] : [1]
Contributor

high

The for_each condition for the dynamic "node_pool" block is currently var.remove_default_node_pool || length(var.node_pools) == 0 ? [] : [1]. This logic is problematic when var.remove_default_node_pool is false and length(var.node_pools) > 0.

In this scenario, the condition evaluates to false || false ? [] : [1], which results in [1]. This means an explicit default node pool is created even when custom node pools are defined in var.node_pools. This is likely unintended, as typically, if custom node pools are defined, an additional default node pool should not be created unless one of the custom node pools is explicitly named "default-pool".

To align with the objective of supporting clusters without a default node pool and to prevent unintended resource creation, the for_each condition should be adjusted to only create the explicit default node pool if var.remove_default_node_pool is false AND length(var.node_pools) is 0.

This change, combined with the suggested adjustment to initial_node_count on line 43, ensures that the dynamic "node_pool" block correctly handles the explicit creation of a default node pool only when necessary, and avoids conflicts with implicit default node pool creation or unintended additional node pools when custom ones are defined.

  dynamic "node_pool" {
    for_each = var.remove_default_node_pool ? [] : (length(var.node_pools) == 0 ? [1] : [])

}
}

initial_node_count = var.remove_default_node_pool || length(var.node_pools) > 0 ? null : max(1, var.initial_node_count)
Contributor

high

The condition for initial_node_count is currently var.remove_default_node_pool || length(var.node_pools) > 0 ? null : max(1, var.initial_node_count). This correctly sets initial_node_count to null if the default node pool is to be removed or if custom node pools are defined. However, if the dynamic "node_pool" block (lines 506-615) is intended to explicitly create the default node pool when remove_default_node_pool is false and length(var.node_pools) == 0, then this initial_node_count should also be null in that specific case to avoid conflicts or double creation. Otherwise, the implicit default node pool will still be created by GKE, potentially conflicting with the explicit dynamic "node_pool" block if its condition is also met.

To ensure the dynamic "node_pool" block is the sole mechanism for creating a default node pool (when not explicitly removed and no custom node pools are defined), this line should be adjusted to always set initial_node_count to null if a default node pool is not desired implicitly, or if the explicit dynamic "node_pool" block is meant to handle it.

  initial_node_count = var.remove_default_node_pool || length(var.node_pools) > 0 || length(var.node_pools) == 0 ? null : max(1, var.initial_node_count)

Comment on lines 506 to 507
dynamic "node_pool" {
for_each = var.remove_default_node_pool || length(var.node_pools) == 0 ? [] : [1]
Contributor

high

The for_each condition for the dynamic "node_pool" block is currently var.remove_default_node_pool || length(var.node_pools) == 0 ? [] : [1]. This logic is problematic when var.remove_default_node_pool is false and length(var.node_pools) > 0.

In this scenario, the condition evaluates to false || false ? [] : [1], which results in [1]. This means an explicit default node pool is created even when custom node pools are defined in var.node_pools. This is likely unintended, as typically, if custom node pools are defined, an additional default node pool should not be created unless one of the custom node pools is explicitly named "default-pool".

To align with the objective of supporting clusters without a default node pool and to prevent unintended resource creation, the for_each condition should be adjusted to only create the explicit default node pool if var.remove_default_node_pool is false AND length(var.node_pools) is 0.

This change, combined with the suggested adjustment to initial_node_count on line 43, ensures that the dynamic "node_pool" block correctly handles the explicit creation of a default node pool only when necessary, and avoids conflicts with implicit default node pool creation or unintended additional node pools when custom ones are defined.

  dynamic "node_pool" {
    for_each = var.remove_default_node_pool ? [] : (length(var.node_pools) == 0 ? [1] : [])

}
}

initial_node_count = var.remove_default_node_pool || length(var.node_pools) > 0 ? null : max(1, var.initial_node_count)
Contributor

high

The condition for initial_node_count is currently var.remove_default_node_pool || length(var.node_pools) > 0 ? null : max(1, var.initial_node_count). This correctly sets initial_node_count to null if the default node pool is to be removed or if custom node pools are defined. However, if the dynamic "node_pool" block (lines 506-615) is intended to explicitly create the default node pool when remove_default_node_pool is false and length(var.node_pools) == 0, then this initial_node_count should also be null in that specific case to avoid conflicts or double creation. Otherwise, the implicit default node pool will still be created by GKE, potentially conflicting with the explicit dynamic "node_pool" block if its condition is also met.

To ensure the dynamic "node_pool" block is the sole mechanism for creating a default node pool (when not explicitly removed and no custom node pools are defined), this line should be adjusted to always set initial_node_count to null if a default node pool is not desired implicitly, or if the explicit dynamic "node_pool" block is meant to handle it.

  initial_node_count = var.remove_default_node_pool || length(var.node_pools) > 0 || length(var.node_pools) == 0 ? null : max(1, var.initial_node_count)

Comment on lines 506 to 507
dynamic "node_pool" {
for_each = var.remove_default_node_pool || length(var.node_pools) == 0 ? [] : [1]
Contributor

high

The for_each condition for the dynamic "node_pool" block is currently var.remove_default_node_pool || length(var.node_pools) == 0 ? [] : [1]. This logic is problematic when var.remove_default_node_pool is false and length(var.node_pools) > 0.

In this scenario, the condition evaluates to false || false ? [] : [1], which results in [1]. This means an explicit default node pool is created even when custom node pools are defined in var.node_pools. This is likely unintended, as typically, if custom node pools are defined, an additional default node pool should not be created unless one of the custom node pools is explicitly named "default-pool".

To align with the objective of supporting clusters without a default node pool and to prevent unintended resource creation, the for_each condition should be adjusted to only create the explicit default node pool if var.remove_default_node_pool is false AND length(var.node_pools) is 0.

This change, combined with the suggested adjustment to initial_node_count on line 43, ensures that the dynamic "node_pool" block correctly handles the explicit creation of a default node pool only when necessary, and avoids conflicts with implicit default node pool creation or unintended additional node pools when custom ones are defined.

  dynamic "node_pool" {
    for_each = var.remove_default_node_pool ? [] : (length(var.node_pools) == 0 ? [1] : [])

@ciroclover
Author

/gemini review

Contributor

gemini-code-assist bot left a comment

Code Review

This pull request introduces support for creating GKE clusters without a default node pool by making the node_pool block dynamic. It also simplifies how the cluster_name_computed local is derived. The changes are generally in the right direction.

However, I've identified a critical issue in the implementation of the dynamic default node pool. The configuration for this pool incorrectly attempts to access var.node_pools[0], which is guaranteed to be empty when this block is created, and will cause Terraform to fail. Additionally, the initial_node_count for this default pool is not correctly handled and could lead to an invalid configuration.

I've added detailed comments on all affected cluster.tf files with suggestions for fixing these issues. Please address these points to ensure the module works as expected.
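
To illustrate the failure mode outside the module (a minimal standalone sketch, not the module code):

  variable "node_pools" {
    type    = list(map(string))
    default = []
  }

  output "default_pool_machine_type" {
    # With node_pools = [], var.node_pools[0] fails at plan time with an
    # invalid-index error, which is the same class of failure described above.
    value = lookup(var.node_pools[0], "machine_type", "e2-medium")
  }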

@ciroclover
Author

/gemini review

Contributor

gemini-code-assist bot left a comment

Code Review

This pull request introduces support for GKE clusters without a default node pool by making the node_pool block dynamic and adjusting related logic. This is a great feature enhancement. However, I've found a critical issue in the implementation of the dynamic default node pool that will cause Terraform to fail when no node pools are specified. The configuration for the default node pool incorrectly attempts to access var.node_pools[0], which is not available in that context. I've provided a detailed comment with a code suggestion to fix this issue by using default values and also to address an inconsistency in the naming of the default node pool. Please review the suggested changes carefully.

Comment on lines +649 to 768
dynamic "node_pool" {
for_each = var.remove_default_node_pool ? [] : (length(var.node_pools) == 0 ? [1] : [])
content {
name = "default-pool"
initial_node_count = max(1, var.initial_node_count)

management {
auto_repair = lookup(var.cluster_autoscaling, "auto_repair", true)
auto_upgrade = lookup(var.cluster_autoscaling, "auto_upgrade", true)
}

dynamic "gvnic" {
for_each = lookup(var.node_pools[0], "enable_gvnic", false) ? [true] : []
content {
enabled = gvnic.value
node_config {
image_type = lookup(var.node_pools[0], "image_type", "COS_CONTAINERD")
machine_type = lookup(var.node_pools[0], "machine_type", "e2-medium")
min_cpu_platform = lookup(var.node_pools[0], "min_cpu_platform", "")
enable_confidential_storage = lookup(var.node_pools[0], "enable_confidential_storage", false)
disk_type = lookup(var.node_pools[0], "disk_type", null)
dynamic "gcfs_config" {
for_each = lookup(var.node_pools[0], "enable_gcfs", null) != null ? [var.node_pools[0].enable_gcfs] : []
content {
enabled = gcfs_config.value
}
}
}

dynamic "fast_socket" {
for_each = lookup(var.node_pools[0], "enable_fast_socket", null) != null ? [var.node_pools[0].enable_fast_socket] : []
content {
enabled = fast_socket.value
dynamic "gvnic" {
for_each = lookup(var.node_pools[0], "enable_gvnic", false) ? [true] : []
content {
enabled = gvnic.value
}
}
}

dynamic "kubelet_config" {
for_each = length(setintersection(
keys(var.node_pools[0]),
["cpu_manager_policy", "cpu_cfs_quota", "cpu_cfs_quota_period", "insecure_kubelet_readonly_port_enabled", "pod_pids_limit", "container_log_max_size", "container_log_max_files", "image_gc_low_threshold_percent", "image_gc_high_threshold_percent", "image_minimum_gc_age", "image_maximum_gc_age", "allowed_unsafe_sysctls"]
)) != 0 || var.insecure_kubelet_readonly_port_enabled != null ? [1] : []
dynamic "fast_socket" {
for_each = lookup(var.node_pools[0], "enable_fast_socket", null) != null ? [var.node_pools[0].enable_fast_socket] : []
content {
enabled = fast_socket.value
}
}

content {
cpu_manager_policy = lookup(var.node_pools[0], "cpu_manager_policy", "static")
cpu_cfs_quota = lookup(var.node_pools[0], "cpu_cfs_quota", null)
cpu_cfs_quota_period = lookup(var.node_pools[0], "cpu_cfs_quota_period", null)
insecure_kubelet_readonly_port_enabled = lookup(var.node_pools[0], "insecure_kubelet_readonly_port_enabled", var.insecure_kubelet_readonly_port_enabled) != null ? upper(tostring(lookup(var.node_pools[0], "insecure_kubelet_readonly_port_enabled", var.insecure_kubelet_readonly_port_enabled))) : null
pod_pids_limit = lookup(var.node_pools[0], "pod_pids_limit", null)
container_log_max_size = lookup(var.node_pools[0], "container_log_max_size", null)
container_log_max_files = lookup(var.node_pools[0], "container_log_max_files", null)
image_gc_low_threshold_percent = lookup(var.node_pools[0], "image_gc_low_threshold_percent", null)
image_gc_high_threshold_percent = lookup(var.node_pools[0], "image_gc_high_threshold_percent", null)
image_minimum_gc_age = lookup(var.node_pools[0], "image_minimum_gc_age", null)
image_maximum_gc_age = lookup(var.node_pools[0], "image_maximum_gc_age", null)
allowed_unsafe_sysctls = lookup(var.node_pools[0], "allowed_unsafe_sysctls", null) == null ? null : [for s in split(",", lookup(var.node_pools[0], "allowed_unsafe_sysctls", null)) : trimspace(s)]
dynamic "kubelet_config" {
for_each = length(setintersection(
keys(var.node_pools[0]),
["cpu_manager_policy", "cpu_cfs_quota", "cpu_cfs_quota_period", "insecure_kubelet_readonly_port_enabled", "pod_pids_limit", "container_log_max_size", "container_log_max_files", "image_gc_low_threshold_percent", "image_gc_high_threshold_percent", "image_minimum_gc_age", "image_maximum_gc_age", "allowed_unsafe_sysctls"]
)) != 0 || var.insecure_kubelet_readonly_port_enabled != null ? [1] : []

content {
cpu_manager_policy = lookup(var.node_pools[0], "cpu_manager_policy", "static")
cpu_cfs_quota = lookup(var.node_pools[0], "cpu_cfs_quota", null)
cpu_cfs_quota_period = lookup(var.node_pools[0], "cpu_cfs_quota_period", null)
insecure_kubelet_readonly_port_enabled = lookup(var.node_pools[0], "insecure_kubelet_readonly_port_enabled", var.insecure_kubelet_readonly_port_enabled) != null ? upper(tostring(lookup(var.node_pools[0], "insecure_kubelet_readonly_port_enabled", var.insecure_kubelet_readonly_port_enabled))) : null
pod_pids_limit = lookup(var.node_pools[0], "pod_pids_limit", null)
container_log_max_size = lookup(var.node_pools[0], "container_log_max_size", null)
container_log_max_files = lookup(var.node_pools[0], "container_log_max_files", null)
image_gc_low_threshold_percent = lookup(var.node_pools[0], "image_gc_low_threshold_percent", null)
image_gc_high_threshold_percent = lookup(var.node_pools[0], "image_gc_high_threshold_percent", null)
image_minimum_gc_age = lookup(var.node_pools[0], "image_minimum_gc_age", null)
image_maximum_gc_age = lookup(var.node_pools[0], "image_maximum_gc_age", null)
allowed_unsafe_sysctls = lookup(var.node_pools[0], "allowed_unsafe_sysctls", null) == null ? null : [for s in split(",", lookup(var.node_pools[0], "allowed_unsafe_sysctls", null)) : trimspace(s)]
}
}
}

dynamic "sole_tenant_config" {
# node_affinity is currently the only member of sole_tenant_config
for_each = lookup(var.node_pools[0], "node_affinity", null) != null ? [true] : []
content {
dynamic "node_affinity" {
for_each = lookup(var.node_pools[0], "node_affinity", null) != null ? [lookup(var.node_pools[0], "node_affinity", null)] : []
content {
key = lookup(jsondecode(node_affinity.value), "key", null)
operator = lookup(jsondecode(node_affinity.value), "operator", null)
values = lookup(jsondecode(node_affinity.value), "values", [])
dynamic "sole_tenant_config" {
# node_affinity is currently the only member of sole_tenant_config
for_each = lookup(var.node_pools[0], "node_affinity", null) != null ? [true] : []
content {
dynamic "node_affinity" {
for_each = lookup(var.node_pools[0], "node_affinity", null) != null ? [lookup(var.node_pools[0], "node_affinity", null)] : []
content {
key = lookup(jsondecode(node_affinity.value), "key", null)
operator = lookup(jsondecode(node_affinity.value), "operator", null)
values = lookup(jsondecode(node_affinity.value), "values", [])
}
}
}
}
}

service_account = lookup(var.node_pools[0], "service_account", local.service_account)
service_account = lookup(var.node_pools[0], "service_account", local.service_account)

tags = concat(
lookup(local.node_pools_tags, "default_values", [true, true])[0] ? [local.cluster_network_tag] : [],
lookup(local.node_pools_tags, "default_values", [true, true])[1] ? ["${local.cluster_network_tag}-default-pool"] : [],
lookup(local.node_pools_tags, "all", []),
lookup(local.node_pools_tags, var.node_pools[0].name, []),
)
tags = concat(
lookup(local.node_pools_tags, "default_values", [true, true])[0] ? [local.cluster_network_tag] : [],
lookup(local.node_pools_tags, "default_values", [true, true])[1] ? ["${local.cluster_network_tag}-default-pool"] : [],
lookup(local.node_pools_tags, "all", []),
lookup(local.node_pools_tags, var.node_pools[0].name, []),
)

logging_variant = lookup(var.node_pools[0], "logging_variant", "DEFAULT")
logging_variant = lookup(var.node_pools[0], "logging_variant", "DEFAULT")

dynamic "workload_metadata_config" {
for_each = local.cluster_node_metadata_config
dynamic "workload_metadata_config" {
for_each = local.cluster_node_metadata_config

content {
mode = workload_metadata_config.value.mode
content {
mode = workload_metadata_config.value.mode
}
}
}

metadata = local.node_pools_metadata["all"]
metadata = local.node_pools_metadata["all"]

{% if beta_cluster %}
dynamic "sandbox_config" {
for_each = tobool((lookup(var.node_pools[0], "sandbox_enabled", var.sandbox_enabled))) ? ["gvisor"] : []
content {
sandbox_type = sandbox_config.value
{% if beta_cluster %}
dynamic "sandbox_config" {
for_each = tobool((lookup(var.node_pools[0], "sandbox_enabled", var.sandbox_enabled))) ? ["gvisor"] : []
content {
sandbox_type = sandbox_config.value
}
}
}

{% endif %}
boot_disk_kms_key = lookup(var.node_pools[0], "boot_disk_kms_key", var.boot_disk_kms_key)
{% endif %}
boot_disk_kms_key = lookup(var.node_pools[0], "boot_disk_kms_key", var.boot_disk_kms_key)

storage_pools = lookup(var.node_pools[0], "storage_pools", null) != null ? [var.node_pools[0].storage_pools] : []
storage_pools = lookup(var.node_pools[0], "storage_pools", null) != null ? [var.node_pools[0].storage_pools] : []

shielded_instance_config {
enable_secure_boot = lookup(var.node_pools[0], "enable_secure_boot", false)
enable_integrity_monitoring = lookup(var.node_pools[0], "enable_integrity_monitoring", true)
}
shielded_instance_config {
enable_secure_boot = lookup(var.node_pools[0], "enable_secure_boot", false)
enable_integrity_monitoring = lookup(var.node_pools[0], "enable_integrity_monitoring", true)
}

local_ssd_encryption_mode = lookup(var.node_pools[0], "local_ssd_encryption_mode", null)
max_run_duration = lookup(var.node_pools[0], "max_run_duration", null)
flex_start = lookup(var.node_pools[0], "flex_start", null)
local_ssd_encryption_mode = lookup(var.node_pools[0], "local_ssd_encryption_mode", null)
max_run_duration = lookup(var.node_pools[0], "max_run_duration", null)
flex_start = lookup(var.node_pools[0], "flex_start", null)
}
}
}
Contributor

critical

This dynamic node_pool block has two issues:

  1. Critical Bug: The configuration for this default node pool is created only when length(var.node_pools) == 0, but it incorrectly tries to access var.node_pools[0] for many of its attributes. This will cause a fatal error during terraform plan or apply. The configuration for this default node pool should use default values instead of trying to look them up from an empty list.
  2. Inconsistency: The name of this default node pool is hardcoded to default-pool, while other parts of the module (like local.node_pools_tags) expect default-node-pool. This will lead to incorrect configurations (e.g., tags not being applied).

I've provided a suggestion that fixes both issues by correcting the name and removing the invalid references to var.node_pools[0].

  dynamic "node_pool" {
    for_each = var.remove_default_node_pool ? [] : (length(var.node_pools) == 0 ? [1] : [])
    content {
      name               = "default-node-pool"
      initial_node_count = max(1, var.initial_node_count)

      management {
        auto_repair  = lookup(var.cluster_autoscaling, "auto_repair", true)
        auto_upgrade = lookup(var.cluster_autoscaling, "auto_upgrade", true)
      }

      node_config {
        image_type                  = "COS_CONTAINERD"
        machine_type                = "e2-medium"
        min_cpu_platform            = ""
        enable_confidential_storage = false
        disk_type                   = null
        dynamic "gcfs_config" {
          for_each = []
          content {
            enabled = gcfs_config.value
          }
        }

        dynamic "gvnic" {
          for_each = []
          content {
            enabled = gvnic.value
          }
        }

        dynamic "fast_socket" {
          for_each = []
          content {
            enabled = fast_socket.value
          }
        }

        dynamic "kubelet_config" {
          for_each = var.insecure_kubelet_readonly_port_enabled != null ? [1] : []

          content {
            cpu_manager_policy                     = "static"
            cpu_cfs_quota                          = null
            cpu_cfs_quota_period                   = null
            insecure_kubelet_readonly_port_enabled = var.insecure_kubelet_readonly_port_enabled != null ? upper(tostring(var.insecure_kubelet_readonly_port_enabled)) : null
            pod_pids_limit                         = null
            container_log_max_size                 = null
            container_log_max_files                = null
            image_gc_low_threshold_percent         = null
            image_gc_high_threshold_percent        = null
            image_minimum_gc_age                   = null
            image_maximum_gc_age                   = null
            allowed_unsafe_sysctls                 = null
          }
        }

        dynamic "sole_tenant_config" {
          # node_affinity is currently the only member of sole_tenant_config
          for_each = []
          content {
            dynamic "node_affinity" {
              for_each = []
              content {
                key      = null
                operator = null
                values   = []
              }
            }
          }
        }

        service_account = local.service_account

        tags = concat(
          lookup(local.node_pools_tags, "default_values", [true, true])[0] ? [local.cluster_network_tag] : [],
          lookup(local.node_pools_tags, "default_values", [true, true])[1] ? ["${local.cluster_network_tag}-default-node-pool"] : [],
          lookup(local.node_pools_tags, "all", []),
          lookup(local.node_pools_tags, "default-node-pool", []),
        )

        logging_variant = "DEFAULT"

        dynamic "workload_metadata_config" {
          for_each = local.cluster_node_metadata_config

          content {
            mode = workload_metadata_config.value.mode
          }
        }

        metadata = local.node_pools_metadata["all"]

        {% if beta_cluster %}
        dynamic "sandbox_config" {
          for_each = tobool(var.sandbox_enabled) ? ["gvisor"] : []
          content {
            sandbox_type = sandbox_config.value
          }
        }

        {% endif %}
        boot_disk_kms_key = var.boot_disk_kms_key

        storage_pools = null

        shielded_instance_config {
          enable_secure_boot          = false
          enable_integrity_monitoring = true
        }

        local_ssd_encryption_mode = null
        max_run_duration          = null
        flex_start                = null
      }
    }
  }



Development

Successfully merging this pull request may close these issues.

Support for GKE clusters with no default node pools
