Commit ff35be6

Merge pull request #103581 from mburke5678/autoscale-cordon-terminate
OSDOCS 15817 Support cordon-node-before-terminating in cluster-autoscaler
2 parents d5b74a9 + 324aea5 commit ff35be6

2 files changed: 12 additions, 1 deletion

modules/cluster-autoscaler-about.adoc

Lines changed: 5 additions & 1 deletion
@@ -39,7 +39,7 @@ endif::openshift-rosa-hcp[]
 ifndef::openshift-rosa-hcp[]
 [IMPORTANT]
 ====
-Ensure that the `maxNodesTotal` value in the `ClusterAutoscaler` resource definition that you create is large enough to account for the total possible number of machines in your cluster. This value must encompass the number of control plane machines and the possible number of compute machines that you might scale to.
+Ensure that the `maxNodesTotal` value in the `ClusterAutoscaler` custom resource (CR) that you create is large enough to account for the total possible number of machines in your cluster. This value must encompass the number of control plane machines and the possible number of compute machines that you might scale to.
 ====
 endif::openshift-rosa-hcp[]

@@ -65,6 +65,10 @@ If the following types of pods are present on a node, the cluster autoscaler wil

 For example, you set the maximum CPU limit to 64 cores and configure the cluster autoscaler to only create machines that have 8 cores each. If your cluster starts with 30 cores, the cluster autoscaler can add up to 4 more nodes with 32 cores, for a total of 62.

+[NOTE]
+====
+By default, when the cluster autoscaler removes a node, it does not cordon the node when draining the pods from the node. You can configure the cluster autoscaler to cordon the node before draining and moving the pods by setting the `spec.scaleDown.cordonNodeBeforeTerminating` parameter to `Enabled` in the `ClusterAutoscaler` CR. This parameter is disabled by default. It is recommended to enable this parameter in production clusters because of the risk of data loss, application errors, pods getting stuck in the terminating state, or other issues if the cluster autoscaler removes a node when the parameter is disabled. Leaving this parameter disabled, which can result in faster node removal, might be appropriate in clusters that run only stateless workloads.
+====

 [id="cluster-autoscaler-limitations_{context}"]
 == Limitations

modules/cluster-autoscaler-cr.adoc

Lines changed: 7 additions & 0 deletions
@@ -37,6 +37,7 @@ spec:
       max: 16
   logVerbosity: 4
   scaleDown:
+    cordonNodeBeforeTerminating: Enabled
     enabled: true
     delayAfterAdd: 10m
     delayAfterDelete: 5m
@@ -95,6 +96,12 @@ If you do not specify a value, the default value of `1` is used.
 |`scaleDown`
 |In this section, you can specify the period to wait for each action by using any valid link:https://golang.org/pkg/time/#ParseDuration[ParseDuration] interval, including `ns`, `us`, `ms`, `s`, `m`, and `h`.

+|`scaleDown.cordonNodeBeforeTerminating`
+a|Optional: Specify whether the cluster autoscaler should cordon a node before removing that node by using one of the following values:
+
+* `Enabled`: The cluster autoscaler cordons the node before draining any pods and removing that node.
+* `Disabled`: The cluster autoscaler does not cordon the node before draining any pods and removing that node. This is the default.
+
 |`scaleDown.enabled`
 |Specify whether the cluster autoscaler can remove unnecessary nodes.
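
For orientation, here is a minimal sketch of a `ClusterAutoscaler` CR that uses the new field on its own. Only the `scaleDown` fields come from this diff. The `apiVersion`, `kind`, and `metadata.name` values are assumptions based on the usual shape of this resource and do not appear in the changed lines, so check them against the full `cluster-autoscaler-cr.adoc` example before relying on them.

```yaml
# Sketch only: header fields are assumed, not taken from this commit.
apiVersion: autoscaling.openshift.io/v1
kind: ClusterAutoscaler
metadata:
  name: default
spec:
  scaleDown:
    # Allow the cluster autoscaler to remove unnecessary nodes.
    enabled: true
    # Cordon the node before draining pods and removing it.
    # Accepted values per this diff: Enabled, Disabled (the default).
    cordonNodeBeforeTerminating: Enabled
```

Omitting `cordonNodeBeforeTerminating` should behave the same as setting it to `Disabled`, because the parameter table describes `Disabled` as the default.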
