From 53c82e35f754e4af8241593045f433dc98c4b146 Mon Sep 17 00:00:00 2001 From: Laura Hinson Date: Tue, 16 Jun 2026 10:32:46 -0400 Subject: [PATCH] [OSDOCS-20232]: File cleanup for HCP --- hosted_control_planes/hcp-release-notes.adoc | 2 +- .../hosted-control-planes-release-notes.adoc | 178 ------------------ modules/hcp-release-notes-fixed-issues.adoc | 52 ----- modules/hcp-release-notes-new-features.adoc | 19 -- .../hcp-release-notes-notable-changes.adoc | 3 - snippets/hcp-snippet.adoc | 4 +- 6 files changed, 3 insertions(+), 255 deletions(-) delete mode 100644 hosted_control_planes/hosted-control-planes-release-notes.adoc diff --git a/hosted_control_planes/hcp-release-notes.adoc b/hosted_control_planes/hcp-release-notes.adoc index 093aec7b9595..27dabe1a72dc 100644 --- a/hosted_control_planes/hcp-release-notes.adoc +++ b/hosted_control_planes/hcp-release-notes.adoc @@ -6,7 +6,7 @@ include::_attributes/common-attributes.adoc[] toc::[] -With this release, {hcp} for {product-title} 4.22 is available. {hcp-capital} for {product-title} 4.22 supports {mce} version 2.17. +With this release, {hcp} for {product-title} {product-version} is available. {hcp-capital} for {product-title} {product-version} supports {mce} version {product-version}. // New features and enhancements include::modules/hcp-release-notes-new-features.adoc[leveloffset=+1] diff --git a/hosted_control_planes/hosted-control-planes-release-notes.adoc b/hosted_control_planes/hosted-control-planes-release-notes.adoc deleted file mode 100644 index 6e30c2a2b6f3..000000000000 --- a/hosted_control_planes/hosted-control-planes-release-notes.adoc +++ /dev/null @@ -1,178 +0,0 @@ -:_mod-docs-content-type: ASSEMBLY -[id="hosted-control-planes-release-notes"] -include::_attributes/common-attributes.adoc[] -= {hcp-capital} release notes -:context: hosted-control-planes-release-notes - -toc::[] - -Release notes contain information about new and deprecated features, changes, and known issues. 
- -[id="hcp-4-20-release-notes_{context}"] -== {hcp-capital} release notes for {product-title} 4.20 - -With this release, {hcp} for {product-title} 4.20 is available. {hcp-capital} for {product-title} 4.20 supports {mce} version 2.10. - -[id="hcp-4-20-new-features-and-enhancements_{context}"] -=== New features and enhancements - -[id="hcp-4-20-scale-up-only_{context}"] -==== Scaling up workloads in a hosted cluster - -You can now only scale up workloads by using the `ScaleUpOnly` behavior, without scaling down the workloads in your hosted cluster. For more information, see xref:../hosted_control_planes/hcp-machine-config.adoc#scale-up-autoscaler-hcp_hcp-machine-config[Scaling up workloads in a hosted cluster]. - -[id="hcp-4-20-scale-up-down_{context}"] -==== Scaling up and down workloads in a hosted cluster - -You can now scale up and down the workloads by using the `ScaleUpAndScaleDown` behavior in your hosted cluster. For more information, see xref:../hosted_control_planes/hcp-machine-config.adoc#scale-up-down-autoscaler-hcp_hcp-machine-config[Scaling up and down workloads in a hosted cluster]. - -[id="hcp-4-20-balance-ignored-labels_{context}"] -==== Balancing ignored labels in a hosted cluster - -After scaling up your node pools, you can now set `balancingIgnoredLabels` to evenly distribute the machines across node pools. For more information, see xref:../hosted_control_planes/hcp-machine-config.adoc#balance-ignored-labels-autoscaler-hcp_hcp-machine-config[Balancing ignored labels in a hosted cluster]. - -[id="hcp-4-20-priority-expander_{context}"] -==== Setting the priority expander in a hosted cluster - -You can now create high priority machines before low priority machines by configuring the priority expander in your hosted cluster. For more information, see xref:../hosted_control_planes/hcp-machine-config.adoc#priority-expander-autoscaler-hcp_hcp-machine-config[Setting the priority expander in a hosted cluster]. 
- -[id="hcp-4-20-ibm-z-disconnected_{context}"] -==== {hcp-capital} on {ibm-z-title} in a disconnected environment is Generally Available - -As of this release, {hcp} on {ibm-z-title} in a disconnected environment is a Generally Available feature. For more information, see xref:../hosted_control_planes/hcp-disconnected/disconnected-install-ibmz-hcp.adoc[Deploying {hcp} on {ibm-z-title} in a disconnected environment]. - -//[id="hcp-4-20-internal-subnets-hcp_{context}"] -//==== Configuring internal subnets for hosted clusters - -//In hosted clusters, you can configure internal IPv4 subnets that the OVN-Kubernetes network plugin uses to provide flexibility and avoid classless inter-domain routing (CIDR) conflicts. For more information, see xref:../hosted_control_planes/hcp-networking.adoc#hcp-custom-ovn-subnets_hcp-networking[Configuring internal OVN IPv4 subnets for hosted clusters]. - -[id="bug-fixes-hcp-rn-4-20_{context}"] -=== Bug fixes - -* Before this update, the SAN validation for custom certificates in `hc.spec.configuration.apiServer.servingCerts.namedCertificates` did not properly handle wildcard DNS patterns, such as `\*.example.com`. As a consequence, the wildcard DNS patterns in custom certificates could conflict with internal Kubernetes API server certificate SANs without being detected, leading to certificate validation failures and potential deployment issues. This release provides enhanced DNS SAN conflict detection to include RFC-compliant wildcard support, implementing bidirectional conflict validation that properly handles wildcard patterns such as `*.example.com` matching `sub.example.com`. As a result, wildcard DNS patterns are now properly validated, preventing certificate conflicts and ensuring more reliable hosted cluster deployments with wildcard certificate support. 
(link:https://issues.redhat.com/browse/OCPBUGS-60381[OCPBUGS-60381]) - -* Before this update, the Azure cloud provider did not set the default ping target, `HTTP:10256/healthz`, for the Azure load balancer. Instead, services of the `LoadBalancer` type that ran on Azure had a ping target of `TCP:30810`. As a consequence, the health probes for cluster-wide services were non-functional, and during upgrades, they experienced downtime. With this release, the `ClusterServiceLoadBalancerHealthProbeMode` property of the cloud configuration is set to `shared`. As a result, load balancers in Azure have the correct health check ping target, `HTTP:10256/healthz`, which points to `kube-proxy` health endpoints that run on nodes. (link:https://issues.redhat.com/browse/OCPBUGS-58031[OCPBUGS-58031]) - -* Before this update, the HyperShift Operator failed to clear the `user-ca-bundle` config map after the removal of the `additionalTrustBundle` parameter from the `HostedCluster` resource. As a consequence, the `user-ca-bundle` config map was not updated, resulting in failure to generate ignition payloads. With this release, the HyperShift Operator actively removes the `user-ca-bundle` config map from the control plane namespace when it is removed from the `HostedCluster` resource. As a result, the `user-ca-bundle` config map is now correctly cleared, enabling the generation of ignition payloads. (link:https://issues.redhat.com/browse/OCPBUGS-57336[OCPBUGS-57336]) - -* Before this update, if you tried to create a hosted cluster on AWS when the Kubernetes API server service publishing strategy was `LoadBalancer` with `PublicAndPrivate` endpoint access, a private router admitted the OAuth route even though the External DNS Operator did not register a DNS record. As a consequence, the private router did not properly resolve the route URL and the OAuth server was inaccessible. The Console Cluster Operator also failed to start, and the hosted cluster installation failed. 
With this release, a private router admits the OAuth route only when the external DNS is defined. Otherwise, the router admits the route in the management cluster. As a result, the OAuth route is accessible, the Console Cluster Operator properly starts, and the hosted cluster installation succeeds. (link:https://issues.redhat.com/browse/OCPBUGS-56914[OCPBUGS-56914]) - -* Before this release, when an IDMS or ICSP in the management OpenShift cluster defined a source that pointed to registry.redhat.io or registry.redhat.io/redhat, and the mirror registry did not contain the required OLM catalog images, provisioning for the `HostedCluster` resource stalled due to unauthorized image pulls. As a consequence, the `HostedCluster` resource was not deployed, and it remained blocked, where it could not pull essential catalog images from the mirrored registry. With this release, if a required image cannot be pulled due to authorization errors, the provisioning now explicitly fails. The logic for registry override is improved to allow matches on the root of the registry, such as registry.redhat.io, for OLM CatalogSource image resolution. A fallback mechanism is also introduced to use the original `ImageReference` if the registry override does not yield a working image. As a result, the `HostedCluster` resource can be deployed successfully, even in scenarios where the mirror registry lacks the required OLM catalog images, as the system correctly falls back to pulling from the original source when appropriate. (link:https://issues.redhat.com/browse/OCPBUGS-56492[OCPBUGS-56492]) - -* Before this update, the AWS Cloud Provider did not set the default ping target, `HTTP:10256/healthz`, for the AWS load balancer. For services of the `LoadBalancer` type that run on AWS, the load balancer object created in AWS had a ping target of `TCP:32518`. As a consequence, the health probes for cluster-wide services were non-functional, and during upgrades, those services were down. 
With this release, the `ClusterServiceLoadBalancerHealthProbeMode` property of the cloud configuration is set to `Shared`. This cloud configuration is passed to the AWS Cloud Provider. As a result, the load balancers in AWS have the correct health check ping target, `HTTP:10256/healthz`, which points to the `kube-proxy` health endpoints that are running on nodes. (link:https://issues.redhat.com/browse/OCPBUGS-56011[OCPBUGS-56011]) - -* Before this update, when you disabled the image registry capability by using the `--disable-cluster-capabilities` option, {hcp} still required you to configure a managed identity for the image registry. In this release, when the image registry is disabled, the image registry managed identity configuration is optional. (link:https://issues.redhat.com/browse/OCPBUGS-55892[OCPBUGS-55892]) - -* Before this update, the `ImageDigestMirrorSet` (IDMS) and `ImageContentSourcePolicy` (ICSP) resources from the management cluster were processed without considering that someone might specify only the root registry name as a mirror or source for image replacement. As a consequence, the IDMS and ICSP entries that used only the root registry name did not work as expected. In this release, the mirror replacement logic now correctly handles cases where only the root registry name is provided. As a result, the issue no longer occurs, and root registry mirror replacements are now supported. (link:https://issues.redhat.com/browse/OCPBUGS-54483[OCPBUGS-54483]) - -* Before this update, {hcp} did not correctly persist registry metadata and release image provider caches in the `HostedCluster` resource. As a consequence, caches for release and image metadata reset on `HostedCluster` controller reconciliation. This release introduces a common registry provider which is used by the `HostedCluster` resource to fix cache loss. This reduces the number of image pulls and network traffic, thus improving overall performance. 
(link:https://issues.redhat.com/browse/OCPBUGS-53259[OCPBUGS-53259]) - -* Before this update, when you configured an OIDC provider for a `HostedCluster` resource with an OIDC client that did not specify a client secret, the system automatically generated a default secret name. As a consequence, you could not configure OIDC public clients, which are not supposed to use secrets. This release fixes the issue. If no client secret is provided, no default secret name is generated, enabling proper support for public clients. (link:https://issues.redhat.com/browse/OCPBUGS-58149[OCPBUGS-58149]) - -* Before this update, multiple mirror images caused a hosted control plane payload error due to failed image lookup. As a consequence, users could not create hosted clusters. With this release, the hosted control plane payload now supports multiple mirrors, avoiding errors when a primary mirror is unavailable. As a result, users can create hosted clusters. (link:https://issues.redhat.com/browse/OCPBUGS-54720[OCPBUGS-54720]) - -* Before this update, when a hosted cluster was upgraded to multiple versions over time, the version history in the `HostedCluster` resource sometimes exceeded 10 entries. However, the API had a strict validation limit of 10 items maximum for the version history field. As a consequence, users could not edit or update their `HostedCluster` resources when the version history exceeded 10 entries. Operations such as adding annotations (for example, for cluster size overrides) or performing maintenance tasks like resizing request serving nodes failed with a validation error: "status.version.history: Too many: 11: must have at most 10 items". This error prevented ROSA SREs from performing critical maintenance operations that might impact customer API access. 
-+ -With this release, the maximum items validation constraint has been removed from the version history field in the `HostedCluster` API, allowing the history to grow beyond 10 entries without triggering validation errors. As a result, `HostedCluster` resources can now be edited and updated regardless of how many entries exist in the version history, so that administrators can perform necessary maintenance operations on clusters that have undergone multiple version upgrades. (link:https://issues.redhat.com/browse/OCPBUGS-58200[OCPBUGS-58200]) - -* Before this update, following a CLI refactoring, the `MarkPersistentFlagRequired` function stopped working correctly. The `--name` and `--pull-secret` flags, which are critical for cluster creation, were marked as required, but the validation was not being enforced. As a consequence, users could run the `hypershift create cluster` command without providing the required `--name` or `--pull-secret` flags, and the CLI would not immediately alert them that these required flags were missing. This could lead to misconfigured deployments and confusing error messages later in the process. -+ -This release adds an explicit validation in the `RawCreateOptions.Validate()` function to check for the presence of the `--name` and `--pull-secret` flags, returning clear error messages when either flag is missing. Additionally, the default "example" value is removed from the name field to ensure proper validation. As a result, when users attempt to create a cluster without the required `--name` or `--pull-secret` flags, they now receive immediate, clear error messages indicating which required flag is missing (for example, "Error: --name is required" or "Error: --pull-secret is required"), preventing misconfigured deployments and improving the user experience. 
(link:https://issues.redhat.com/browse/OCPBUGS-37323[OCPBUGS-37323]) - -* Before this update, a variable shadowing bug in the `GetSupportedOCPVersions()` function caused the `supportedVersions` variable to be incorrectly assigned using `:=` instead of `=`, creating a local variable that was immediately discarded rather than updating the intended outer scope variable. As a consequence, when users ran the `hypershift version` command with the HyperShift Operator deployed, the CLI would either display `` for the Server Version or panic with a "nil pointer dereference" error, preventing users from verifying the deployed HyperShift Operator version. -+ -This release corrects the variable assignment from `supportedVersions :=` to `supportedVersions =` in the `GetSupportedOCPVersions()` function to properly assign the config map to the outer scope variable, ensuring the supported versions data is correctly populated. As a result, the `hypershift version` command now correctly displays the Server Version (for example, "Server Version: f001510b35842df352d1ab55d961be3fdc2dae32") when the HyperShift Operator is deployed, so that users can verify the running operator version and supported {product-title} versions. (link:https://issues.redhat.com/browse/OCPBUGS-57316[OCPBUGS-57316]) - -* Before this update, the HyperShift Operator validated the Kubernetes API Server subject alternative names (SANs) in all cases. As a consequence, users sometimes experienced invalid API Server SANs during public key infrastructure (PKI) reconciliation. With this release, the Kubernetes API Server SANs are validated only if PKI reconciliation is not disabled. (link:https://issues.redhat.com/browse/OCPBUGS-56457[OCPBUGS-56457]) - -* Before this update, the shared ingress controller did not handle the `HostedCluster.Spec.KubeAPIServerDNSName` field, so custom kube-apiserver DNS names were not added to the router configuration. 
As a consequence, traffic destined for the kube-apiserver on a hosted control plane that used a custom DNS name (via `HostedCluster.Spec.KubeAPIServerDNSName`) was not routed correctly, preventing the `KubeAPIExternalName` feature from working with platforms that use shared ingress. -+ -This release adds handling for `HostedCluster.Spec.KubeAPIServerDNSName` in the shared ingress controller. When a hosted cluster specifies a custom kube-apiserver DNS name, the controller now automatically creates a route that directs traffic to the kube-apiserver service. As a result, traffic destined for custom kube-apiserver DNS names is now correctly routed by the shared ingress controller, enabling the `KubeAPIExternalName` feature to work on platforms that use shared ingress. (link:https://issues.redhat.com/browse/OCPBUGS-57790[OCPBUGS-57790]) - -[id="known-issues-hcp-rn-4-20_{context}"] -=== Known issues - -* If the annotation and the `ManagedCluster` resource name do not match, the {mce} console displays the cluster as `Pending import`. The cluster cannot be used by the {mce-short}. The same issue happens when there is no annotation and the `ManagedCluster` name does not match the `Infra-ID` value of the `HostedCluster` resource. - -* When you use the {mce} console to add a new node pool to an existing hosted cluster, the same version of {product-title} might appear more than once in the list of options. You can select any instance in the list for the version that you want. - -* When a node pool is scaled down to 0 workers, the list of hosts in the console still shows nodes in a `Ready` state. You can verify the number of nodes in two ways: - -** In the console, go to the node pool and verify that it has 0 nodes. 
-** On the command-line interface, run the following commands: - -*** Verify that 0 nodes are in the node pool by running the following command: -+ -[source,terminal] ----- -$ oc get nodepool -A ----- - -*** Verify that 0 nodes are in the cluster by running the following command: -+ -[source,terminal] ----- -$ oc get nodes --kubeconfig ----- - -*** Verify that 0 agents are reported as bound to the cluster by running the following command: -+ -[source,terminal] ----- -$ oc get agents -A ----- - -* When you create a hosted cluster in an environment that uses the dual-stack network, you might encounter pods stuck in the `ContainerCreating` state. This issue occurs because the `openshift-service-ca-operator` resource cannot generate the `metrics-tls` secret that the DNS pods need for DNS resolution. As a result, the pods cannot resolve the Kubernetes API server. To resolve this issue, configure the DNS server settings for a dual stack network. - -* If you created a hosted cluster in the same namespace as its managed cluster, detaching the managed hosted cluster deletes everything in the managed cluster namespace including the hosted cluster. The following situations can create a hosted cluster in the same namespace as its managed cluster: - -** You created a hosted cluster on the Agent platform through the {mce} console by using the default hosted cluster namespace. -** You created a hosted cluster through the command-line interface or API by specifying the hosted cluster namespace to be the same as the hosted cluster name. - -* When you use the console or API to specify an IPv6 address for the `spec.services.servicePublishingStrategy.nodePort.address` field of a hosted cluster, a full IPv6 address with 8 hextets is required. For example, instead of specifying `2620:52:0:1306::30`, you need to specify `2620:52:0:1306:0:0:0:30`. 
- -* In {hcp} on {VirtProductName}, if you store all hosted cluster information in a shared namespace and then back up and restore a hosted cluster, you might unintentionally change other hosted clusters. To avoid this issue, back up and restore only hosted clusters that use labels, or avoid storing all hosted cluster information in a shared namespace. - -[id="hcp-tech-preview-features_{context}"] -=== General Availability and Technology Preview features - -Some features in this release are currently in Technology Preview. These experimental features are not intended for production use. For more information about the scope of support for these features, see link:https://access.redhat.com/support/offerings/techpreview[Technology Preview Features Support Scope] on the Red{nbsp}Hat Customer Portal. - -[IMPORTANT] -==== -For {ibm-power-title} and {ibm-z-title}, the following exceptions apply: - -* For version 4.20 and later, you must run the control plane on machine types that are based on 64-bit x86 architecture or s390x architecture, and node pools on {ibm-power-title} or {ibm-z-title}. -* For version 4.19 and earlier, you must run the control plane on machine types that are based on 64-bit x86 architecture, and node pools on {ibm-power-title} or {ibm-z-title}. 
-==== - -.{hcp-capital} GA and TP tracker -[cols="4,1,1,1",options="header"] -|=== -|Feature |4.18 |4.19 |4.20 - -|{hcp-capital} for {product-title} using non-bare-metal agent machines -|Technology Preview -|Technology Preview -|Technology Preview - -|{hcp-capital} for {product-title} on {rh-openstack} -|Developer Preview -|Technology Preview -|Technology Preview - -|Custom taints and tolerations -|Technology Preview -|Technology Preview -|Technology Preview - -|NVIDIA GPU devices on {hcp} for {VirtProductName} -|Technology Preview -|Technology Preview -|Technology Preview - -|{hcp-capital} on {ibm-z-title} in a disconnected environment -|Technology Preview -|Technology Preview -|Generally Available -|=== \ No newline at end of file diff --git a/modules/hcp-release-notes-fixed-issues.adoc b/modules/hcp-release-notes-fixed-issues.adoc index b31572696400..bae4c2f1f50d 100644 --- a/modules/hcp-release-notes-fixed-issues.adoc +++ b/modules/hcp-release-notes-fixed-issues.adoc @@ -1,55 +1,3 @@ :_mod-docs-content-type: REFERENCE [id="hcp-release-notes-fixed-issues_{context}"] = Fixed issues - -The following issues are fixed for this release: - -* Before this update, services in the hosted control plane namespace, such as the `aws-ebs-csi-driver-controller-metrics` service, used the `service-ca` annotation (`service.beta.openshift.io/serving-cert-secret-name`) to generate TLS certificates. As a consequence, control plane services incorrectly depended on the OpenShift Service CA Operator in the hosted cluster for certificate generation, which weakened the security boundary between the control plane and the hosted cluster. With this release, the Control Plane Operator creates and manages TLS certificates for the `aws-ebs-csi-driver-controller-metrics` service directly, signed by the hosted control plane root CA, eliminating the dependency on the OpenShift Service CA Operator. 
The implementation checks for `service-ca` annotations to ensure a smooth upgrade path from older deployments. As a result, control plane isolation and certificate lifecycle management are improved. (link:https://redhat.atlassian.net/browse/OCPBUGS-34662[OCPBUGS-34662]) - -* Before this update, the HyperShift Operator metrics collector validated the proxy CA bundle certificates on every metrics collection cycle. As a consequence, when a certificate in the CA bundle expired, repeated `proxy ca bundle is invalid` messages were posted in the HyperShift Operator logs without identifying the hosted cluster, making it difficult to diagnose the cluster with the invalid proxy CA certificate. With this release, certificate validation is moved to the `HostedCluster` reconcile loop, and a new `ValidProxyConfiguration` condition is added to the `HostedCluster` API. The metrics collector now reads the validation result from the condition instead of directly performing validation. As a result, the metrics collector no longer posts repeated messages in the logs, and affected clusters can be identified. (link:https://redhat.atlassian.net/browse/OCPBUGS-55151[OCPBUGS-55151]) - -* Before this update, KubeVirt virtual machines (VMs) used in node pools were not configured with an external eviction strategy. As a consequence, the Cluster API Provider for KubeVirt controller did not detect eviction requests during node drains on the underlying infrastructure cluster. Node drains were not coordinated properly, the hosted control plane function was disrupted, and pods failed when node pool VMs were shut down. With this release, KubeVirt VMs are configured with the external eviction strategy in the VM template specification. As a result, the Cluster API Provider for KubeVirt can detect eviction events and coordinate the draining of hosted cluster nodes during infrastructure operations. 
For VMs that support live migration, the Cluster API Provider for KubeVirt skips the drain process and allows the VMs to be migrated without disruption. (link:https://redhat.atlassian.net/browse/OCPBUGS-58397[OCPBUGS-58397]) - -* Before this update, when you removed the `additionalTrustBundle` field from the `HostedCluster` specification, the `additionalTrustBundle` certificate was not removed from the `user-ca-bundle` config map. As a consequence, it appeared that the `additionalTrustBundle` certificate was not removed from the hosted clusters. With this release, the reconciliation logic ensures that the `user-ca-bundle` config map is deleted from the hosted cluster when you delete the `additionalTrustBundle` field. As a result, when you delete the `additionalTrustBundle` field from the `HostedCluster` specification, the certificate is removed, improving security and consistency. (link:https://redhat.atlassian.net/browse/OCPBUGS-60707[OCPBUGS-60707]) - -* Before this update, the control plane deployments related to Cluster API (`cluster-api` and `capi-provider`) in the hosted control plane namespace lacked finalizers. As a consequence, if these deployments were deleted before the `HostedCluster` resource was deleted, the controller pods would stop running before they could process finalizers on their managed Cluster API resources (`Machine` objects, `MachineDeployment` objects, platform-specific infrastructure objects), leading to orphaned cloud resources such as EC2 instances, VMs, disks, and load balancers. With this update, the HyperShift Operator adds a finalizer, `hypershift.openshift.io/component-finalizer`, to the `cluster-api` and `capi-provider` deployments. The finalizer is only removed after the underlying infrastructure resources have been deleted during `HostedCluster` teardown. 
As a result, accidental deletion of these deployments is blocked until Cluster API resources are properly cleaned up, preventing orphaned cloud resources. (link:https://redhat.atlassian.net/browse/OCPBUGS-63452[OCPBUGS-63452]) - -* Before this update, when the `ValidAWSIdentityProvider` condition was copied from the control plane to the hosted cluster, the logic preserved the earlier status if the new condition was `Unknown`. As a consequence, when the earlier condition was `True` and the new condition was `Unknown`, the update was skipped. With this release, the condition on the hosted cluster correctly reflects the current health of the AWS Identity Provider referenced in the cloud credentials. (link:https://redhat.atlassian.net/browse/OCPBUGS-66325[OCPBUGS-66325]) - -* Before this update, the Cluster Network Operator failed to recognize KubeVirt as a supported platform for {hcp} with dual-stack networking. As a consequence, on deployments of {hcp} on OpenShift Virtualization with dual-stack networking, the Cluster Network Operator deployment failed. With this release, the Cluster Network Operator recognizes KubeVirt as a supported platform for {hcp} with dual-stack networking. As a result, deploying {hcp} on OpenShift Virtualization with IPv4/IPv6 dual-stack networking succeeds. (link:https://redhat.atlassian.net/browse/OCPBUGS-66417[OCPBUGS-66417]) - -* Before this update, the cluster autoscaler did not include the `hypershift.openshift.io/nodepool-globalps-enabled` label in its `--balancing-ignore-label` list. As a consequence, when the autoscaler balanced node groups, it treated nodes with and without this label as belonging to different groups, causing uneven scaling across nodes in the same `NodePool` object. With this update, the `hypershift.openshift.io/nodepool-globalps-enabled` label is added to the balancing ignore list of the autoscaler. 
As a result, the autoscaler distributes new nodes evenly across node groups regardless of the Global Pull Secret eligibility label. (link:https://redhat.atlassian.net/browse/OCPBUGS-73817[OCPBUGS-73817]) - -* Before this update, when you created a hosted cluster that used a `NodePort` publishing strategy, specifying a port outside the Kubernetes service node port range, such as `10000`, was silently accepted during cluster creation. As a consequence, the cluster installation got stuck with only 3 pods in the hosted cluster namespace, because the Control Plane Operator rejected the port for being outside the acceptable range of `30000` - `32767`, causing a late failure after resources were already provisioned. With this release, early validation is added for the `NodePort.Port` value against the configured `ServiceNodePortRange` parameter of the cluster. Invalid values are rejected upfront with a clear message indicating the allowed range. As a result, you receive an immediate validation error when you specify a `NodePort` that is outside the acceptable range, and avoid stuck cluster installations. (link:https://redhat.atlassian.net/browse/OCPBUGS-65824[OCPBUGS-65842]) - -* Before this update, the `hypershift.openshift.io/nodepool-globalps-enabled` label was applied to nodes by the Hosted Cluster Config Operator `globalps` controller, which discovered eligible nodes by querying `MachineSet` objects and `Machine` objects during its periodic reconciliation. As a consequence, when a new `Replace` node joined the cluster, the `global-pull-secret-syncer` DaemonSet pod could not schedule on it until the next reconcile cycle of the `globalps` controller, causing a delay of up to 15 minutes. With this update, the label is set directly on Cluster API `Machine` objects by the HyperShift Operator during `MachineDeployment` reconciliation, so it propagates to nodes at creation time by using the Hosted Cluster Config Operator Node controller. 
As a result, new `Replace` nodes on {aws-short} are immediately eligible for the `global-pull-secret-syncer` DaemonSet, eliminating the scheduling delay. (link:https://redhat.atlassian.net/browse/OCPBUGS-77966[OCPBUGS-77966]) - -* Before this update, the ignition server deployment computed registry overrides by performing live HTTP registry connectivity checks (`LookupMappedImage/GetMetadata`) during every Control Plane Operator reconciliation. As a consequence, network conditions caused the `--registry-overrides` argument and `MIRRORED_RELEASE_IMAGE` environment variable to return different values on each reconciliation, triggering constant deployment regenerations and pod restarts. With this update, the ignition server deployment uses the static registry overrides from the `HostedCluster` specification instead of performing live registry lookups at deploy time. The ignition server already resolves per-image mirrors at runtime by using its own override logic. As a result, ignition server deployments remain stable with consistent configuration, eliminating unnecessary pod restarts. (link:https://redhat.atlassian.net/browse/OCPBUGS-60185[OCPBUGS-60185]) - -* Before this update, when you created a hosted cluster that used a `NodePort` publishing strategy, specifying a port outside the Kubernetes service node port range, such as `10000`, was silently accepted during cluster creation. As a consequence, the cluster installation got stuck with only 3 pods in the hosted cluster namespace, because the Control Plane Operator rejected the port for being outside the acceptable range of `30000` - `32767`, causing a late failure after resources were already provisioned. With this release, early validation is added for the `NodePort.Port` value against the configured `ServiceNodePortRange` parameter of the cluster. Invalid values are rejected upfront with a clear message indicating the allowed range. 
As a result, you receive an immediate validation error when you specify a `NodePort` that is outside the acceptable range, and avoid stuck cluster installations. (link:https://redhat.atlassian.net/browse/OCPBUGS-65824[OCPBUGS-65842]) - -* Before this update, when the `allowedCIDRBlocks` parameter was removed from the `HostedCluster` specification, the `LoadBalancerSourceRanges` field on the external router `LoadBalancer` service was not cleared. As a consequence, stale Classless Inter-Domain Routing (CIDR) restrictions remained on the router service after the administrator removed the access restrictions, continuing to block traffic that should have been allowed. With this update, the reconciliation logic always sets the `LoadBalancerSourceRanges` field on the external router service to match the current `allowedCIDRBlocks` value, including clearing it when the list is empty. As a result, removing the `allowedCIDRBlocks` parameter from the `HostedCluster` specification correctly removes the CIDR restrictions from the router service. (link:https://redhat.atlassian.net/browse/OCPBUGS-69761[OCPBUGS-69761]) - -* Before this update, the `HostedControlPlane` controller set the `HostedControlPlaneAvailable` condition to `True` after the Kubernetes API server was reachable, without verifying that all control plane components had finished rolling out. As a consequence, customers could interact with the cluster before components such as the `kube-controller-manager`, `oauth-server`, or `kube-scheduler` were fully ready, which could lead to failures or unexpected behavior. With this update, the controller now lists all control plane component resources in the hosted control plane namespace and verifies that each has its `Available` condition set to `True` before setting the `HostedControlPlaneAvailable` condition to `True`. If any components are not yet available, the condition reports the `ComponentsNotAvailable` reason with a message listing the pending components. 
After the cluster reaches the available state, later component rollouts, such as during upgrades, do not flip the condition back to `False`. As a result, the hosted control plane now only reports `Available=True` after all control plane components have completed their initial rollout, ensuring a more reliable user experience. (link:https://redhat.atlassian.net/browse/OCPBUGS-74648[OCPBUGS-74648]) - -* Before this update, the Hosted Cluster Config Operator contained logic that modified the `openshift-controller-manager-config` config map to disable the `serviceaccount-pull-secrets` controller when the `managementState` parameter of the image registry was set to `Removed`. In {product-title} 4.20 and later, Control Plane Operator v2 started managing this config map, but the Hosted Cluster Config Operator continued modifying it on every reconciliation cycle. As a consequence, the `openshift-controller-manager-config` config map was updated by Hosted Cluster Config Operator every minute, which triggered the `openshift-controller-manager` file observer to detect changes and restart pods. This behavior caused constant `openshift-controller-manager` pod restarts. With this release, the OpenShift Controller Manager config update logic is removed from the Hosted Cluster Config Operator because Control Plane Operator v2 manages the `openshift-controller-manager-config` config map. As a result, the `openshift-controller-manager` pods no longer experience unnecessary restarts. (link:https://redhat.atlassian.net/browse/OCPBUGS-74931[OCPBUGS-74931]) - -* Before this update, during the backup and restore process with OADP, the token secret was deleted before the `NodePool` object was restored. Then, the `NodePool` controller created a token secret without the `ignition-reached` annotation. Because nodes were already running, they did not contact the ignition endpoint again, so the annotation was never set back. 
As a consequence, the `ReachedIgnitionEndpoint` condition stayed `False`, blocking machine health check creation and disabling auto-repair for the restored node pools. With this release, when the `HostedCluster` object has the `hypershift.openshift.io/restored-from-backup` annotation set by the OADP plugin, the token secret is created with the `ignition-reached=True` parameter, preserving the condition across the restore process. As a result, after a backup and restore process, node pools correctly report `ReachedIgnitionEndpoint=True` so that the machine health check and auto-repair work as expected. (link:https://redhat.atlassian.net/browse/OCPBUGS-77621[OCPBUGS-77621]) - -* Before this update, when deploying hosted clusters with a 4.21 or later payload, the HyperShift Operator used hard-coded `quay.io` image references for the Cluster API manager and platform-specific Cluster API provider containers. These hard-coded images bypassed the standard release payload image lookup, which respects `ImageContentSourcePolicies` (ICSPs) and `ImageDigestMirrorSets` (IDMSs). As a consequence, in disconnected or mirrored environments, Cluster API images were always pulled directly from `quay.io` even when registry overrides were configured, causing image pull failures and preventing cluster creation. With this update, the backward-compatible Cluster API image references are resolved by looking up the component from a pinned 4.20.10 release payload through the standard release image provider, which correctly follows registry override configuration. As a result, Cluster API images are pulled from the correct mirror registry in disconnected environments. For this fix to work, the 4.20.10 release payload from the `quay.io/openshift-release-dev/ocp-release:4.20.10-multi` image must be mirrored to the target mirror registry. 
(link:https://redhat.atlassian.net/browse/OCPBUGS-74247[OCPBUGS-74247]) - -* Before this update, requests from the Kubernetes API server bootstrap container were denied by a validating admission policy that restricts feature gate changes to a specific user. As a consequence, the bootstrap container was unable to apply feature gate changes, causing control plane issues. With this release, a dedicated identity is created for the Kubernetes API server bootstrap container and is allow-listed in the policy. As a result, the bootstrap container can apply feature gate changes without being denied by the validating admission policy. (link:https://redhat.atlassian.net/browse/OCPBUGS-50603[OCPBUGS-50603]) - -* Before this update, when a predicate of a Control Plane Operator v2 component evaluated to `false`, the framework tried to look up and clean up the associated resource by using the cached client. For resource types not installed on the management cluster, such as the `SecretProviderClass` custom resource definition of the Secrets Store CSI driver, this caused the cached client to create an informer that retried list and watch actions indefinitely, blocking all control plane reconciliation. As a consequence, hosted cluster creation failed on management clusters that did not have the Secrets Store CSI driver custom resource definition installed. With this update, the Control Plane Operator probes whether a resource type is accessible on the management cluster before trying to interact with it. If the custom resource definition is not installed or the operator lacks role-based access permission, the operation is skipped gracefully and the result is cached. As a result, hosted cluster creation succeeds even when optional custom resource definitions such as the Secrets Store CSI driver are not present on the management cluster. 
(link:https://redhat.atlassian.net/browse/OCPBUGS-65687[OCPBUGS-65687]) - -* Before this update, when using a custom API server DNS name with external DNS, the `kubeconfig` secret contained an incorrect port. As a consequence, connections to the API server failed with reset errors. With this update, the `kubeconfig` uses the correct port for the configured DNS setup. As a result, external DNS connections work as expected. (link:https://redhat.atlassian.net/browse/OCPBUGS-72258[OCPBUGS-72258]) - -* Before this update, a race condition in `VolumeSnapshot` processing where a snapshot was deleted between listing and retrieving was treated as an unrecoverable error, ending the processing of remaining snapshots. As a consequence, intermittent backup failures (about 25% of scheduled backups) were marked as `PartiallyFailed` with missing etcd PVC data. With this release, deleted snapshots are gracefully skipped instead of treated as unrecoverable errors, allowing the remaining snapshots to be processed normally. As a result, backups are completed successfully even when snapshot cleanup races with plugin processing. (link:https://redhat.atlassian.net/browse/OCPBUGS-75913[OCPBUGS-75913]) - -* Before this update, when the scale-from-zero feature was enabled in {aws-short} and a node pool used the `InPlace` node upgrade type with autoscaling set to `min=0`, the scale-from-zero implementation did not support the `InPlace` upgrade strategy. The original implementation used a machine deployment controller approach that only worked with the `Replace` upgrade strategy. As a consequence, new workloads did not trigger node pool scale-up from zero when using the `InPlace` upgrade type, preventing nodes from being created even when pods were pending. With this release, the scale-from-zero implementation uses a generic provider pattern that works with all upgrade types. 
As a result, node pools that use the `InPlace` upgrade type can scale up from zero when workload demands require additional capacity. The autoscaler correctly provisions nodes regardless of the upgrade strategy. (link:https://redhat.atlassian.net/browse/OCPBUGS-70320[OCPBUGS-70320]) - -* Before this update, a race condition in the `globalps` controller skipped labeling new nodes, causing a delay in `global-pull-secret-syncer` pod scheduling. As a consequence, users experienced image pull failures from private registries on new nodes. With this release, the scheduling delay is resolved, fixing the race condition in the Hosted Cluster Config Operator. As a result, the `global-pull-secret-syncer` pod now schedules immediately on new nodes, ensuring timely access to private images. (link:https://redhat.atlassian.net/browse/OCPBUGS-77254[OCPBUGS-77254]) - -* Before this update, the `IsCloudAPI` method for the `Konnectivity` proxy did not include {aws-first} ISO region domain suffixes, such as `.c2s.ic.gov`, `.hci.ic.gov`, or `.sc2s.sgov.gov` in its cloud API detection lists. As a consequence, the Ingress Operator could not add {aws-short} ISO domains to the `NO_PROXY` list, blocking direct communication with endpoints. With this release, the {aws-short} ISO suffixes are added to the `IsCloudAPI` detection list. As a result, the `Konnectivity` proxy correctly identifies the {aws-short} ISO region endpoints as cloud APIs, so that the Ingress Operator can route traffic to those domains directly. (link:https://redhat.atlassian.net/browse/OCPBUGS-85779[OCPBUGS-85779]) - -* Before this update, when you deployed a hosted cluster on OpenShift Virtualization with external infrastructure, the `virt-launcher` network policy was not created on the infrastructure cluster where the KubeVirt `virt-launcher` pods and virtual machines (VMs) run. 
As a consequence, the KubeVirt VMs had unrestricted network access to all pods and services on the infrastructure cluster, breaking tenant isolation. With this release, the `virt-launcher` network policy is created with CIDR-based egress restrictions. As a result, multitenant isolation is no longer compromised. (link:https://redhat.atlassian.net/browse/OCPBUGS-78575[OCPBUGS-78575]) diff --git a/modules/hcp-release-notes-new-features.adoc b/modules/hcp-release-notes-new-features.adoc index e97c74c7a23b..9b13841c4470 100644 --- a/modules/hcp-release-notes-new-features.adoc +++ b/modules/hcp-release-notes-new-features.adoc @@ -12,22 +12,3 @@ Item description:: Detailed information. //// -Monitor connectivity from the data plane to the control plane:: -+ -In this release, you can monitor connectivity from the data plane to the control plane by using the `ControlPlaneConnectionAvailable` condition. For more information, see xref:../hosted_control_planes/hcp-observability.adoc#hcp-connect-control-plane_hcp-observability[Connectivity monitoring from the data plane to the control plane]. - -Implement network segmentation for hosted clusters:: -+ -In this release, you can configure network isolation for hosted clusters with container-based isolation, VM-based isolation, or physical isolation. For more information, see xref:../hosted_control_planes/hcp-networking.adoc#hcp-isolation-overview_hcp-networking[Network isolation for hosted clusters]. - -Enable Amazon Spot Instance support:: -+ -In this release, you can enable Amazon Spot Instance support for compute nodes to reduce cloud infrastructure costs. Amazon Spot Instances are suitable for hosted cluster workloads that are fault-tolerant, stateless, and flexible. For more information, see xref:../hosted_control_planes/hcp-manage/hcp-manage-aws.adoc#hcp-aws-spot-instance_hcp-managing-aws[Amazon Spot Instance support for node pools]. 
- -Back up etcd data for {hcp} by using the etcd snapshot method (Technology Preview):: -+ -As an alternative for the default volume snapshot approach, you can use the etcd snapshot approach to back up and restore etcd data for {hcp}. The etcd snapshot method is a Technology Preview feature. For more information, see xref:../hosted_control_planes/hcp_high_availability/hcp-backup-etcd-snapshot.adoc#hcp-backup-etcd-snapshot[Backing up etcd data for {hcp} by using the etcd snapshot method]. - -Deploy self-managed {hcp} on {azure-first} (Technology Preview):: -+ -In this release, you can create public or private hosted clusters on {azure-short} as a Technology Preview feature. For more information, see xref:../hosted_control_planes/hcp-deploy/hcp-deploy-azure.adoc#hcp-deploy-azure[Deploying {hcp} on {azure-short}]. diff --git a/modules/hcp-release-notes-notable-changes.adoc b/modules/hcp-release-notes-notable-changes.adoc index 361d03741116..7b295d8bb1c5 100644 --- a/modules/hcp-release-notes-notable-changes.adoc +++ b/modules/hcp-release-notes-notable-changes.adoc @@ -9,6 +9,3 @@ [role="_abstract"] Review the following notable technical changes introduced in this release. -{hcp-capital} on {ibm-z-title} in a disconnected environment now available:: -+ -With this release, {hcp} on {ibm-z-name} in a disconnected environment is a General Availability feature. In earlier versions, it was a Technology Preview feature. diff --git a/snippets/hcp-snippet.adoc b/snippets/hcp-snippet.adoc index fbebfc671e8f..3cb6346d6fc9 100644 --- a/snippets/hcp-snippet.adoc +++ b/snippets/hcp-snippet.adoc @@ -1,10 +1,10 @@ // Text snippet included in the following assemblies: // -// * hosted_control_planes/hosted-control-planes-release-notes.adoc +// * hosted_control_planes/hcp-release-notes.adoc :_mod-docs-content-type: SNIPPET [IMPORTANT] ==== -{hcp-capital} for {product-title} {product-version} is planned to be available with an upcoming release of {mce-short}. 
In the meantime, see the link:https://docs.redhat.com/en/documentation/openshift_container_platform/4.20/html/hosted_control_planes/hosted-control-planes-release-notes-1[{hcp} documentation for {product-title} 4.20]. +{hcp-capital} for {product-title} {product-version} is planned to be available with an upcoming release of {mce-short}. In the meantime, see the link:https://docs.redhat.com/en/documentation/openshift_container_platform/4.21/html/hosted_control_planes/hcp-release-notes[{hcp} documentation for {product-title} 4.21]. ==== \ No newline at end of file