USHIFT-6810: Add C2CC upgrade test for RHEL 9.8 to RHEL 10.2 #6894
base: main
Changes from all commits
bc9201e
393d045
d5bc734
777fb0e
0be2317
618f5d4
b3c614b
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -43,3 +43,10 @@ RUN firewall-offline-cmd --zone=public --add-port=22/tcp && \ | |
| # when upgrading from older ostree commits to bootc container layers | ||
| RUN mkdir -p /usr/lib/systemd/system/ovsdb-server.service.d | ||
| COPY --chmod=644 ./bootc-images/microshift-ovsdb-ownership.conf /usr/lib/systemd/system/ovsdb-server.service.d/microshift-ovsdb-ownership.conf | ||
|
|
||
| # Fix SSH host key permissions for cross-version upgrades (RHEL 9 uses 0640 | ||
| # with ssh_keys group, RHEL 10 sshd requires 0600) | ||
| # Similar workaround is used in microshift-test-agent.sh | ||
| RUN mkdir -p /usr/lib/systemd/system/sshd.service.d && \ | ||
| printf '[Service]\nExecStartPre=/bin/bash -c "chmod 600 /etc/ssh/ssh_host_*_key 2>/dev/null || true"\n' \ | ||
| > /usr/lib/systemd/system/sshd.service.d/fix-hostkey-perms.conf | ||
|
Comment on lines +47 to +52
Member
We have |
||
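The sshd drop-in added in this hunk boils down to a single `chmod` over the private host keys before sshd starts. The following is a minimal sketch of that step as a standalone shell function. The helper name `fix_hostkey_perms` and its directory argument are assumptions made for illustration (the PR hard-codes `/etc/ssh` inside `ExecStartPre`), so the logic can be tried against a scratch directory.

```shell
#!/bin/bash
# Hypothetical helper (not part of the PR): tighten private SSH host keys
# to 0600, mirroring the ExecStartPre command in the sshd drop-in.
fix_hostkey_perms() {
    local dir="${1:-/etc/ssh}"
    # The glob matches private keys only: public keys end in ".pub",
    # not in "_key". Errors are ignored on purpose so that a missing
    # key never blocks sshd from starting.
    chmod 600 "${dir}"/ssh_host_*_key 2>/dev/null || true
}
```

Calling `fix_hostkey_perms` with no argument would act on `/etc/ssh`, which is what the drop-in does on every sshd start.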
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -12,6 +12,16 @@ function enable_eus_repositories() { | |
| "rhel-${vmajor}-for-$(uname -m)-appstream-eus-rpms" | ||
| } | ||
|
|
||
| function enable_ga_repositories() { | ||
| local -r vmajor="$(awk -F. '{print $1}' <<< "${VERSION_ID}")" | ||
|
|
||
| dnf config-manager --set-disabled '*' | ||
| dnf config-manager --set-enabled \ | ||
| "rhel-${vmajor}-for-$(uname -m)-baseos-rpms" \ | ||
| "rhel-${vmajor}-for-$(uname -m)-appstream-rpms" | ||
| } | ||
|
|
||
|
|
||
| # Lock the OS release version to prevent the installation of packages belonging | ||
| # to a newer OS release | ||
| source /etc/os-release | ||
|
|
@@ -24,6 +34,10 @@ while [ $# -gt 0 ] ; do | |
| enable_eus_repositories | ||
| shift | ||
| ;; | ||
| --enable-ga) | ||
|
Contributor
where this new
Author
Same as below: this came from a previous change that used the 10.2 GA repos (which we discussed offline). I thought it was worth keeping, but I'm happy to remove it if it's out of scope for this PR.
Contributor
If this is not needed for this PR, I'd prefer to remove it. |
||
| enable_ga_repositories | ||
| shift | ||
| ;; | ||
|
Comment on lines +37 to +40
Member
Does this belong to this PR?
Author
This came from a previous change that used the 10.2 GA repos (which we discussed offline). I thought it was worth keeping, but I'm happy to remove it if it's out of scope for this PR.
Member
Let's move unrelated stuff to another PR. It would be good to coordinate it with the effort to re-enable GA repos, though. |
||
| --disable-all) | ||
| dnf config-manager --set-disabled '*' | ||
| shift | ||
|
|
||
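For reviewers weighing whether `--enable-ga` belongs here, it may help to see exactly which repository IDs `enable_ga_repositories` would pass to `dnf config-manager --set-enabled`. The sketch below only prints them and never calls dnf. The helper name `ga_repo_ids` and its explicit version and architecture arguments are assumptions made for illustration. It uses `${version_id%%.*}` to take the major version, which should give the same result as the `awk -F.` call in the diff.

```shell
#!/bin/bash
# Hypothetical helper (not part of the PR): print the GA repository IDs that
# enable_ga_repositories would enable, without touching dnf.
ga_repo_ids() {
    local version_id="$1"            # e.g. VERSION_ID from /etc/os-release
    local arch="${2:-$(uname -m)}"   # defaults to the running architecture
    local vmajor="${version_id%%.*}" # "9.8" -> "9", "10.2" -> "10"

    printf '%s\n' \
        "rhel-${vmajor}-for-${arch}-baseos-rpms" \
        "rhel-${vmajor}-for-${arch}-appstream-rpms"
}
```

For example, `ga_repo_ids 10.2 x86_64` prints the baseos and appstream IDs for RHEL 10 on x86_64, one per line.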
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -77,7 +77,7 @@ Teardown All Remote Clusters | |
| END | ||
| VAR @{C2CC_REMOTE_ALIASES}= @{EMPTY} scope=SUITE | ||
| ${local_conn}= Get From Dictionary ${C2CC_SSH_IDS} cluster-a | ||
| SSHLibrary.Switch Connection ${local_conn} | ||
| ${status}= Run Keyword And Return Status SSHLibrary.Switch Connection ${local_conn} | ||
|
Member
Why do we need this? |
||
|
|
||
| Command On Cluster | ||
| [Documentation] Run a shell command on the specified cluster via SSH. | ||
|
|
@@ -384,6 +384,26 @@ DNS Lookup Should Succeed | |
| ... oc exec curl-pod -n ${NAMESPACES}[${alias}] -- getent hosts ${fqdn} | ||
| Should Not Be Empty ${stdout} | ||
|
|
||
| Verify RemoteCluster State | ||
| [Documentation] Check that all RemoteCluster CRs on this cluster have the expected state. | ||
| [Arguments] ${alias} ${expected_state} | ||
| ${stdout}= Oc On Cluster ${alias} | ||
| ... oc get remoteclusters.microshift.io -o jsonpath='{.items[*].status.state}' | ||
| Should Not Be Empty ${stdout} | ||
| @{states}= Split String ${stdout} | ||
| ${count}= Get Length ${states} | ||
| Should Be Equal As Integers ${count} 2 Expected 2 RemoteCluster states, got ${count} | ||
| FOR ${state} IN @{states} | ||
| Should Be Equal As Strings ${state} ${expected_state} | ||
| END | ||
|
|
||
| Verify All RemoteClusters Healthy | ||
| [Documentation] Wait for all RemoteCluster CRs on all clusters to report Healthy. | ||
| FOR ${alias} IN cluster-a cluster-b cluster-c | ||
| Wait Until Keyword Succeeds 3m 10s | ||
| ... Verify RemoteCluster State ${alias} Healthy | ||
| END | ||
|
|
||
| Curl DNS From Cluster | ||
| [Documentation] Curl a service by DNS name from curl-pod on the given cluster. | ||
| [Arguments] ${alias} ${fqdn} ${port} | ||
|
|
||
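The `Verify RemoteCluster State` keyword moved into this resource file makes two checks on the jsonpath output: there must be exactly 2 states, and each must equal the expected one. A rough shell equivalent of that logic is sketched below. The function name `all_states_match` and its argument order are assumptions made for illustration. The count is a parameter here, whereas the keyword hard-codes 2.

```shell
#!/bin/bash
# Hypothetical helper (not part of the PR): succeed only if the
# whitespace-separated state list has exactly expected_count entries
# and every entry equals expected_state.
all_states_match() {
    local expected_state="$1"
    local expected_count="$2"
    local states="$3"
    local count=0
    local state

    # Unquoted on purpose: split on whitespace, similar to Split String.
    for state in ${states}; do
        [ "${state}" = "${expected_state}" ] || return 1
        count=$((count + 1))
    done
    [ "${count}" -eq "${expected_count}" ]
}
```

In the Robot keyword the list comes from `oc get remoteclusters.microshift.io -o jsonpath='{.items[*].status.state}'`, so an empty result fails the check just as `Should Not Be Empty` does.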
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,25 @@ | ||
| #!/bin/bash | ||
|
|
||
| # Sourced from scenario.sh and uses functions defined there. | ||
|
|
||
| # shellcheck source=test/bin/c2cc_common.sh | ||
| source "${SCRIPTDIR}/c2cc_common.sh" | ||
|
|
||
| export TEST_RANDOMIZATION=none | ||
| export TEST_EXECUTION_TIMEOUT=60m | ||
|
|
||
| C2CC_TARGET_REF=rhel102-bootc-source | ||
|
|
||
| scenario_create_vms() { | ||
| c2cc_create_vms rhel98-bootc-source rhel98-bootc | ||
| } | ||
|
|
||
| scenario_remove_vms() { | ||
| c2cc_remove_vms | ||
| } | ||
|
|
||
| scenario_run_tests() { | ||
| # shellcheck disable=SC2119 | ||
| configure_c2cc_hosts | ||
| c2cc_run_tests "suites/upgrade/upgrade-c2cc.robot" | ||
| } |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1 @@ | ||
| ../../c2cc/el98-src@el102-src@c2cc-upgrade-ok.sh | ||
|
Member
I think we can drop this one |
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -151,19 +151,6 @@ Verify Probe Service ClusterIP | |
| ... oc get service ${PROBE_DEPLOYMENT} -n ${C2CC_NAMESPACE} -o jsonpath='{.spec.clusterIP}' | ||
| Should Be Equal As Strings ${actual_ip} ${expected_ip} strip_spaces=True | ||
|
|
||
| Verify RemoteCluster State | ||
| [Documentation] Check that all RemoteCluster CRs on this cluster have the expected state. | ||
| [Arguments] ${alias} ${expected_state} | ||
| ${stdout}= Oc On Cluster ${alias} | ||
| ... oc get remoteclusters.microshift.io -o jsonpath='{.items[*].status.state}' | ||
| Should Not Be Empty ${stdout} | ||
| @{states}= Split String ${stdout} | ||
| ${count}= Get Length ${states} | ||
| Should Be Equal As Integers ${count} 2 Expected 2 RemoteCluster states, got ${count} | ||
| FOR ${state} IN @{states} | ||
| Should Be Equal As Strings ${state} ${expected_state} | ||
| END | ||
|
|
||
| Verify RemoteCluster State By Name | ||
| [Documentation] Check that a specific RemoteCluster CR has the expected state. | ||
| [Arguments] ${alias} ${cr_name} ${expected_state} | ||
|
|
@@ -189,10 +176,7 @@ RemoteCluster CR Name From IP | |
|
|
||
| Ensure All Clusters Healthy | ||
| [Documentation] Pre-condition: all clusters must be Healthy before fault injection. | ||
| FOR ${alias} IN cluster-a cluster-b cluster-c | ||
| Wait Until Keyword Succeeds 3m 10s | ||
| ... Verify RemoteCluster State ${alias} Healthy | ||
| END | ||
| Verify All RemoteClusters Healthy | ||
|
Member
Do we still need
Author
good catch, this used to be different initially :D |
||
|
|
||
| Apply Probe Deny Policy | ||
| [Documentation] Apply a NetworkPolicy that denies all ingress to the probe pod. | ||
|
|
||
| Original file line number | Diff line number | Diff line change | ||||||
|---|---|---|---|---|---|---|---|---|
| @@ -0,0 +1,169 @@ | ||||||||
| *** Settings *** | ||||||||
| Documentation Tests RHEL 9.8 to RHEL 10.2 upgrade with C2CC enabled across 3 clusters. | ||||||||
| ... Upgrades each cluster one by one and verifies C2CC connectivity | ||||||||
| ... survives at each stage. | ||||||||
|
|
||||||||
| Resource ../../resources/common.resource | ||||||||
| Resource ../../resources/c2cc.resource | ||||||||
| Resource ../../resources/microshift-host.resource | ||||||||
| Library Collections | ||||||||
| Library SSHLibrary | ||||||||
|
|
||||||||
| Suite Setup Setup | ||||||||
| Suite Teardown Teardown | ||||||||
|
|
||||||||
| Test Tags c2cc ostree | ||||||||
|
|
||||||||
|
|
||||||||
| *** Variables *** | ||||||||
| ${TARGET_REF} ${EMPTY} | ||||||||
| ${BOOTC_REGISTRY} ${EMPTY} | ||||||||
| ${CLUSTER_A_POD_CIDR} ${EMPTY} | ||||||||
| ${CLUSTER_A_SVC_CIDR} ${EMPTY} | ||||||||
| ${CLUSTER_A_DOMAIN} ${EMPTY} | ||||||||
| ${CLUSTER_B_POD_CIDR} ${EMPTY} | ||||||||
| ${CLUSTER_B_SVC_CIDR} ${EMPTY} | ||||||||
| ${CLUSTER_B_DOMAIN} ${EMPTY} | ||||||||
| ${KUBECONFIG_B} ${EMPTY} | ||||||||
| ${CLUSTER_C_POD_CIDR} ${EMPTY} | ||||||||
| ${CLUSTER_C_SVC_CIDR} ${EMPTY} | ||||||||
| ${CLUSTER_C_DOMAIN} ${EMPTY} | ||||||||
| ${KUBECONFIG_C} ${EMPTY} | ||||||||
|
|
||||||||
|
|
||||||||
| *** Test Cases *** | ||||||||
| Upgrade C2CC Clusters From RHEL9 To RHEL10 | ||||||||
| [Documentation] Upgrades 3 C2CC-connected clusters one by one from RHEL 9.8 | ||||||||
| ... to RHEL 10.2 and verifies health and C2CC connectivity after each upgrade. | ||||||||
|
|
||||||||
| Verify All Clusters Healthy | ||||||||
| Verify All RemoteClusters Healthy | ||||||||
| Deploy Test Workloads | ||||||||
| Verify Full C2CC Connectivity | ||||||||
|
|
||||||||
| FOR ${alias} IN cluster-a cluster-b cluster-c | ||||||||
| Log To Console Upgrading ${alias} to ${TARGET_REF} | ||||||||
| Upgrade Cluster ${alias} | ||||||||
| Verify All Clusters Healthy | ||||||||
| Verify All RemoteClusters Healthy | ||||||||
| Wait For Test Pods | ||||||||
| Wait For Service Endpoints | ||||||||
| Verify Full C2CC Connectivity | ||||||||
| END | ||||||||
|
|
||||||||
| [Teardown] Cleanup Test Workloads | ||||||||
|
|
||||||||
|
|
||||||||
| *** Keywords *** | ||||||||
| Setup | ||||||||
| [Documentation] Register all three clusters for SSH and oc access | ||||||||
| ... and store connection details for reconnection after reboots. | ||||||||
| Check Required Env Variables | ||||||||
| Should Not Be Empty ${TARGET_REF} TARGET_REF variable is required | ||||||||
|
Contributor
for consistency I think we should also add a check for
Suggested change
|
||||||||
| Login MicroShift Host | ||||||||
| Setup Kubeconfig | ||||||||
| Logout MicroShift Host | ||||||||
|
|
||||||||
| Register Remote Cluster cluster-a ${USHIFT_HOST} ${SSH_PORT} ${KUBECONFIG} | ||||||||
| Register Remote Cluster cluster-b ${HOST2_IP} ${HOST2_SSH_PORT} ${KUBECONFIG_B} | ||||||||
| Register Remote Cluster cluster-c ${HOST3_IP} ${HOST3_SSH_PORT} ${KUBECONFIG_C} | ||||||||
|
|
||||||||
| Teardown | ||||||||
| [Documentation] Close all connections and clean up. | ||||||||
| Teardown All Remote Clusters | ||||||||
| Remove Kubeconfig | ||||||||
|
|
||||||||
| Verify All Clusters Healthy | ||||||||
| [Documentation] Verify all clusters are running and API server is reachable. | ||||||||
| FOR ${alias} IN cluster-a cluster-b cluster-c | ||||||||
| ${stdout}= Oc On Cluster ${alias} oc get --raw='/readyz' | ||||||||
| Should Be Equal As Strings ${stdout} ok strip_spaces=True | ||||||||
| END | ||||||||
|
|
||||||||
| Verify Full C2CC Connectivity | ||||||||
| [Documentation] Verify pod-to-pod and pod-to-service connectivity between all cluster pairs. | ||||||||
| VAR @{clusters}= cluster-a cluster-b cluster-c | ||||||||
| FOR ${src} IN @{clusters} | ||||||||
| FOR ${dst} IN @{clusters} | ||||||||
| IF '${src}' != '${dst}' | ||||||||
| Test Connectivity Between Clusters ${src} ${dst} pod | ||||||||
| Test Connectivity Between Clusters ${src} ${dst} service | ||||||||
| END | ||||||||
| END | ||||||||
| END | ||||||||
|
|
||||||||
| Upgrade Cluster | ||||||||
| [Documentation] Upgrade a specific cluster to the target bootc image | ||||||||
| ... and verify it booted into the new deployment without rollback. | ||||||||
| [Arguments] ${alias} | ||||||||
|
|
||||||||
| ${initial_deploy_id}= Get Deployment Id On Cluster ${alias} | ||||||||
|
|
||||||||
| Command On Cluster | ||||||||
| ... ${alias} | ||||||||
| ... printf '[[registry]]\nlocation = "${BOOTC_REGISTRY}"\ninsecure = true\n' | sudo tee /etc/containers/registries.conf.d/999-microshift-insecure-registry.conf > /dev/null | ||||||||
|
|
||||||||
| Command On Cluster ${alias} bootc switch --quiet ${BOOTC_REGISTRY}/${TARGET_REF} | ||||||||
|
|
||||||||
| Command On Cluster ${alias} | ||||||||
| ... rm -f /etc/containers/registries.conf.d/999-microshift-insecure-registry.conf | ||||||||
|
|
||||||||
| Reboot Cluster And Wait ${alias} | ||||||||
|
|
||||||||
| ${current_deploy_id}= Get Deployment Id On Cluster ${alias} | ||||||||
| Should Not Be Equal As Strings ${current_deploy_id} ${initial_deploy_id} | ||||||||
| ... msg=${alias} rolled back to initial deployment | ||||||||
|
|
||||||||
| Get Deployment Id On Cluster | ||||||||
| [Documentation] Get the booted image digest from a specific cluster. | ||||||||
| [Arguments] ${alias} | ||||||||
| ${stdout}= Command On Cluster | ||||||||
| ... ${alias} | ||||||||
| ... bootc status --booted --json | python3 -c "import sys,json; print(json.load(sys.stdin)['status']['booted']['image']['imageDigest'])" | ||||||||
| RETURN ${stdout} | ||||||||
|
|
||||||||
| Reboot Cluster And Wait | ||||||||
| [Documentation] Reboot a cluster and wait for it to come back with greenboot healthy. | ||||||||
| [Arguments] ${alias} | ||||||||
|
|
||||||||
| ${boot_id}= Command On Cluster ${alias} | ||||||||
| ... cat /proc/sys/kernel/random/boot_id sudo_mode=False | ||||||||
|
|
||||||||
| Disruptive Command On Cluster ${alias} reboot | ||||||||
|
|
||||||||
| Wait Until Keyword Succeeds 10m 15s | ||||||||
| ... Cluster Rebooted And Healthy ${alias} ${boot_id} | ||||||||
|
|
||||||||
| Cluster Rebooted And Healthy | ||||||||
| [Documentation] Verify cluster has rebooted and greenboot health check passed. | ||||||||
| [Arguments] ${alias} ${old_boot_id} | ||||||||
|
|
||||||||
| ${old_conn_id}= Get From Dictionary ${C2CC_SSH_IDS} ${alias} | ||||||||
| ${status}= Run Keyword And Return Status | ||||||||
| ... SSHLibrary.Switch Connection ${old_conn_id} | ||||||||
| IF ${status} SSHLibrary.Close Connection | ||||||||
|
|
||||||||
| ${host} ${port} ${kc}= Get Cluster Connection Info ${alias} | ||||||||
| Remove Values From List ${C2CC_REMOTE_ALIASES} ${alias} | ||||||||
| Register Remote Cluster ${alias} ${host} ${port} ${kc} | ||||||||
|
|
||||||||
| ${new_boot_id}= Command On Cluster ${alias} | ||||||||
| ... cat /proc/sys/kernel/random/boot_id sudo_mode=False | ||||||||
| Should Not Be Equal As Strings ${old_boot_id} ${new_boot_id} strip_spaces=True | ||||||||
|
|
||||||||
| ${stdout}= Command On Cluster ${alias} | ||||||||
| ... systemctl show -p SubState greenboot-healthcheck.service --value | ||||||||
| Should Be Equal As Strings ${stdout} exited strip_spaces=True | ||||||||
|
|
||||||||
|
|
||||||||
| Get Cluster Connection Info | ||||||||
| [Documentation] Return host, port, and kubeconfig for a given cluster alias. | ||||||||
| [Arguments] ${alias} | ||||||||
| IF '${alias}' == 'cluster-a' | ||||||||
| RETURN ${USHIFT_HOST} ${SSH_PORT} ${KUBECONFIG} | ||||||||
| ELSE IF '${alias}' == 'cluster-b' | ||||||||
| RETURN ${HOST2_IP} ${HOST2_SSH_PORT} ${KUBECONFIG_B} | ||||||||
| ELSE IF '${alias}' == 'cluster-c' | ||||||||
| RETURN ${HOST3_IP} ${HOST3_SSH_PORT} ${KUBECONFIG_C} | ||||||||
| ELSE | ||||||||
| Fail Unknown cluster alias: ${alias} | ||||||||
| END | ||||||||
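`Reboot Cluster And Wait` relies on `Wait Until Keyword Succeeds 10m 15s` to keep retrying `Cluster Rebooted And Healthy` while the host is still down. The retry pattern can be sketched in shell as below. The helper name `retry_until_success` is an assumption made for illustration, and it counts attempts rather than enforcing a wall-clock timeout as Robot Framework does.

```shell
#!/bin/bash
# Hypothetical helper (not part of the PR): rerun a command until it
# succeeds or the attempts run out, sleeping between tries.
retry_until_success() {
    local attempts="$1"   # maximum number of tries
    local delay="$2"      # seconds to sleep between tries
    shift 2
    local i=1

    while [ "${i}" -le "${attempts}" ]; do
        # Stop at the first success.
        "$@" && return 0
        # No sleep after the final failed attempt.
        if [ "${i}" -lt "${attempts}" ]; then
            sleep "${delay}"
        fi
        i=$((i + 1))
    done
    return 1
}
```

Ignoring the time each attempt takes, 40 attempts with a 15 second delay would roughly correspond to the 10m / 15s setting used in the suite.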
I don't think we need to wrap $BOOTC_REGISTRY in an if block to check whether it's defined, because it is always set in common.sh, but it's fine as it is.