Re-use edpm-ansible driver installation code#14

Merged
csibbitt merged 1 commit into main from csibbitt/OSPRH-22196_reuse_edpm_driver_role
Mar 11, 2026

@csibbitt
Contributor

  • This is using the edpm-ansible role for an unintended purpose
  • But it means we don't have to maintain a second code base
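
For illustration, the reuse described above can be sketched as a single task that pulls in the existing role. This is a minimal sketch based only on the task name and the `included: osp.edpm.edpm_accel_drivers` line in the test output; the actual task in this PR may set additional variables or conditions that are not shown in this conversation.

```yaml
# Hypothetical sketch: include the edpm-ansible role instead of
# maintaining separate driver installation tasks in this repository.
# Assumes the osp.edpm collection is already installed on the control node.
- name: Nvidia Setup (Use the edpm_accel_drivers role to install the NVIDIA driver in our VM)
  ansible.builtin.include_role:
    name: osp.edpm.edpm_accel_drivers
```

Because the role is included rather than copied, driver installation fixes made in edpm-ansible are picked up without a second code base to maintain.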

@csibbitt
Contributor Author

Testing

I tested this against our nova04delta cloud and it worked as expected. The VM was already provisioned before the test, so the output below covers only the driver installation and testing (run with --skip-tags rhoso).

$ cat inventory-local 
[control_node]
localhost ansible_connection=local

[target_nodes]
gpu-validation-0 ansible_host=192.168.25.222 ansible_user=cloud-user ansible_ssh_common_args='-i ~/.ssh/id_brq2_dataplane -o StrictHostKeyChecking=no -o ProxyCommand="ssh -W %h:%p -o StrictHostKeyChecking=no -i ~/.ssh/id_brq2_dataplane cloud-admin@<REDACTED>"'


$ cat vars-local.yaml 
---
gpu_validation_private_key_file: ~/.ssh/id_brq2_dataplane
gpu_validation_ca_cert_path: ./tls-ca-bundle.pem
gpu_validation_dns_server: 192.168.25.10
gpu_validation_floating_ip: 192.168.25.222
gpu_validation_ssh_proxy_command: '-o ProxyCommand="ssh -W %h:%p -o StrictHostKeyChecking=no -i ~/.ssh/id_brq2_dataplane cloud-admin@<REDACTED>"'
gpu_validation_pci_devices:
  10de:27b8: 1
gpu_validation_model_tests_enabled: false


$ JUNIT_OUTPUT_DIR=./ ansible-playbook --skip-tags rhoso -i inventory-local -e @vars-local.yaml main.yaml
[...]

TASK [Nvidia Setup (Use the edpm_accel_drivers role to install the NVIDIA driver in our VM)] ******************************************************************************************
included: osp.edpm.edpm_accel_drivers for gpu-validation-0
[...]

TASK [osp.edpm.edpm_accel_drivers : Add nvidia driver repo] *******************************************************************************************************************
changed: [gpu-validation-0]
[...]

TASK [osp.edpm.edpm_accel_drivers : Enable NVIDIA driver DNF module for selected version stream] ******************************************************************************
changed: [gpu-validation-0]
[...]

TASK [osp.edpm.edpm_accel_drivers : Install the nvidia driver package] ********************************************************************************************************
changed: [gpu-validation-0]
[...]


TASK [gpu-validation : Run lspci command and save output] *********************************************************************************************************************
ok: [gpu-validation-0]

TASK [gpu-validation : Set found_nvidia to true if NVIDIA is found] ***********************************************************************************************************
ok: [gpu-validation-0]

TASK [gpu-validation : TEST[gpus] Check if GPUs in Passthrough mode are present in RHEL AI VM (lspci)] ************************************************************************
ok: [gpu-validation-0] => (item={'key': '10de:27b8', 'value': 1})
[...]

TASK [gpu-validation : TEST[nvidia] Check if GPUs in Passthrough mode are present in RHEL AI VM (nvidia-smi)] *****************************************************************
ok: [gpu-validation-0] => {
    "changed": false,
    "msg": "All assertions passed"
}
[...]

TASK [gpu-validation : TEST[CUDA] Run the CUDA Sanity Check] ******************************************************************************************************************
ok: [gpu-validation-0]
[...]

PLAY RECAP ********************************************************************************************************************************************************************
gpu-validation-0           : ok=31   changed=7    unreachable=0    failed=0    skipped=33   rescued=0    ignored=0   

Contributor

@bogdando bogdando left a comment

LGTM


@skovili skovili left a comment

I really like the idea of not having a second codebase to manage, and the testing results look promising, so I'm OK to go ahead.

Contributor

@MiguelCarpio MiguelCarpio left a comment

LGTM

@csibbitt csibbitt merged commit 289cd35 into main Mar 11, 2026
1 check passed
4 participants