-
Notifications
You must be signed in to change notification settings - Fork 23
Skc baremetal environment #2017
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
skc-baremetal-environment
…hpc-kayobe-config into skc-baremetal-environment
…pc/stackhpc-kayobe-config into skc-baremetal-environment
| vars: | ||
| venv: "{{ virtualenv_path }}/openstack-cli" | ||
| tasks: | ||
| - name: Set up openstack cli virtualenv |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if this should be a separate script that we add into baremetal-all? like baremetal-env.yml or similar? With a clear error message if the venv is missing.
| - kayobe_bmc_up == "" | ||
| - ironic_redfish_username is defined | ||
|
|
||
| # - name: Wait 300 seconds for port 443 to become open |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we don't need these we should remove them I think.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 lets remove it.
etc/kayobe/environments/stackhpc-baremetal/ansible/diagnose-baremetal.yml
Show resolved
Hide resolved
etc/kayobe/environments/stackhpc-baremetal/ansible/download-host-image.yml
Show resolved
Hide resolved
| @@ -0,0 +1,135 @@ | |||
| --- | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are we assuming this as a global configuration on the conductor? Or do we want to configure this via group_vars and apply it to nodes during enrollment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, lets move this into ensure-redfish playbook.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
where in the ensure-redfish playbook should this go?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
move this to baremetal-0, baremetal-0 becomes an enroll-baremetal playbook
| extra_args: "{% if pip_upper_constraints_file %}-c {{ pip_upper_constraints_file }}{% endif %}" | ||
|
|
||
| - name: Ensure overcloud baremetal nodes are registered in ironic | ||
| hosts: baremetal-overcloud |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Its probably worth a quick comment here to note that this is because "kayobe baremetal compute register" doesn't currently touch the overcloud nodes, but we want them here.
etc/kayobe/environments/stackhpc-baremetal/ansible/baremetal-1-check-bmc-up.yml
Show resolved
Hide resolved
| # ansible.builtin.debug: | ||
| # msg: "{{ firmware_inventory.redfish_facts.firmware | to_nice_json }}" | ||
|
|
||
| - name: Reboot BMC |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
as above, I think we need to remove this for now, until we have a generic version. Its possible we could add this as an extra step.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe we should add something here, like a TODO, saying lets add an optional BMC reboot into the flow here.
| @@ -0,0 +1,129 @@ | |||
| --- | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we should maybe rename these steps as 10,20,30,40... to allow people to add things in the middle.
etc/kayobe/environments/stackhpc-baremetal/ansible/baremetal-1-check-bmc-up.yml
Show resolved
Hide resolved
etc/kayobe/environments/stackhpc-baremetal/ansible/baremetal-1-check-bmc-up.yml
Show resolved
Hide resolved
| ironic_provision_network: "provision-net" | ||
| ironic_provision_flavor: "example_resource_class" | ||
| ironic_provision_image: "overcloud-rocky-9-2025.1-20250930T144255" | ||
| ironic_provision_key_name: "stack" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FWIW, I think this is where the stuff like --management-interface=redfish might want to live, although maybe we want a baremetal-redfish group more like we have in jack's proposal?
| libvirt_vms: | ||
| - state: present | ||
| name: "{{ inventory_hostname }}" | ||
| xml_file: "{{ sushy_directory }}/vbmc-node.xml.j2" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we need to consider removing the XML file all together here.
Probably, we want to have a uuid that is either predictable, or better, discovered after the VM has been created, so we know what to tell sushi to look for.
But I think we need that rather than the large xml blob, for maintainability in the medium term.
| - name: Run Sushy Emulator playbook | ||
| ansible.builtin.shell: | | ||
| source "$VIRTUAL_ENV/bin/activate" | ||
| kayobe playbook run "$KAYOBE_CONFIG_PATH/environments/stackhpc-sushy-baremetal/ansible/sushy-emulator.yml" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need this split into two, so we can discover the VM uuid, then configure sushi with the uuid we get given?
| name: devel | ||
| state: enabled | ||
|
|
||
| - name: Install package dependencies |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Doesn't kayobe already do a bunch of this virt setup, on the seed host, at least? This feels like duplication here.
I am OK if we need the duplication, but I think this worth a double check we can't reuse the "standard" kayobe way to add these.
| @@ -0,0 +1,6 @@ | |||
| <network connections='1'> | |||
| <name>vbmc-net</name> | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we still need this one? I think it can be removed now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
delete the whole file?
etc/kayobe/environments/stackhpc-sushy-baremetal/ansible/vbmc-node.xml.j2
Show resolved
Hide resolved
| <pool type='dir'> | ||
| <name>default</name> | ||
| <uuid>{{ 'default' | to_uuid }}</uuid> | ||
| <capacity unit='bytes'>68509761536</capacity> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Feels like this should be configurable via a variable in some role default?
| @@ -0,0 +1,3 @@ | |||
| --- | |||
|
|
|||
| stackhpc_lvm_lv_var_size: 10g | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lets add a comment around why we need this here.
|
|
||
| - name: Undeploy baremetals in 'deploy failed' or 'error' state | ||
| ansible.builtin.command: | ||
| cmd: "{{ venv }}/bin/openstack baremetal node undeploy {{ inventory_hostname }}" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe lets move this into ./baremetal-4-clean.yml, at least for deploy failed.
| delay: 20 | ||
| timeout: 300 | ||
|
|
||
| - name: Check BMC back up again |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should remove most of this script for now, while we get the minimum merged.
| {% for key, value in ironic_properties.items() %} | ||
| --property {{ key }}={{ value }} \ | ||
| {% endfor %} | ||
| --resource-class {{ ironic_resource_class }} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we need to extend this to allow us to specify all the interfaces that match the driver e.g. boot-interface, inspect-interface, etc.
Annoyingly this means we should change this playbook to run over both compute nodes and baremetal nodes for now.
| redfish_system_id: "{{ ironic_redfish_system_id }}" | ||
| redfish_address: "{{ ironic_redfish_address }}" | ||
| redfish_username: "{{ ironic_redfish_username | default(omit) }}" | ||
| redfish_password: "{{ ironic_redfish_password | default(omit) }}" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This default omit isn't working, I think we need to do a very complicated dictionary merge expression, where we default the merge to either {} or {"redfish_username": "{{ ironic_redfish_username }}"} or something like that.
| {{ venv }}/bin/openstack baremetal node create \ | ||
| --name {{ inventory_hostname }} \ | ||
| --driver {{ ironic_driver }} \ | ||
| {% for key, value in ironic_driver_info.items() %} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
actually maybe don't do this as a dictionary, so fix the omit.
| extra_args: "{% if pip_upper_constraints_file %}-c {{ pip_upper_constraints_file }}{% endif %}" | ||
|
|
||
| - name: Ensure overcloud baremetal nodes are registered in ironic | ||
| hosts: baremetal-overcloud |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
change host to baremetal
| --name {{ inventory_hostname }} \ | ||
| --driver {{ ironic_driver }} \ | ||
| {% for key, value in ironic_driver_info.items() %} | ||
| --driver-info {{ key }}={{ value }} \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
--driver-info refish_username = {{ ironic_redfish_username }}
... and same for password and resource id etc
| {% for key, value in ironic_properties.items() %} | ||
| --property {{ key }}={{ value }} \ | ||
| {% endfor %} | ||
| --resource-class {{ ironic_resource_class }} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
add for other variable
{% if ironic_management_interface %}
--management-interface {{ ironic_management_interface }}
{% endif %}
and then ironic_management_interface defined in group_vars/baremetal-redfish/ironic
No description provided.