Merged
220 changes: 110 additions & 110 deletions doc/source/operations/gpu-in-openstack.rst
@@ -198,39 +198,39 @@ Create a new playbook or update an existing one to apply the roles:
.. code-block:: yaml
   :caption: $KAYOBE_CONFIG_PATH/ansible/host-configure.yml

   ---
   - hosts: iommu
     tags:
       - iommu
     tasks:
       - import_role:
           name: stackhpc.linux.iommu
     handlers:
       - name: reboot
         set_fact:
           kayobe_needs_reboot: true

   - hosts: vgpu
     tags:
       - vgpu
     tasks:
       - import_role:
           name: stackhpc.linux.vgpu
     handlers:
       - name: reboot
         set_fact:
           kayobe_needs_reboot: true

   - name: Reboot when required
     hosts: iommu:vgpu
     tags:
       - reboot
     tasks:
       - name: Reboot
         reboot:
           reboot_timeout: 3600
         become: true
         when: kayobe_needs_reboot | default(false) | bool

Ansible Inventory Configuration
-------------------------------
@@ -281,41 +281,41 @@ Configure the VGPU devices:
.. code-block:: yaml
   :caption: $KAYOBE_CONFIG_PATH/inventory/group_vars/compute_vgpu/vgpu

   #nvidia-692 GRID A100D-4C
   #nvidia-693 GRID A100D-8C
   #nvidia-694 GRID A100D-10C
   #nvidia-695 GRID A100D-16C
   #nvidia-696 GRID A100D-20C
   #nvidia-697 GRID A100D-40C
   #nvidia-698 GRID A100D-80C
   #nvidia-699 GRID A100D-1-10C
   #nvidia-700 GRID A100D-2-20C
   #nvidia-701 GRID A100D-3-40C
   #nvidia-702 GRID A100D-4-40C
   #nvidia-703 GRID A100D-7-80C
   #nvidia-707 GRID A100D-1-10CME
   vgpu_definitions:
     # Configuring a MIG backed VGPU
     - pci_address: "0000:17:00.0"
       virtual_functions:
         - mdev_type: nvidia-700
           index: 0
         - mdev_type: nvidia-700
           index: 1
         - mdev_type: nvidia-700
           index: 2
         - mdev_type: nvidia-699
           index: 3
       mig_devices:
         "1g.10gb": 1
         "2g.20gb": 3
     # Configuring a card in a time-sliced configuration (non-MIG backed)
     - pci_address: "0000:65:00.0"
       virtual_functions:
         - mdev_type: nvidia-697
           index: 0
         - mdev_type: nvidia-697
           index: 1
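
As a further illustration, a card given over to a single time-sliced VGPU could
be described with the same keys. This is only a sketch: the PCI address
``0000:ca:00.0`` is hypothetical, and ``nvidia-698`` (GRID A100D-80C) is taken
from the commented type list above. As in the time-sliced entry above, no
``mig_devices`` key is set.

.. code-block:: yaml

   vgpu_definitions:
     # Hypothetical card: one time-sliced VGPU using the full 80C profile
     - pci_address: "0000:ca:00.0"
       virtual_functions:
         - mdev_type: nvidia-698
           index: 0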

.. _NVIDIA Kolla Ansible Configuration:

@@ -330,34 +330,34 @@ Map through the kayobe inventory groups into kolla:
.. code-block:: yaml
   :caption: $KAYOBE_CONFIG_PATH/kolla.yml

   kolla_overcloud_inventory_top_level_group_map:
     control:
       groups:
         - controllers
     network:
       groups:
         - network
     compute_cpu:
       groups:
         - compute_cpu
     compute_gpu:
       groups:
         - compute_gpu
     compute_multi_instance_gpu:
       groups:
         - compute_multi_instance_gpu
     compute_vgpu:
       groups:
         - compute_vgpu
     compute:
       groups:
         - compute
     monitoring:
       groups:
         - monitoring
     storage:
       groups:
         "{{ kolla_overcloud_inventory_storage_groups }}"

Where the ``compute_<suffix>`` groups have been added to the kayobe defaults.

@@ -413,21 +413,21 @@ Below is a snippet of openstack-config for defining a project, and a security gr
       port_range_min: 7070
       port_range_max: 7070

   secgroup_nvidia_dls:
     name: nvidia-dls
     project: "{{ project_cloud_services.name }}"
     rules: "{{ secgroup_rules_nvidia_dls }}"

   openstack_security_groups:
     - "{{ secgroup_nvidia_dls }}"

   project_cloud_services:
     name: "cloud-services"
     description: "Internal Cloud services"
     project_domain: default
     user_domain: default
     users: []
     quotas: "{{ quotas_project }}"

Booting the VM:

@@ -526,7 +526,7 @@ Disk image builder recipe to automatically license VGPU on boot
element to configure the nvidia-gridd service in VGPU mode. This allows you to boot VMs that automatically license themselves.
Snippets of ``openstack-config`` that allow you to do this are shown below:

-.. code-block:: shell
+.. code-block:: yaml

   image_rocky9_nvidia:
     name: "Rocky9-NVIDIA"