Create automated Slurm installation scripts tailored to a cluster configuration.
This tool streamlines the installation of Slurm, a popular job scheduling and workload management system, on cluster environments.
It generates the scripts from a cluster description file, which records the cluster's architecture and configuration.
The generated installation scripts support the following CentOS Linux versions:
- CentOS 7
- CentOS 8
- CentOS 9
The scripts have been tested on each of the CentOS versions listed above. Even so, check for CentOS-specific differences or updates that may affect your cluster configuration.
Installation scripts can be generated on any machine with a Bash shell, so the generation step works on a wide range of operating systems.
Software Dependency:
- Python
Python Package Dependencies:
- Jinja2
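Before generating scripts, it can help to confirm that the Python dependencies are importable. The following is a minimal sketch, not part of the tool itself; the helper name `missing_packages` is hypothetical, and it assumes the Jinja2 package is imported under its usual module name `jinja2`.

```python
import importlib.util


def missing_packages(modules):
    """Return the top-level module names in `modules` that cannot be found."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


# Check the single package dependency listed above.
missing = missing_packages(["jinja2"])
if missing:
    print("Missing Python packages: " + ", ".join(missing))
else:
    print("All Python dependencies are available.")
```

If the check reports Jinja2 as missing, installing it with your usual Python package manager should resolve it.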
Note: If you plan to pre-build Slurm packages on a builder host for later use on your cluster machines, make sure the builder host's operating system matches that of the cluster machines.
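One way to check that a builder host matches the cluster machines is to compare the contents of /etc/os-release from both. The sketch below is an illustration under assumptions: it treats two hosts as matching when their `ID` and `VERSION_ID` fields are equal, which is one reasonable reading of "corresponds" but not a rule defined by this tool. The function names are hypothetical.

```python
def parse_os_release(text):
    """Parse os-release style KEY=VALUE lines into a dict, dropping quotes."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        fields[key] = value.strip().strip('"').strip("'")
    return fields


def same_os(builder_text, target_text):
    """Return True when both texts define equal ID and VERSION_ID values."""
    builder = parse_os_release(builder_text)
    target = parse_os_release(target_text)
    return all(
        builder.get(key) is not None and builder.get(key) == target.get(key)
        for key in ("ID", "VERSION_ID")
    )


# Illustrative os-release fragments (abbreviated, not copied from real hosts).
centos7 = 'NAME="CentOS Linux"\nID="centos"\nVERSION_ID="7"\n'
centos9 = 'NAME="CentOS Stream"\nID="centos"\nVERSION_ID="9"\n'
print(same_os(centos7, centos7))
print(same_os(centos7, centos9))
```

In practice you would read the file on the builder host and fetch the same file from a cluster machine, then pass both strings to `same_os`.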
To create an example installation package, run the following script from the repository:
./build.sh
When the script finishes, the result is in the ./build directory.
Distribute this directory to your cluster hosts.
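How you distribute the directory is up to you. As one possible approach, the sketch below builds an rsync command per host; it assumes rsync is installed and that SSH access to each host is already configured. The function name, the host names, and the destination path `build/` are illustrative assumptions, not something this tool prescribes.

```python
import shlex


def rsync_commands(hosts, source="./build/", dest="build/"):
    """Build one rsync argument list per host for copying the build directory."""
    return [["rsync", "-a", source, f"{host}:{dest}"] for host in hosts]


# Print the commands for review instead of running them.
for command in rsync_commands(["master", "worker2"]):
    print(shlex.join(command))
```

To execute the commands rather than print them, each argument list can be passed to `subprocess.run(command, check=True)`.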
To start the installation, run the following script on each target host (cluster node or controller):
./build/scripts/install.sh
Here's an example of a cluster description:
{
  "Name": "examplecluster",
  "SlurmVersion": "20.11.9",
  "Nodes": [
    {
      "Name": "master",
      "IP": "192.168.1.1",
      "Feature": "dcv2,other"
    },
    {
      "Name": "worker2",
      "CPUs": "16"
    }
  ],
  "Controller": {
    "Name": "master",
    "IP": "192.168.1.1"
  }
}

- 'Name': Cluster name
- 'SlurmVersion': Target Slurm version. See https://download.schedmd.com/slurm/ for available versions.
- 'Nodes': List of compute nodes. Each node is an object with the following properties:
  - 'Name': Node hostname
  - 'CPUs': Number of CPUs for the node (optional)
  - 'IP': IP address of the node (optional; mandatory on networks without DNS)
  - 'Feature': A comma-delimited list of arbitrary strings indicating characteristics of the node (optional; see https://slurm.schedmd.com/slurm.conf.html)
- 'Controller': Description of the controller. The controller can also be a compute node.
  - 'Name': Controller hostname
  - 'IP': IP address of the controller (optional; mandatory on networks without DNS)
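A description file can be sanity-checked before it is used. The following is a minimal sketch rather than the tool's own validation: it assumes every field not marked optional above is required, and that 'CPUs' should hold a whole number. The name `validate_description` is hypothetical.

```python
import json

# Assumption: fields not marked "optional" in the list above are required.
REQUIRED_TOP_LEVEL = ("Name", "SlurmVersion", "Nodes", "Controller")


def validate_description(desc):
    """Return a list of human-readable problems found in a description dict."""
    problems = []
    for key in REQUIRED_TOP_LEVEL:
        if key not in desc:
            problems.append(f"missing top-level field '{key}'")
    for index, node in enumerate(desc.get("Nodes", [])):
        if "Name" not in node:
            problems.append(f"Nodes[{index}] has no 'Name'")
        cpus = node.get("CPUs")
        if cpus is not None and not str(cpus).isdigit():
            problems.append(f"Nodes[{index}] has a non-numeric 'CPUs' value: {cpus!r}")
    if "Controller" in desc and "Name" not in desc["Controller"]:
        problems.append("Controller has no 'Name'")
    return problems


# The example description from this document.
EXAMPLE = json.loads("""
{
  "Name": "examplecluster",
  "SlurmVersion": "20.11.9",
  "Nodes": [
    {"Name": "master", "IP": "192.168.1.1", "Feature": "dcv2,other"},
    {"Name": "worker2", "CPUs": "16"}
  ],
  "Controller": {"Name": "master", "IP": "192.168.1.1"}
}
""")

print(validate_description(EXAMPLE))  # prints []
```

An empty list means no problems were found. The check does not verify that an 'IP' is present on networks without DNS, since that depends on your environment.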
Artem - marisov.av@gmail.com