This project is a fork of https://github.com/giovtorres/slurm-docker-cluster.
It uses LDMS: https://ovis-hpc.readthedocs.io/projects/ldms/en/latest/index.html
The goal was to create a basic cluster and install LDMS, which is designed to gather metrics in an HPC environment.
Made during my internship at CEA.
docker build -t slurm-docker-cluster --network=host .
docker build -t logstash-with-opensearch-plugins --network=host logstash #building logstash
docker compose up -d

Depending on your goal:
- To see how to use this project (manual usage), refer to the example-list below.
- For automatic launches and Grafana visualization, see the detailed workflow here.
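Once docker compose up -d has returned, it can be useful to confirm that every service actually came up. The following is a minimal sketch, not part of this repository: missing_services is a hypothetical helper, the service names slurmctld, c1, c2 and c3 are assumed to match those declared in docker-compose.yml, and the example invocation assumes Docker Compose v2.

```shell
#!/bin/sh
# Hypothetical helper (not part of this repository): reads one running
# service name per line on stdin and prints every expected service
# (passed as arguments) that is absent from that list.
missing_services() {
  running_list="$(cat)"
  for expected in "$@"; do
    # -F: fixed string, -x: match the whole line, -q: no output
    if ! printf '%s\n' "$running_list" | grep -Fqx -- "$expected"; then
      printf '%s\n' "$expected"
    fi
  done
}

# Example invocation against a running stack (service names are assumptions):
#   docker compose ps --services --status running | missing_services slurmctld c1 c2 c3
```

An empty output means all the expected services are reported as running; any printed name is a service worth inspecting with docker compose logs.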
.
├── docker-compose.yml
├── docker-entrypoint.sh
├── Dockerfile
├── ldms_conf
├── logstash
├── md
├── README.md
├── scripts
├── shared
└── slurm
- Dockerfile, docker-compose.yml, docker-entrypoint.sh -> the main image is slurm-docker-cluster; it builds OpenMPI, Slurm, PMIx, LDMS and so on.
- ldms_conf -> all the config files for the LDMS daemons, both sampler and aggregator, are here
- logstash -> Logstash config files to get data from the Kafka broker (the LDMS aggregator can send data to the broker if ldms_conf/agg_kafka.conf is used, see here)
- md -> all the files listed below in the Example section are stored here
- scripts -> shell scripts to create users (see the scripts explanation)
- shared -> a folder shared by all the nodes. The compute folders are only mounted on the compute nodes, i.e. c1, c2, c3. The data folder is mounted as is in the slurmctld service.
The mounts are made using Docker Compose volumes.
All the subfolders contain a README.md.
This is the main image used for this project. It uses Docker's multi-stage build feature.
A YAML anchor is used to define the compute nodes, although the number of compute nodes is fixed at 3 for this whole project and the config files were written to follow this logic.
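Because the number of compute nodes is fixed at three, helper scripts can simply iterate over them. The following is a minimal sketch, assuming the compute services are named c1, c2 and c3 in docker-compose.yml; run_on_compute_nodes and the RUNNER variable are hypothetical and not part of this repository.

```shell
#!/bin/sh
# Assumed compute service names; adjust if docker-compose.yml differs.
COMPUTE_NODES="c1 c2 c3"

# Command prefix used to reach a node. Overriding RUNNER (for example
# with "echo") turns the helper into a dry run.
RUNNER="${RUNNER:-docker compose exec -T}"

# Hypothetical helper: run the given command on every compute node in turn.
run_on_compute_nodes() {
  for node in $COMPUTE_NODES; do
    # $RUNNER is intentionally unquoted so it splits into command and options.
    $RUNNER "$node" "$@"
  done
}

# Example against a running stack:
#   run_on_compute_nodes hostname
```

If the node count ever changes, COMPUTE_NODES has to be updated alongside the other config files that assume three nodes.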
# pwd : ldms-docker-cluster
docker build -t slurm-docker-cluster --network=host .
docker build -t logstash-with-opensearch-plugins --network=host logstash 