-# Guide for using the template with the CSCS Todi cluster
+# Guide for using the template with the CSCS Clariden cluster

 ## Overview

 At this point, you should have edited the environment files and are ready to build or run the image.
-This guide will show you how to build and run your image on the CSCS Todi cluster and use it for
+This guide will show you how to build and run your image on the CSCS Clariden cluster and use it for

 1. Remote development.
 2. Running unattended jobs.
@@ -44,8 +44,14 @@ pip install podman-compose
 All commands should be run from the `installation/docker-amd64-cuda/` directory.

 You should be on a compute node. If not already, get one.
-```
-srun --partition=debug --time=0:30:00 --pty bash
+```bash
+# Request a compute node
+sbatch -N 1 --time 4:00:00 -A a-a10 --wrap "sleep infinity" --output=/dev/null --error=/dev/null
+# Connect to it
+srun --overlap --pty --jobid=GET_THE_JOB_ID bash
+tmux
+# or if reconnecting
+tmux at
 ```
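The `GET_THE_JOB_ID` placeholder above has to be filled in by hand. As a minimal sketch, assuming `sbatch` is called with its `--parsable` flag (which prints only the job ID, optionally followed by `;` and the cluster name), the ID could be captured in a variable instead. The helper name `parse_job_id` is hypothetical and not part of the template.

```bash
# Hypothetical helper (not part of the template): keep only the job ID from
# `sbatch --parsable` output, which is either "JOBID" or "JOBID;CLUSTER".
parse_job_id() {
    printf '%s\n' "${1%%;*}"
}

# Possible usage on the cluster, with the options copied from the guide
# (commented out so the snippet can be sourced anywhere):
# JOB_ID="$(parse_job_id "$(sbatch --parsable --time 4:00:00 -A a-a10 --wrap "sleep infinity" --output=/dev/null --error=/dev/null)")"
# srun --overlap --pty --jobid="$JOB_ID" bash
```

The same ID can also be looked up after submission with `squeue --me`.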

 ```bash
@@ -95,7 +101,7 @@ cd installation/docker-amd64-cuda

 **CSCS and Slurm**:

-1. You should have access to the Todi cluster.
+1. You should have access to the Clariden cluster.
 2. You should have some knowledge of Slurm.

 There is great documentation provided by the SwissAI initiative [here](https://github.com/swiss-ai/documentation).
@@ -129,7 +135,7 @@ This guide includes the steps to do it, and there are general details in `data/R

 ```bash
 # SSH to a cluster.
-ssh todi
+ssh clariden
 cd $SCRATCH
 # Clone the repo twice with name dev and run (if you already have one, mv it to a different name)
 mkdir template-project-name
@@ -163,12 +169,12 @@ They will be in `./EPFL-SCITAS-setup/submit-scripts`.

 Adapt the `submit-scripts/minimal.sh` with the name of your image and your cluster storage setup.

-The submission script gives an example of how to run containers on Todi with [`enroot`](https://github.com/NVIDIA/enroot)
+The submission script gives an example of how to run containers on Clariden with [`enroot`](https://github.com/NVIDIA/enroot)
 and the [`pyxis`](https://github.com/NVIDIA/pyxis) plugin directly integrated in `srun`.

 Run the script to see how the template works.
 ```bash
-cd installation/docker-amd64-cuda//CSCS-Todi-setup/submit-scripts
+cd installation/docker-amd64-cuda//CSCS-Clariden-setup/submit-scripts
 bash minimal.sh
 ```

@@ -256,24 +262,24 @@ GitHub provides a guide for that
 Use the following configuration in your local `~/.ssh/config`

 ```bash
-Host todi
-    HostName todi.cscs.ch
+Host clariden
+    HostName clariden.cscs.ch
     User smoalla
     ProxyJump ela
     ForwardAgent yes

 # EDIT THIS HOSTNAME WITH EVERY NEW JOB
-Host todi-job
+Host clariden-job
     HostName nid005105
     User smoalla
-    ProxyJump todi
+    ProxyJump clariden
     StrictHostKeyChecking no
     UserKnownHostsFile=/dev/null
     ForwardAgent yes

-Host todi-container
+Host clariden-container
     HostName localhost
-    ProxyJump todi-job
+    ProxyJump clariden-job
     Port 2223
     User smoalla
     StrictHostKeyChecking no
@@ -286,7 +292,7 @@ of the host [(ref)](https://linuxcommando.blogspot.com/2008/10/how-to-disable-ss
 which keeps changing every time a job is scheduled,
 so that you don't have to reset it each time.

-With this config you can then connect to your container with `ssh todi-container`.
+With this config you can then connect to your container with `ssh clariden-container`.
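Because the `HostName` of the `clariden-job` entry has to be edited for every new job, the edit can be scripted. The following is a rough sketch rather than part of the template: the function name `update_job_host` is hypothetical, and it assumes the entry is written exactly as `Host clariden-job` with its `HostName` on a separate line, as in the config above.

```bash
# Hypothetical helper (not part of the template): rewrite the HostName of the
# `clariden-job` entry in an SSH config file, leaving all other entries untouched.
# Usage: update_job_host NODE_NAME [CONFIG_FILE]
update_job_host() {
    local node="$1"
    local config="${2:-$HOME/.ssh/config}"
    local tmp
    tmp="$(mktemp)"
    awk -v node="$node" '
        # Track whether the current line belongs to the clariden-job entry.
        $1 == "Host" { in_block = ($2 == "clariden-job") }
        # Inside that entry, replace the HostName value and keep the indentation.
        in_block && $1 == "HostName" { sub(/HostName.*/, "HostName " node) }
        { print }
    ' "$config" > "$tmp" && mv "$tmp" "$config"
}
```

For example, `update_job_host nid005105` would point the entry at node `nid005105`; the node name of a running job is shown by `squeue --me`. The file is rewritten through a temporary copy, so a backup of `~/.ssh/config` is advisable before trying it.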

 **Limitations**
