Commit 72e75a3

test
sean-smith authored and committed
If merged, this commit does the following:
* Set up git dir in /workspace/llama31
* Remove subpath from pretrain_llama.py
* Install toml package
* Adjust --gres=gpu:8 to the number of user-specified devices

Signed-off-by: Sean Smith <seasmith@nvidia.com>
1 parent 751d033 commit 72e75a3

3 files changed: +6 -2 lines changed

large_language_model_pretraining/nemo/Dockerfile

Lines changed: 1 addition & 0 deletions
@@ -18,6 +18,7 @@ FROM ${NEMO_BASE_IMAGE} AS nemo-base-image
 RUN pip uninstall transformers -y
 RUN pip install transformers==4.47.1 blobfile==3.0.0
 RUN pip install prettytable==3.12.0
+RUN pip install toml==0.10.2
 RUN pip install git+https://github.com/mlcommons/logging.git@4.1.0-rc3

 # setup workspace

large_language_model_pretraining/nemo/pretrain_llama31.py

Lines changed: 2 additions & 2 deletions
@@ -75,8 +75,8 @@ def slurm_executor(
 gpus_per_node=devices,
 mem="0",
 exclusive=True,
-gres="gpu:8",
-packager=run.GitArchivePackager(subpath="large_language_model_pretraining/nemo", ref="HEAD"),
+gres=f"gpu:{devices}",
+packager=run.GitArchivePackager(),
 dependencies=dependencies,
 )

large_language_model_pretraining/nemo/run_llama31.sh

Lines changed: 3 additions & 0 deletions
@@ -16,6 +16,9 @@

 set -e

+git init
+git add .
+git commit -sm "First commit"
 git config --global --add safe.directory /workspace/llama31

 # Vars without defaults
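
The added `git init` / `git add` / `git commit` steps presumably exist because `run.GitArchivePackager()` packages code from a git repository, so the working directory needs git metadata and at least one commit. Below is a minimal pre-flight check under that assumption. The function names are hypothetical, and the check only looks for a `.git` entry in the given directory; it does not verify that a commit exists.

```python
from pathlib import Path


def has_git_metadata(workdir: str) -> bool:
    """Return True if `workdir` itself contains a .git entry (directory or file)."""
    return (Path(workdir) / ".git").exists()


def require_git_workdir(workdir: str) -> None:
    """Raise a readable error before launching if `workdir` has no git metadata."""
    if not has_git_metadata(workdir):
        raise RuntimeError(
            f"{workdir} has no .git entry; run `git init` and create a commit first"
        )
```

Calling `require_git_workdir("/workspace/llama31")` before building the executor would surface a missing repository early, rather than at packaging time.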
