Merged
41 commits
3a3d6b7
Use lm-eval harness for INCLUDE and global MMLU
timurcarstensen Oct 13, 2025
a8104fc
Remove mypy pre-commit hook
timurcarstensen Oct 14, 2025
3e9a6b6
chore: remove tests
timurcarstensen Oct 19, 2025
1a0b16a
fix: lighteval integration
timurcarstensen Oct 19, 2025
f9c5bce
fix: lumi paths
timurcarstensen Oct 20, 2025
64287d4
fix: faster compression
timurcarstensen Oct 20, 2025
2674439
fix: faster compression
timurcarstensen Oct 20, 2025
10d4217
chore: remove unnecessary files
timurcarstensen Oct 20, 2025
e2c866a
fix: ruff formatting target version
timurcarstensen Oct 20, 2025
20f04e9
chore: restructure task-groups into groups and super-groups
timurcarstensen Oct 20, 2025
73e2377
feat: task-cache prototype
timurcarstensen Oct 20, 2025
f831fbc
feat: task super groups
timurcarstensen Oct 21, 2025
5fe62ee
task cache fix
timurcarstensen Oct 21, 2025
e816bfd
fix: task cache; moving data files to oellm/resources
timurcarstensen Oct 21, 2025
a97d92d
Update README.md
timurcarstensen Oct 21, 2025
c9db766
misc
timurcarstensen Oct 21, 2025
34d7723
Merge branch 'codex/add-oellm-multilingual-task-group' of https://git…
timurcarstensen Oct 21, 2025
10b26ff
temporarily adding AGENTS.md for development
timurcarstensen Oct 21, 2025
e8e3b38
fix: task caching for lighteval
timurcarstensen Oct 21, 2025
d8c8ed5
fix
timurcarstensen Oct 21, 2025
d37b532
fix: compression algorithm
timurcarstensen Oct 22, 2025
79ace47
fix: updated apptainer definitions to include correct uv install
timurcarstensen Oct 22, 2025
13e985c
fix: lighteval cli args
timurcarstensen Oct 22, 2025
c9160d5
feat: wrapper to suppress tqdm output
timurcarstensen Oct 22, 2025
ccf4c5a
misc
timurcarstensen Oct 22, 2025
97b3d69
fix: lighteval tool python version
timurcarstensen Oct 22, 2025
541d387
nltk setup
timurcarstensen Oct 22, 2025
006ab8d
nltk setup
timurcarstensen Oct 22, 2025
15bea15
fix: downloading nltk data for lighteval during container setup
timurcarstensen Oct 22, 2025
9c97d25
suppressing all tqdm progress bars
timurcarstensen Oct 22, 2025
f11d4a4
lighteval fixes
timurcarstensen Oct 22, 2025
096cbc0
misc
timurcarstensen Oct 22, 2025
6e888d7
feat: aya-expanse tasks
timurcarstensen Oct 22, 2025
9d87217
chore: schedule-eval logic cleanup
timurcarstensen Oct 22, 2025
4f9f8a8
feat: adding spinners
timurcarstensen Oct 22, 2025
fe067fa
chore: making pre-commit happy
timurcarstensen Oct 22, 2025
f552c96
misc
timurcarstensen Oct 22, 2025
9bbf5c1
fix: restrict model parallel
timurcarstensen Oct 22, 2025
1b81460
fix: result collection
timurcarstensen Oct 23, 2025
c3e0b41
fix: leonardo directory
timurcarstensen Oct 23, 2025
d510921
Merge branch 'main' into codex/add-oellm-multilingual-task-group
timurcarstensen Nov 7, 2025
2 changes: 1 addition & 1 deletion .github/workflows/build-and-push-apptainer.yml
@@ -37,7 +37,7 @@ jobs:

- name: Build SIF from definition file
run: |
apptainer --verbose build --fakeroot eval_env-${{ matrix.image }}.sif apptainer/${{ matrix.image }}.def
apptainer --verbose build --mksquashfs-args="-comp gzip -Xcompression-level 1" --fakeroot eval_env-${{ matrix.image }}.sif apptainer/${{ matrix.image }}.def
Review comment (PR author): explanation: gzip compression level 1 trades a larger image for less AWS EC2 build time.
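
The space-versus-time trade-off mentioned in this comment can be illustrated with a small Python sketch. It uses the standard-library `zlib` module (DEFLATE, the algorithm behind gzip) as a stand-in and does not invoke `mksquashfs` or Apptainer. The payload is synthetic and the `compare_levels` helper is hypothetical, so the numbers it prints only indicate the general direction of the trade-off, not what the real image build would show.

```python
import time
import zlib


def compare_levels(payload: bytes, levels=(1, 9)) -> dict:
    """Return {level: (compressed_size_bytes, seconds)} for each zlib level."""
    results = {}
    for level in levels:
        start = time.perf_counter()
        compressed = zlib.compress(payload, level)
        elapsed = time.perf_counter() - start
        # Every level must round-trip to the original bytes.
        if zlib.decompress(compressed) != payload:
            raise ValueError(f"round-trip failed at level {level}")
        results[level] = (len(compressed), elapsed)
    return results


# Synthetic, moderately repetitive payload (an assumption; a real image differs).
payload = b"".join(f"layer-{i % 97}:{i * i}\n".encode() for i in range(50_000))

for level, (size, seconds) in compare_levels(payload).items():
    print(f"level={level} size={size} bytes time={seconds:.4f}s")
```

Lower levels typically finish faster and produce larger output, which matches the reasoning given for `-Xcompression-level 1` in the workflow.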


- name: Install Hugging Face Hub CLI
run: pip install --upgrade "huggingface_hub"
4 changes: 2 additions & 2 deletions .github/workflows/ci.yml
@@ -14,7 +14,7 @@ jobs:
- uses: actions/checkout@v4

- name: Install uv
uses: astral-sh/setup-uv@v3
uses: astral-sh/setup-uv@v7
with:
version: "latest"

@@ -40,7 +40,7 @@ jobs:
- uses: actions/checkout@v4

- name: Install uv
uses: astral-sh/setup-uv@v3
uses: astral-sh/setup-uv@v7
with:
version: "latest"

1 change: 1 addition & 0 deletions .gitignore
@@ -13,3 +13,4 @@
**/*.egg-info
**/*.csv
**/uv.lock
**/task_map_cache.json
9 changes: 1 addition & 8 deletions .pre-commit-config.yaml
@@ -1,6 +1,6 @@
repos:
- repo: https://github.com/pre-commit/pre-commit-hooks
rev: v4.5.0
rev: v6.0.0
hooks:
- id: trailing-whitespace
- id: end-of-file-fixer
@@ -18,10 +18,3 @@ repos:
- id: ruff
args: [--fix, --exit-non-zero-on-fix]
- id: ruff-format

- repo: https://github.com/pre-commit/mirrors-mypy
rev: v1.8.0
hooks:
- id: mypy
additional_dependencies: [types-PyYAML]
args: [--ignore-missing-imports]
5 changes: 5 additions & 0 deletions AGENTS.md
@@ -0,0 +1,5 @@
Rules:
- no try...Except unless absolutely necessary
- no unnecessary comments
- don't worry about tests
- if you need to run stuff, assume there is a .venv at the root of the project. you can also just use uv
10 changes: 7 additions & 3 deletions README.md
@@ -7,7 +7,7 @@ A package for running OELLM CLI workflows across multiple HPC clusters using SLU
- Restart failed evaluations (e.g., due to node failures) ✅ `oellm collect-results ... --reschedule true`
- Interactive eval job/csv builder ✅ `oellm build-csv`
- Recursively resolve local paths: pass a directory containing models and their nested intermediate checkpoints, will eval all checkpoints
- Support default task groups (cf `oellm/task-groups.yaml`)
- Support default task groups (cf `oellm/resources/task-groups.yaml`)

## Planned workflows
- Sync and download evaluation results from all clusters via a shared data layer
@@ -21,7 +21,7 @@ A package for running OELLM CLI workflows across multiple HPC clusters using SLU

```bash
# Install the package
uv tool install --python 3.12 git+https://github.com/OpenEuroLLM/oellm-cli.git
uv tool install git+https://github.com/OpenEuroLLM/oellm-cli.git

# Run evaluations on multiple models and tasks
oellm schedule-eval \
@@ -50,6 +50,10 @@ This will launch an interactive workflow where you can:
- Configure n-shot settings
- Preview and save your evaluation configuration

The resulting CSV includes an additional `eval_suite` column that records which
evaluation framework (e.g., `lm_eval` or `lighteval`) should be used for each
task.
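
To illustrate how a consumer might use the `eval_suite` column, here is a minimal Python sketch that groups CSV rows by suite. Only the `eval_suite` column name and the values `lm_eval` and `lighteval` come from the README text. The other column names, the model names, the task names, and the `group_by_suite` helper are hypothetical, and this is not the package's own scheduling code.

```python
import csv
import io
from collections import defaultdict

# Hypothetical CSV contents. Only `eval_suite` and its two example values
# are taken from the README; every other column and value is an assumption.
SAMPLE_CSV = """model_path,task_path,n_shot,eval_suite
org/model-a,hypothetical_task_1,5,lm_eval
org/model-a,hypothetical_task_2,0,lighteval
org/model-b,hypothetical_task_1,5,lm_eval
"""


def group_by_suite(csv_text: str) -> dict:
    """Group CSV rows (as dicts) by the value of their `eval_suite` column."""
    groups = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        groups[row["eval_suite"]].append(row)
    return dict(groups)


for suite, rows in group_by_suite(SAMPLE_CSV).items():
    print(suite, [row["task_path"] for row in rows])
```

Grouping by suite first would let a scheduler dispatch each batch to the matching evaluation framework in one pass.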

Otherwise you can also directly schedule using a CSV file:
```bash
oellm schedule-eval --eval_csv_path custom_evals.csv
@@ -104,7 +108,7 @@ The `oellm` package orchestrates distributed LLM evaluations through the followi

### 1. **Cluster Auto-Detection**
- Automatically detects the current HPC cluster based on hostname patterns
- Loads cluster-specific configurations from [`clusters.yaml`](oellm/clusters.yaml) including:
- Loads cluster-specific configurations from [`clusters.yaml`](oellm/resources/clusters.yaml) including:
- SLURM partition and account settings
- Shared storage paths for models, datasets, and results
- GPU allocation and queue limits
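
The hostname-based detection described above can be sketched as follows. This is an illustration under stated assumptions, not the `oellm` implementation: the hostname patterns and the `detect_cluster` helper are hypothetical, and the real patterns live in `clusters.yaml`, which is not shown in this diff. Only the cluster names (LUMI, Leonardo, JURECA) appear in the PR.

```python
import fnmatch
import socket
from typing import Optional

# Hypothetical glob patterns per cluster; the real ones are defined in
# clusters.yaml and may look completely different.
HOSTNAME_PATTERNS = {
    "lumi": ["*.lumi.example"],
    "leonardo": ["*.leonardo.example"],
    "jureca": ["*.jureca.example"],
}


def detect_cluster(hostname: str, patterns: dict = HOSTNAME_PATTERNS) -> Optional[str]:
    """Return the first cluster whose pattern matches the hostname, else None."""
    for cluster, globs in patterns.items():
        # fnmatchcase keeps matching case-sensitive on every platform.
        if any(fnmatch.fnmatchcase(hostname, glob) for glob in globs):
            return cluster
    return None


# On a machine that matches none of the patterns this prints None.
print(detect_cluster(socket.gethostname()))
```

Once a cluster name is known, the matching entry in the configuration file would supply the partition, storage paths, and queue limits listed above.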
33 changes: 23 additions & 10 deletions apptainer/jureca.def
@@ -2,24 +2,37 @@ Bootstrap: docker
From: nvcr.io/nvidia/pytorch:25.06-py3

%labels
Author multi-cluster-eval
Description Apptainer image for JURECA cluster (converted from dockerfile)
Author oellm-cli
Description Apptainer image for JURECA JSC cluster

%post
# 1. Install uv package manager
curl -LsSf https://astral.sh/uv/install.sh | sh
echo 'export PATH=$HOME/.local/bin:$PATH' >> /etc/profile
# Install uv into a global bin
curl -LsSf https://astral.sh/uv/install.sh | env UV_INSTALL_DIR=/usr/local/bin sh

# Make uv visible for subsequent commands during build
export PATH=/root/.local/bin:$PATH
# Put uv-installed tool shims in a global bin too
export UV_TOOL_BIN_DIR=/usr/local/bin
uv --version

# 2. Install Python dependencies
uv pip install --system --break-system-packages lm-eval \
"transformers<=4.53.0" "datasets<4.0.0" wandb sentencepiece tiktoken accelerate

# Optional: keep tool envs under /opt to avoid $HOME
export UV_TOOL_DIR=/opt/uv-tools
uv tool install --python 3.12 "lighteval[multilingual] @ git+https://github.com/huggingface/lighteval.git@63424f4e795ecc577b90646381b374af3a627978"
Review comment (PR author): all of this is to make lighteval happy.

uv pip install --system --break-system-packages nltk
mkdir -p /opt/nltk_data
python - <<'PY'
import nltk
nltk.download('punkt', download_dir='/opt/nltk_data')
nltk.download('punkt_tab', download_dir='/opt/nltk_data')
PY

%environment
# Ensure uv is present inside the container runtime as well
export PATH=/root/.local/bin:$PATH
export PATH=/usr/local/bin:$PATH
export UV_TOOL_BIN_DIR=/usr/local/bin
export UV_TOOL_DIR=/opt/uv-tools
export NLTK_DATA=/opt/nltk_data


%runscript
exec bash "$@"
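
The definition file above installs `uv` and the `lighteval` tool into `/usr/local/bin` and points `NLTK_DATA` at `/opt/nltk_data`. A smoke test for a built image could look like the following Python sketch. The `check_runtime` helper is hypothetical and not part of this PR, and the `tokenizers/<package>` layout is an assumption about where `nltk.download` unpacks `punkt` and `punkt_tab`.

```python
import os
import shutil
from pathlib import Path
from typing import Mapping, Optional, Sequence


def check_runtime(
    env: Optional[Mapping[str, str]] = None,
    required_tools: Sequence[str] = ("uv", "lighteval"),
    nltk_packages: Sequence[str] = ("punkt", "punkt_tab"),
) -> list:
    """Return a list of problems found; an empty list means all checks passed."""
    env = os.environ if env is None else env
    problems = []
    for tool in required_tools:
        # Look the tool up on the PATH from `env`, not the process PATH.
        if shutil.which(tool, path=env.get("PATH", "")) is None:
            problems.append(f"{tool} not found on PATH")
    nltk_dir = env.get("NLTK_DATA")
    if not nltk_dir:
        problems.append("NLTK_DATA is not set")
    else:
        for package in nltk_packages:
            # Assumed layout: <NLTK_DATA>/tokenizers/<package>/
            if not (Path(nltk_dir) / "tokenizers" / package).is_dir():
                problems.append(f"NLTK package {package} missing in {nltk_dir}")
    return problems


for problem in check_runtime():
    print(problem)
```

Run inside the container, an empty output would suggest that the `%environment` exports and the `%post` downloads line up; outside the container it will usually list missing items.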
32 changes: 22 additions & 10 deletions apptainer/leonardo.def
@@ -2,24 +2,36 @@ Bootstrap: docker
From: nvcr.io/nvidia/pytorch:25.06-py3

%labels
Author multi-cluster-eval
Description Apptainer image for Leonardo cluster (converted from dockerfile)
Author oellm-cli
Description Apptainer image for Leonardo cluster

%post
# 1. Install uv package manager
curl -LsSf https://astral.sh/uv/install.sh | sh
echo 'export PATH=$HOME/.local/bin:$PATH' >> /etc/profile
# Install uv into a global bin
curl -LsSf https://astral.sh/uv/install.sh | env UV_INSTALL_DIR=/usr/local/bin sh

# Make uv visible for subsequent commands during build
export PATH=/root/.local/bin:$PATH
# Put uv-installed tool shims in a global bin too
export UV_TOOL_BIN_DIR=/usr/local/bin
uv --version

# 2. Install Python dependencies
uv pip install --system --break-system-packages lm-eval \
"transformers<=4.53.0" "datasets<4.0.0" wandb sentencepiece tiktoken accelerate

# Optional: keep tool envs under /opt to avoid $HOME
export UV_TOOL_DIR=/opt/uv-tools
uv tool install --python 3.12 "lighteval[multilingual] @ git+https://github.com/huggingface/lighteval.git@63424f4e795ecc577b90646381b374af3a627978"
uv pip install --system --break-system-packages nltk
mkdir -p /opt/nltk_data
python - <<'PY'
import nltk
nltk.download('punkt', download_dir='/opt/nltk_data')
nltk.download('punkt_tab', download_dir='/opt/nltk_data')
PY

%environment
# Ensure uv is present inside the container runtime as well
export PATH=/root/.local/bin:$PATH
export PATH=/usr/local/bin:$PATH
export UV_TOOL_BIN_DIR=/usr/local/bin
export UV_TOOL_DIR=/opt/uv-tools
export NLTK_DATA=/opt/nltk_data

%runscript
exec bash "$@"
32 changes: 22 additions & 10 deletions apptainer/lumi.def
@@ -2,24 +2,36 @@ Bootstrap: docker
From: rocm/pytorch:rocm6.4.1_ubuntu24.04_py3.12_pytorch_release_2.7.1

%labels
Author multi-cluster-eval
Description Apptainer image for LUMI cluster (converted from dockerfile)
Author oellm-cli
Description Apptainer image for LUMI cluster

%post
# 1. Install uv package manager
curl -LsSf https://astral.sh/uv/install.sh | sh
echo 'export PATH=$HOME/.local/bin:$PATH' >> /etc/profile
# Install uv into a global bin
curl -LsSf https://astral.sh/uv/install.sh | env UV_INSTALL_DIR=/usr/local/bin sh

# Make uv visible for subsequent commands during build
export PATH=/root/.local/bin:$PATH
# Put uv-installed tool shims in a global bin too
export UV_TOOL_BIN_DIR=/usr/local/bin
uv --version

# 2. Install Python dependencies
uv pip install --system --break-system-packages lm-eval \
"transformers<=4.53.0" "datasets<4.0.0" wandb sentencepiece tiktoken accelerate

# Optional: keep tool envs under /opt to avoid $HOME
export UV_TOOL_DIR=/opt/uv-tools
uv tool install --python 3.12 "lighteval[multilingual] @ git+https://github.com/huggingface/lighteval.git@63424f4e795ecc577b90646381b374af3a627978"
uv pip install --system --break-system-packages nltk
mkdir -p /opt/nltk_data
python - <<'PY'
import nltk
nltk.download('punkt', download_dir='/opt/nltk_data')
nltk.download('punkt_tab', download_dir='/opt/nltk_data')
PY

%environment
# Ensure uv is present inside the container runtime as well
export PATH=/root/.local/bin:$PATH
export PATH=/usr/local/bin:$PATH
export UV_TOOL_BIN_DIR=/usr/local/bin
export UV_TOOL_DIR=/opt/uv-tools
export NLTK_DATA=/opt/nltk_data

%runscript
exec bash "$@"