Generalize presto-nvl72 slurm scripts for any cluster #352

Open: misiugodfrey wants to merge 2 commits into main from misiug/GeneralizeClusterScripts
@misiugodfrey
Contributor

Summary

  • Make presto/slurm/presto-nvl72/ cluster-agnostic: all per-cluster values (partition, account, cpus-per-task, time limits, images, image paths, data root, etc.) now come from ~/.cluster_config.env. See cluster_config.env.example.
  • Introduce two shared libraries: launcher_common.sh (cluster-variant resolution, sbatch arg assembly, preflight checks, job monitoring) and slurm_common.sh (shared .slurm preamble). Eliminates ~80% of the per-launcher duplication.
  • Pre-flight checks in every launcher: missing image / data dir / analyzed metastore produce actionable errors pointing at the exact command to run next, before any sbatch submission.
  • Job state surfaced via sacct: launcher prints Job FAILED (state: …, exit: …) and always displays stderr, so silent failures don't masquerade as success.
  • launch-analyze-tables.sh is now always CPU (ANALYZE TABLE disables cudf regardless). New CLUSTER_DEFAULT_VARIANT setting lets CPU-only clusters drop the --cpu flag.
  • gen-data uses the same run_py_script.sh + miniforge3 pattern as analyze/benchmark instead of inventing its own pip flow.
  • Several CPU-variant fixes in functions.sh:run_worker (gate NVIDIA_VISIBLE_DEVICES export and GDS bind-mounts on VARIANT_TYPE=gpu so CPU nodes don't hit enroot hook failures).
  • README rewritten as a three-step workflow walkthrough; all 6 entry points now documented.
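
As a rough illustration of the config-driven preflight flow described above, a launcher could source the per-cluster file and check required paths before calling sbatch. This is only a sketch: the function names `load_cluster_config` and `require_path` are hypothetical and are not taken from launcher_common.sh, whose actual helpers may look different.

```shell
# Hypothetical sketch; names are illustrative, not the real launcher_common.sh API.

# Source per-cluster settings (default: ~/.cluster_config.env).
load_cluster_config() {
  local cfg="${1:-$HOME/.cluster_config.env}"
  if [ ! -f "$cfg" ]; then
    echo "ERROR: $cfg not found. Copy cluster_config.env.example there and edit it." >&2
    return 1
  fi
  . "$cfg"
}

# Fail with an actionable hint when a required path is missing.
# Args: <description> <path> <command to run next>
require_path() {
  local what="$1" path="$2" hint="$3"
  if [ ! -e "$path" ]; then
    echo "ERROR: missing $what: $path" >&2
    echo "  Run: $hint" >&2
    return 1
  fi
}

# Possible use in a launcher, before any sbatch submission:
#   load_cluster_config || exit 1
#   require_path "data dir" "${DATA}/tpch-rs-1" "./launch-gen-data.sh -s 1" || exit 1
```

Returning non-zero instead of exiting inside the helpers leaves the decision to abort with the calling launcher.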

Test plan

  • ./launch-gen-data.sh -s 1 on CPU cluster → 240 MB of parquet under ${DATA}/tpch-rs-1
  • ./launch-analyze-tables.sh -s 1 on CPU cluster → tpchsf1/ populated in .hive_metastore/
  • ./launch-run.sh -n 1 -s 1 --cpu on CPU cluster → all 22 TPC-H queries pass
  • ./launch-run.sh -n 1 -s 1 on GPU cluster (job currently pending in batch queue)
  • End-to-end on NVL72 (GPU/CPU): download image → gen data → analyze data → run_benchmark for TPC-H SF1k
  • Preflight failure modes: missing image / missing data / missing metastore each print correct actionable hint
  • Failure reporting: jobs that exit non-zero are surfaced as FAILED with stderr displayed
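
To make the failure-reporting check concrete, here is a minimal sketch of how a `State|ExitCode` line from sacct could be turned into the `Job FAILED (state: …, exit: …)` message. The helper name `report_job_state` is hypothetical, and the sketch assumes it is called only after the job has left the queue (any state other than COMPLETED with exit 0:0 is reported as a failure).

```shell
# Hypothetical helper: report pass/fail from one "State|ExitCode" line,
# as printed by `sacct -X -n -P -o State,ExitCode -j <jobid>`.
report_job_state() {
  local line="$1" state exit_code
  state="${line%%|*}"     # text before the first '|'
  exit_code="${line#*|}"  # text after the first '|', e.g. "1:0"
  if [ "$state" = "COMPLETED" ] && [ "$exit_code" = "0:0" ]; then
    echo "Job COMPLETED"
    return 0
  fi
  echo "Job FAILED (state: ${state}, exit: ${exit_code})"
  return 1
}

# Example (needs Slurm accounting; job_id is assumed to be set by the launcher):
#   report_job_state "$(sacct -X -n -P -o State,ExitCode -j "$job_id" | head -n 1)"
```

Keeping the parsing separate from the sacct call means the reporting logic can be exercised without a Slurm installation.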

@misiugodfrey misiugodfrey requested a review from a team as a code owner May 22, 2026 03:52
copy-pr-bot Bot commented May 22, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

Comment on lines -114 to -122
# Use POSIX I/O instead of GDS
./launch-run.sh -n 8 -s 3000 \
-w presto-native-worker-gpu-v1 -c presto-coordinator-v1 \
--disable-gds

# Use nsys to profile query 5 and 6 for worker 2
./launch-run.sh -n 8 -s 3000 \
-w presto-native-worker-gpu-v1 -c presto-coordinator-v1 \
-p --nsys-worker-id 2 -q 5,6
Contributor

I would suggest keeping these two simple examples in Step 3 — Run benchmarks, minus the -w -c arguments.

Contributor

For the "test plan", I'd suggest checking that --disable-gds, -p, --nsys-worker-id, and -q <query_list> still work on the GPU cluster.
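
Something like the following could serve as the check. It is a sketch that reuses the two removed README examples without the -w and -c arguments; the `LAUNCH` variable is a hypothetical dry-run convention, not an existing option of the scripts.

```shell
# Dry-run by default: each command is printed instead of submitted.
# Set LAUNCH=./launch-run.sh to submit for real (hypothetical convention).
LAUNCH="${LAUNCH:-echo ./launch-run.sh}"

# Use POSIX I/O instead of GDS
$LAUNCH -n 8 -s 3000 --disable-gds

# Use nsys to profile queries 5 and 6 for worker 2
$LAUNCH -n 8 -s 3000 -p --nsys-worker-id 2 -q 5,6
```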
