Lkcolocated test #1350
This pull request has been automatically marked as stale because it has been inactive for 60 days. It will be closed in 7 days if no further activity occurs. If you would like to continue working on this, please remove the

This pull request was closed because it has been inactive for more than 7 days since being marked as stale. Please feel free to reopen it if you would like to continue.
`#### Different names for these images because they are in the same repository ####
export NAME=axlearn-img
export COLOCATED_NAME=colocated-img
# Placeholder: set this to the checkpoint bucket before running.
export CKPT_BUCKET_NAME=<>

# Bundle the main image and the colocated image via Artifact Registry.
axlearn gcp bundle --name=$NAME \
  --bundler_spec=allow_dirty=True \
  --bundler_type=artifactregistry \
  --bundler_spec=dockerfile=Dockerfile \
  --bundler_spec=image=tpu \
  --bundler_spec=target=tpu \
  --bundler_spec=colocated_image_required=True \
  --bundler_spec=colocated_image_name=$COLOCATED_NAME

# Launch the benchmark with the GKE TPU Pathways runner.
axlearn gcp launch run --cluster=mlperf-v5p \
  --runner_name gke_tpu_pathways \
  --name=$NAME \
  --instance_type=tpu-v5p-32 \
  --num_replicas=1 \
  --bundler_spec=allow_dirty=True \
  --bundler_type=artifactregistry \
  --bundler_spec=image=tpu \
  --bundler_spec=dockerfile=Dockerfile \
  --bundler_spec=target=tpu \
  --colocated_image=$COLOCATED_NAME \
  -- TPU_PREMAPPED_BUFFER_SIZE=34359738368 python3 test_benchmark.py --ckpt_path $CKPT_BUCKET_NAME
`
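
A note on the `TPU_PREMAPPED_BUFFER_SIZE` value used above: 34359738368 bytes is exactly 32 GiB. The sketch below only shows that arithmetic, in case the size needs to be adjusted; the choice of 32 GiB comes from the command above, and nothing here is specific to AXLearn.

```shell
# 32 GiB expressed in bytes; this matches the value passed on the launch command line.
TPU_PREMAPPED_BUFFER_SIZE=$((32 * 1024 * 1024 * 1024))
echo "$TPU_PREMAPPED_BUFFER_SIZE"   # prints 34359738368
```

Changing the leading `32` yields the byte count for a different size in GiB.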