-
Notifications
You must be signed in to change notification settings - Fork 23
Added Dockerfile for CI images #195
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: dev
Are you sure you want to change the base?
Conversation
wenchenvincent
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please address the comments.
docker/Dockerfile
Outdated
| RUN apt update \ | ||
| && apt install -y nano wget ninja-build \ | ||
| && apt install -y python3 python3-pip git \ | ||
| && apt install -y sqlite3 libsqlite3-dev libfmt-dev libmsgpack-dev libsuitesparse-dev |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What are those packages (sqlite3 and further) for?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I recall aotriton need sqlite3 long time ago, not sure about now
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have derived this dockerfile from this https://github.com/ROCm/DeepLearningModels/blob/main/docker/pyt_semianalysis_models.ubuntu.amd.Dockerfile,
Will remove these, if these are not required.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@VeeraRajasekhar Those are needed for the semianalysis models. We don't need them for the CI images. We only need to keep those packages necessary for building TE and running CI tests.
docker/Dockerfile
Outdated
|
|
||
| RUN python3 -m pip install --upgrade pip | ||
| RUN pip install ninja cmake setuptools wheel | ||
| RUN pip install uv tabulate |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What are those for?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is derived from https://github.com/ROCm/DeepLearningModels/blob/main/docker/pyt_semianalysis_models.ubuntu.amd.Dockerfile, I have removed these, will test the resultant docker after my changes
docker/Dockerfile
Outdated
| RUN apt update \ | ||
| && apt install -y nano wget ninja-build \ | ||
| && apt install -y python3 python3-pip git \ | ||
| && apt install -y sqlite3 libsqlite3-dev libfmt-dev libmsgpack-dev libsuitesparse-dev |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I recall aotriton need sqlite3 long time ago, not sure about now
ipanfilo
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why conversations are marked as resolved w/o any actual action?
Some of them, I have resolved, some I have currently resolved in my local, just to keep track I will mark them resolved. |
|
@VeeraRajasekhar Is this PR still needed? |
|
@VeeraRajasekhar Could you remind me of what we had decided on this PR? It seemed that it is no longer relevant and we should close it. |
c4913b2 to
e49b365
Compare
|
Hi @ipanfilo, @wangye805 I have updated this PR with latest 7.2 docker file and moved to .github/scripts. Let me know if I need to add an action to automate docker build and upload to our artifactory? Thanks. |
.github/scripts/Dockerfile.ci.deps
Outdated
| RUN pip install ipython pytest fire pydantic pybind11 ninja pandas | ||
| RUN apt-get update && apt-get install -y vim | ||
|
|
||
| ARG PYTORCH_ROCM_ARCH=gfx908;gfx90a;gfx942;gfx1030;gfx1100;gfx1101;gfx1200;gfx1201;gfx950;gfx1151 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this arg needed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not needed, when I manually installed torch I mentioned this, I can remove this now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Addressed
.github/scripts/Dockerfile.ci.deps
Outdated
| @@ -0,0 +1,27 @@ | |||
| # TE CI Dockerfile | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Put copyright here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Addressed
.github/scripts/Dockerfile.ci.deps
Outdated
|
|
||
| ARG PYTORCH_ROCM_ARCH=gfx908;gfx90a;gfx942;gfx1030;gfx1100;gfx1101;gfx1200;gfx1201;gfx950;gfx1151 | ||
|
|
||
| # Install flash-attention v2.8.1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Better make FA version arg
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated
.github/scripts/Dockerfile.ci.deps
Outdated
|
|
||
|
|
||
| RUN pip install \ | ||
| https://repo.radeon.com/rocm/manylinux/rocm-rel-7.2/jax_rocm7_pjrt-0.8.0%2Brocm7.2.0-py3-none-manylinux_2_28_x86_64.whl \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since base image is ARG, JAX wheels should also be built based on ARG cause their ROCm and Python versions should match. So in general there should be ARGs for ROCm version, base image, JAX version, JAX wheels
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated
e49b365 to
bdc75a2
Compare
|
I had to force push to include new FA 2.8.3 support commit and my changes for 7.2 support to run the CI. Thanks. |
Description
Added the dockerfile, which can be used to create the ci-artifactory images.
Fixes # (issue)
Type of change
Changes
Please list the changes introduced in this PR:
Checklist: