Description
Describe the bug
I have been trying to build rocm/Megatron-LM from source and ran into several issues. The critical one is that the TransformerEngine build fails on 3rdparty/aiter's fused_attn for MI250 (gfx90a). transformer_engine/common/ck_fused_attn/CMakeLists.txt states that only gfx942 and gfx950 are supported.
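The mismatch can be expressed as a small pre-flight check. This is only a sketch: the supported list is copied from the description above rather than parsed from the CMake file, and `arch_supported` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Assumption: list copied from this report's description of
# transformer_engine/common/ck_fused_attn/CMakeLists.txt, not parsed from it.
SUPPORTED_ARCHS="gfx942 gfx950"

# Hypothetical helper: succeed if $1 is one of the listed archs.
arch_supported() {
  local target="$1" arch
  for arch in $SUPPORTED_ARCHS; do
    if [ "$arch" = "$target" ]; then
      return 0
    fi
  done
  return 1
}

# Falls back to gfx90a (the arch used in this report) when NVTE_ROCM_ARCH is unset.
target="${NVTE_ROCM_ARCH:-gfx90a}"
if arch_supported "$target"; then
  echo "$target: listed as supported"
else
  echo "$target: not listed as supported"
fi
```

With `NVTE_ROCM_ARCH=gfx90a`, as in the environment below, the check takes the "not listed" branch.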
Steps/Code to reproduce bug
git clone --recursive https://github.com/ROCm/TransformerEngine && cd TransformerEngine && pip install --no-build-isolation . && cd ..
Error messages:
/lustre/orion/gen150/scratch/zixianw4/TransformerEngine/transformer_engine/common/include/transformer_engine/recipe.h -> /lustre/orion/gen150/scratch/zixianw4/TransformerEngine/transformer_engine/common/include/transformer_engine/recipe_hip.h [skipped, already hipified]
/lustre/orion/gen150/scratch/zixianw4/TransformerEngine/transformer_engine/common/include/transformer_engine/permutation.h -> /lustre/orion/gen150/scratch/zixianw4/TransformerEngine/transformer_engine/common/include/transformer_engine/permutation_hip.h [skipped, already hipified]
/lustre/orion/gen150/scratch/zixianw4/TransformerEngine/transformer_engine/common/include/transformer_engine/transpose.h -> /lustre/orion/gen150/scratch/zixianw4/TransformerEngine/transformer_engine/common/include/transformer_engine/transpose_hip.h [skipped, already hipified]
/lustre/orion/gen150/scratch/zixianw4/TransformerEngine/transformer_engine/common/include/transformer_engine/padding.h -> /lustre/orion/gen150/scratch/zixianw4/TransformerEngine/transformer_engine/common/include/transformer_engine/padding_hip.h [skipped, already hipified]
/lustre/orion/gen150/scratch/zixianw4/TransformerEngine/transformer_engine/common/include/transformer_engine/comm_gemm_overlap.h -> /lustre/orion/gen150/scratch/zixianw4/TransformerEngine/transformer_engine/common/include/transformer_engine/comm_gemm_overlap_hip.h [skipped, already hipified]
/lustre/orion/gen150/scratch/zixianw4/TransformerEngine/transformer_engine/common/include/transformer_engine/softmax.h -> /lustre/orion/gen150/scratch/zixianw4/TransformerEngine/transformer_engine/common/include/transformer_engine/softmax_hip.h [skipped, already hipified]
Successfully preprocessed all matching files.
Total number of unsupported CUDA function calls: 0
Total number of replaced kernel launches: 339
-- Writing tmp_BPmmK9FVZXORyj9F into file - file_tmp_BPmmK9FVZXORyj9F.txt
-------------------------------------------------------------
-- nvte hipified sources: /lustre/orion/gen150/scratch/zixianw4/TransformerEngine/transformer_engine/common/transformer_engine_hip.cpp;/lustre/orion/gen150/scratch/zixianw4/TransformerEngine/transformer_engine/common/common.hip;/lustre/orion/gen150/scratch/zixianw4/TransformerEngine/transformer_engine/common/multi_tensor/adam.hip;/lustre/orion/gen150/scratch/zixianw4/TransformerEngine/transformer_engine/common/multi_tensor/compute_scale.hip;/lustre/orion/gen150/scratch/zixianw4/TransformerEngine/transformer_engine/common/multi_tensor/l2norm.hip;/lustre/orion/gen150/scratch/zixianw4/TransformerEngine/transformer_engine/common/multi_tensor/scale.hip;/lustre/orion/gen150/scratch/zixianw4/TransformerEngine/transformer_engine/common/multi_tensor/sgd.hip;/lustre/orion/gen150/scratch/zixianw4/TransformerEngine/transformer_engine/common/transpose/cast_transpose.hip;/lustre/orion/gen150/scratch/zixianw4/TransformerEngine/transformer_engine/common/transpose/transpose.hip;/lustre/orion/gen150/scratch/zixianw4/TransformerEngine/transformer_engine/common/transpose/cast_transpose_fusion.hip;/lustre/orion/gen150/scratch/zixianw4/TransformerEngine/transformer_engine/common/transpose/transpose_fusion.hip;/lustre/orion/gen150/scratch/zixianw4/TransformerEngine/transformer_engine/common/transpose/multi_cast_transpose.hip;/lustre/orion/gen150/scratch/zixianw4/TransformerEngine/transformer_engine/common/transpose/swap_first_dims.hip;/lustre/orion/gen150/scratch/zixianw4/TransformerEngine/transformer_engine/common/activation/gelu.hip;/lustre/orion/gen150/scratch/zixianw4/TransformerEngine/transformer_engine/common/dropout/dropout.hip;/lustre/orion/gen150/scratch/zixianw4/TransformerEngine/transformer_engine/common/fused_attn/flash_attn.hip;/lustre/orion/gen150/scratch/zixianw4/TransformerEngine/transformer_engine/common/fused_attn/context_parallel.hip;/lustre/orion/gen150/scratch/zixianw4/TransformerEngine/transformer_engine/common/fused_attn/kv_cache.hip;/lustre/orion/gen150/scratch/zix
ianw4/TransformerEngine/transformer_engine/common/activation/relu.hip;/lustre/orion/gen150/scratch/zixianw4/TransformerEngine/transformer_engine/common/activation/swiglu.hip;/lustre/orion/gen150/scratch/zixianw4/TransformerEngine/transformer_engine/common/gemm/cublaslt_gemm.hip;/lustre/orion/gen150/scratch/zixianw4/TransformerEngine/transformer_engine/common/normalization/common_hip.cpp;/lustre/orion/gen150/scratch/zixianw4/TransformerEngine/transformer_engine/common/normalization/layernorm/ln_api_hip.cpp;/lustre/orion/gen150/scratch/zixianw4/TransformerEngine/transformer_engine/common/normalization/layernorm/ln_bwd_semi_hip_kernel.hip;/lustre/orion/gen150/scratch/zixianw4/TransformerEngine/transformer_engine/common/normalization/layernorm/ln_fwd_hip_kernel.hip;/lustre/orion/gen150/scratch/zixianw4/TransformerEngine/transformer_engine/common/normalization/rmsnorm/rmsnorm_api_hip.cpp;/lustre/orion/gen150/scratch/zixianw4/TransformerEngine/transformer_engine/common/normalization/rmsnorm/rmsnorm_bwd_semi_hip_kernel.hip;/lustre/orion/gen150/scratch/zixianw4/TransformerEngine/transformer_engine/common/normalization/rmsnorm/rmsnorm_fwd_hip_kernel.hip;/lustre/orion/gen150/scratch/zixianw4/TransformerEngine/transformer_engine/common/permutation/permutation.hip;/lustre/orion/gen150/scratch/zixianw4/TransformerEngine/transformer_engine/common/util/cast.hip;/lustre/orion/gen150/scratch/zixianw4/TransformerEngine/transformer_engine/common/util/padding.hip;/lustre/orion/gen150/scratch/zixianw4/TransformerEngine/transformer_engine/common/util/hip_driver.cpp;/lustre/orion/gen150/scratch/zixianw4/TransformerEngine/transformer_engine/common/util/hip_runtime.cpp;/lustre/orion/gen150/scratch/zixianw4/TransformerEngine/transformer_engine/common/util/multi_stream_hip.cpp;/lustre/orion/gen150/scratch/zixianw4/TransformerEngine/transformer_engine/common/util/rtc_hip.cpp;/lustre/orion/gen150/scratch/zixianw4/TransformerEngine/transformer_engine/common/swizzle/swizzle.hip;/lustre/orion/gen1
50/scratch/zixianw4/TransformerEngine/transformer_engine/common/fused_softmax/scaled_masked_softmax.hip;/lustre/orion/gen150/scratch/zixianw4/TransformerEngine/transformer_engine/common/fused_softmax/scaled_upper_triang_masked_softmax.hip;/lustre/orion/gen150/scratch/zixianw4/TransformerEngine/transformer_engine/common/fused_softmax/scaled_aligned_causal_masked_softmax.hip;/lustre/orion/gen150/scratch/zixianw4/TransformerEngine/transformer_engine/common/fused_rope/fused_rope.hip;/lustre/orion/gen150/scratch/zixianw4/TransformerEngine/transformer_engine/common/fused_router/fused_moe_aux_loss.hip;/lustre/orion/gen150/scratch/zixianw4/TransformerEngine/transformer_engine/common/fused_router/fused_score_for_moe_aux_loss.hip;/lustre/orion/gen150/scratch/zixianw4/TransformerEngine/transformer_engine/common/fused_router/fused_topk_with_score_function.hip;/lustre/orion/gen150/scratch/zixianw4/TransformerEngine/transformer_engine/common/recipe/current_scaling.hip;/lustre/orion/gen150/scratch/zixianw4/TransformerEngine/transformer_engine/common/recipe/delayed_scaling.hip;/lustre/orion/gen150/scratch/zixianw4/TransformerEngine/transformer_engine/common/recipe/fp8_block_scaling.hip;/lustre/orion/gen150/scratch/zixianw4/TransformerEngine/transformer_engine/common/fused_attn_rocm/fused_attn_hip.cpp;/lustre/orion/gen150/scratch/zixianw4/TransformerEngine/transformer_engine/common/fused_attn_rocm/fused_attn_aotriton_hip.cpp;/lustre/orion/gen150/scratch/zixianw4/TransformerEngine/transformer_engine/common/fused_attn_rocm/fused_attn_ck_hip.cpp;/lustre/orion/gen150/scratch/zixianw4/TransformerEngine/transformer_engine/common/fused_attn_rocm/utils_hip.cpp;/lustre/orion/gen150/scratch/zixianw4/TransformerEngine/transformer_engine/common/gemm/rocm_gemm.hip;/lustre/orion/gen150/scratch/zixianw4/TransformerEngine/transformer_engine/common/amd_detail/system.cpp
-- Building AOTriton from source.
-- No-image mode: ON.
-- Adding AOTriton library.
-- Downloading AOTriton GPU Kernels.
CMake Warning at ck_fused_attn/CMakeLists.txt:26 (message):
gfx90a not supported with aiter v3 asm kernels
-- AITER V3_ASM_ARCHS:
-- [AITER-PREBUILT] Building aiter from source.
[AITER-PREBUILT] --aiter-dir, --aiter-test-dir, and --gpu-archs are required.
-- [AITER-PREBUILT] Caching locally built libs to /lustre/orion/gen150/scratch/zixianw4/TransformerEngine/transformer_engine/common/ck_fused_attn/../../../build/aiter-prebuilts/rocm-6.4_aiter-a64fa18e60235994e4cbfd7059cc2f60d06e743f
CMake Error at ck_fused_attn/aiter_prebuilt.cmake:47 (file):
file COPY cannot find
"/lustre/orion/gen150/scratch/zixianw4/TransformerEngine/transformer_engine/common/ck_fused_attn/../../../3rdparty/aiter/op_tests/cpp/mha/libmha_fwd.so":
No such file or directory.
Call Stack (most recent call first):
ck_fused_attn/CMakeLists.txt:61 (cache_local_aiter_build)
-- [AITER-PREBUILT] Using __AITER_MHA_PATH=/lustre/orion/gen150/scratch/zixianw4/TransformerEngine/transformer_engine/common/ck_fused_attn/../../../build/aiter-prebuilts/rocm-6.4_aiter-a64fa18e60235994e4cbfd7059cc2f60d06e743f
-- Found the following fused attention files:
-- src/ck_fused_attn_fwd.cpp
-- src/ck_fused_attn_bwd.cpp
-- src/ck_fused_attn_utils.cpp
-- ck_include_dir: /lustre/orion/gen150/scratch/zixianw4/TransformerEngine/transformer_engine/common/ck_fused_attn/../../../3rdparty/aiter/3rdparty/composable_kernel/include
-- aiter_include_dir: /lustre/orion/gen150/scratch/zixianw4/TransformerEngine/transformer_engine/common/ck_fused_attn/../../../3rdparty/aiter/csrc/include
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Success
-- Found Threads: TRUE
CMake Deprecation Warning at /opt/rocm-6.4.2/lib/cmake/hiprtc/hiprtc-config.cmake:21 (cmake_minimum_required):
Compatibility with CMake < 3.5 will be removed from a future version of
CMake.
Update the VERSION argument <min> value or use a ...<max> suffix to tell
CMake that the project does not need compatibility with older versions.
Call Stack (most recent call first):
CMakeLists.txt:347 (find_package)
-- Configuring incomplete, errors occurred!
Building CMake extension transformer_engine
Running command /usr/bin/cmake -S /lustre/orion/gen150/scratch/zixianw4/TransformerEngine/transformer_engine/common -B /lustre/orion/gen150/scratch/zixianw4/TransformerEngine/build/cmake -DPython_EXECUTABLE=/lustre/orion/world-shared/gen150/zixianw4/envs/megatron-build-rocm6.4.2-torch2.9/bin/python3.12 -DPython_INCLUDE_DIR=/lustre/orion/world-shared/gen150/zixianw4/envs/megatron-build-rocm6.4.2-torch2.9/include/python3.12 -DCMAKE_BUILD_TYPE=Release -DCMAKE_INSTALL_PREFIX=/lustre/orion/gen150/scratch/zixianw4/TransformerEngine/build/lib.linux-x86_64-cpython-312 -DUSE_ROCM=ON -DCK_FUSED_ATTN_FLOAT_TO_BFLOAT16_DEFAULT=3 -Dpybind11_DIR=/lustre/orion/gen150/world-shared/zixianw4/envs/megatron-build-rocm6.4.2-torch2.9/lib/python3.12/site-packages/pybind11/share/cmake/pybind11 -GNinja
Traceback (most recent call last):
File "/lustre/orion/gen150/scratch/zixianw4/TransformerEngine/build_tools/build_ext.py", line 94, in _build_cmake
subprocess.run(command, cwd=build_dir, check=True)
File "/lustre/orion/world-shared/gen150/zixianw4/envs/megatron-build-rocm6.4.2-torch2.9/lib/python3.12/subprocess.py", line 571, in run
raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['/usr/bin/cmake', '-S', '/lustre/orion/gen150/scratch/zixianw4/TransformerEngine/transformer_engine/common', '-B', '/lustre/orion/gen150/scratch/zixianw4/TransformerEngine/build/cmake', '-DPython_EXECUTABLE=/lustre/orion/world-shared/gen150/zixianw4/envs/megatron-build-rocm6.4.2-torch2.9/bin/python3.12', '-DPython_INCLUDE_DIR=/lustre/orion/world-shared/gen150/zixianw4/envs/megatron-build-rocm6.4.2-torch2.9/include/python3.12', '-DCMAKE_BUILD_TYPE=Release', '-DCMAKE_INSTALL_PREFIX=/lustre/orion/gen150/scratch/zixianw4/TransformerEngine/build/lib.linux-x86_64-cpython-312', '-DUSE_ROCM=ON', '-DCK_FUSED_ATTN_FLOAT_TO_BFLOAT16_DEFAULT=3', '-Dpybind11_DIR=/lustre/orion/gen150/world-shared/zixianw4/envs/megatron-build-rocm6.4.2-torch2.9/lib/python3.12/site-packages/pybind11/share/cmake/pybind11', '-GNinja']' returned non-zero exit status 1.
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/lustre/orion/world-shared/gen150/zixianw4/envs/megatron-build-rocm6.4.2-torch2.9/lib/python3.12/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 389, in <module>
main()
File "/lustre/orion/world-shared/gen150/zixianw4/envs/megatron-build-rocm6.4.2-torch2.9/lib/python3.12/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 373, in main
json_out["return_val"] = hook(**hook_input["kwargs"])
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/lustre/orion/world-shared/gen150/zixianw4/envs/megatron-build-rocm6.4.2-torch2.9/lib/python3.12/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 280, in build_wheel
return _build_backend().build_wheel(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/lustre/orion/world-shared/gen150/zixianw4/envs/megatron-build-rocm6.4.2-torch2.9/lib/python3.12/site-packages/setuptools/build_meta.py", line 410, in build_wheel
return self._build_with_temp_dir(
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/lustre/orion/world-shared/gen150/zixianw4/envs/megatron-build-rocm6.4.2-torch2.9/lib/python3.12/site-packages/setuptools/build_meta.py", line 395, in _build_with_temp_dir
self.run_setup()
File "/lustre/orion/world-shared/gen150/zixianw4/envs/megatron-build-rocm6.4.2-torch2.9/lib/python3.12/site-packages/setuptools/build_meta.py", line 487, in run_setup
super().run_setup(setup_script=setup_script)
File "/lustre/orion/world-shared/gen150/zixianw4/envs/megatron-build-rocm6.4.2-torch2.9/lib/python3.12/site-packages/setuptools/build_meta.py", line 311, in run_setup
exec(code, locals())
File "<string>", line 226, in <module>
File "/lustre/orion/world-shared/gen150/zixianw4/envs/megatron-build-rocm6.4.2-torch2.9/lib/python3.12/site-packages/setuptools/__init__.py", line 104, in setup
return distutils.core.setup(**attrs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/lustre/orion/world-shared/gen150/zixianw4/envs/megatron-build-rocm6.4.2-torch2.9/lib/python3.12/site-packages/setuptools/_distutils/core.py", line 184, in setup
return run_commands(dist)
^^^^^^^^^^^^^^^^^^
File "/lustre/orion/world-shared/gen150/zixianw4/envs/megatron-build-rocm6.4.2-torch2.9/lib/python3.12/site-packages/setuptools/_distutils/core.py", line 200, in run_commands
dist.run_commands()
File "/lustre/orion/world-shared/gen150/zixianw4/envs/megatron-build-rocm6.4.2-torch2.9/lib/python3.12/site-packages/setuptools/_distutils/dist.py", line 969, in run_commands
self.run_command(cmd)
File "/lustre/orion/world-shared/gen150/zixianw4/envs/megatron-build-rocm6.4.2-torch2.9/lib/python3.12/site-packages/setuptools/dist.py", line 967, in run_command
super().run_command(command)
File "/lustre/orion/world-shared/gen150/zixianw4/envs/megatron-build-rocm6.4.2-torch2.9/lib/python3.12/site-packages/setuptools/_distutils/dist.py", line 988, in run_command
cmd_obj.run()
File "<string>", line 64, in run
File "/lustre/orion/world-shared/gen150/zixianw4/envs/megatron-build-rocm6.4.2-torch2.9/lib/python3.12/site-packages/wheel/_bdist_wheel.py", line 390, in run
self.run_command("build")
File "/lustre/orion/world-shared/gen150/zixianw4/envs/megatron-build-rocm6.4.2-torch2.9/lib/python3.12/site-packages/setuptools/_distutils/cmd.py", line 316, in run_command
self.distribution.run_command(command)
File "/lustre/orion/world-shared/gen150/zixianw4/envs/megatron-build-rocm6.4.2-torch2.9/lib/python3.12/site-packages/setuptools/dist.py", line 967, in run_command
super().run_command(command)
File "/lustre/orion/world-shared/gen150/zixianw4/envs/megatron-build-rocm6.4.2-torch2.9/lib/python3.12/site-packages/setuptools/_distutils/dist.py", line 988, in run_command
cmd_obj.run()
File "/lustre/orion/world-shared/gen150/zixianw4/envs/megatron-build-rocm6.4.2-torch2.9/lib/python3.12/site-packages/setuptools/_distutils/command/build.py", line 132, in run
self.run_command(cmd_name)
File "/lustre/orion/world-shared/gen150/zixianw4/envs/megatron-build-rocm6.4.2-torch2.9/lib/python3.12/site-packages/setuptools/_distutils/cmd.py", line 316, in run_command
self.distribution.run_command(command)
File "/lustre/orion/world-shared/gen150/zixianw4/envs/megatron-build-rocm6.4.2-torch2.9/lib/python3.12/site-packages/setuptools/dist.py", line 967, in run_command
super().run_command(command)
File "/lustre/orion/world-shared/gen150/zixianw4/envs/megatron-build-rocm6.4.2-torch2.9/lib/python3.12/site-packages/setuptools/_distutils/dist.py", line 988, in run_command
cmd_obj.run()
File "/lustre/orion/gen150/scratch/zixianw4/TransformerEngine/build_tools/build_ext.py", line 126, in run
ext._build_cmake(
File "/lustre/orion/gen150/scratch/zixianw4/TransformerEngine/build_tools/build_ext.py", line 96, in _build_cmake
raise RuntimeError(f"Error when running CMake: {e}")
RuntimeError: Error when running CMake: Command '['/usr/bin/cmake', '-S', '/lustre/orion/gen150/scratch/zixianw4/TransformerEngine/transformer_engine/common', '-B', '/lustre/orion/gen150/scratch/zixianw4/TransformerEngine/build/cmake', '-DPython_EXECUTABLE=/lustre/orion/world-shared/gen150/zixianw4/envs/megatron-build-rocm6.4.2-torch2.9/bin/python3.12', '-DPython_INCLUDE_DIR=/lustre/orion/world-shared/gen150/zixianw4/envs/megatron-build-rocm6.4.2-torch2.9/include/python3.12', '-DCMAKE_BUILD_TYPE=Release', '-DCMAKE_INSTALL_PREFIX=/lustre/orion/gen150/scratch/zixianw4/TransformerEngine/build/lib.linux-x86_64-cpython-312', '-DUSE_ROCM=ON', '-DCK_FUSED_ATTN_FLOAT_TO_BFLOAT16_DEFAULT=3', '-Dpybind11_DIR=/lustre/orion/gen150/world-shared/zixianw4/envs/megatron-build-rocm6.4.2-torch2.9/lib/python3.12/site-packages/pybind11/share/cmake/pybind11', '-GNinja']' returned non-zero exit status 1.
error: subprocess-exited-with-error
× Building wheel for transformer_engine (pyproject.toml) did not run successfully.
│ exit code: 1
╰─> No available output.
note: This error originates from a subprocess, and is likely not a problem with pip.
full command: /lustre/orion/world-shared/gen150/zixianw4/envs/megatron-build-rocm6.4.2-torch2.9/bin/python3.12 /lustre/orion/world-shared/gen150/zixianw4/envs/megatron-build-rocm6.4.2-torch2.9/lib/python3.12/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py build_wheel /lustre/orion/gen150/scratch/zixianw4/tmp/tmpvvp6bo57
cwd: /lustre/orion/gen150/scratch/zixianw4/TransformerEngine
Building wheel for transformer_engine (pyproject.toml): finished with status 'error'
ERROR: Failed building wheel for transformer_engine
Failed to build transformer_engine
error: failed-wheel-build-for-install
× Failed to build installable wheels for some pyproject.toml based projects
╰─> transformer_engine
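Most of the output above is hipify and pip traceback noise; the actionable part is the `CMake Error` block. A minimal sketch for pulling that block out of a saved log follows. The helper name `cmake_errors` and the log file name are assumptions, not part of the build tooling.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print each line starting with "CMake Error", with its
# line number, plus a few lines of trailing context.
#   $1 = path to a saved build log
#   $2 = number of context lines to show after each match (default 3)
cmake_errors() {
  local log="$1" context="${2:-3}"
  grep -n -A "$context" '^CMake Error' "$log"
}

# Example usage (build.log is an assumed file name):
#   pip install --no-build-isolation -v . 2>&1 | tee build.log
#   cmake_errors build.log 5
```

On the log in this report, this would surface the `file COPY cannot find` failure at ck_fused_attn/aiter_prebuilt.cmake:47 without the surrounding traceback.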
Expected behavior
The build should complete for gfx90a (MI250). Instead, the CMake configuration step fails as shown above.
Environment overview (please complete the following information)
I converted Megatron's Dockerfile into the equivalent bash commands and ran them directly:
conda create -p /lustre/orion/world-shared/gen150/zixianw4/envs/megatron-build-rocm6.4.2-torch2.9 python=3.12 -y
conda activate /lustre/orion/world-shared/gen150/zixianw4/envs/megatron-build-rocm6.4.2-torch2.9
# --- 2. INSTALL PYTORCH & DEPENDENCIES ---
echo "--- Installing PyTorch and PIP packages ---"
pip3 install torch torchvision --index-url https://download.pytorch.org/whl/rocm6.4
pip3 install scipy transformers einops flask-restful nltk pytest pytest-cov pytest_mock pytest-csv pytest-random-order sentencepiece wrapt zarr wandb tensorstore==0.1.45 pybind11 setuptools==69.5.1 datasets tiktoken pynvml "huggingface_hub[cli]"
python3 -m nltk.downloader punkt_tab
# Transformer Engine
echo "--- Building Transformer Engine ---"
export NVTE_FRAMEWORK=pytorch
export NVTE_ROCM_ARCH=gfx90a
export NVTE_USE_HIPBLASLT=1
git clone --recursive https://github.com/ROCm/TransformerEngine && cd TransformerEngine && pip install --no-build-isolation . && cd ..
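After a failed run, it may help to confirm whether the aiter build ever produced the library that the `file COPY` step looks for. The sketch below assumes it is run from the directory that contains the `TransformerEngine` checkout. `aiter_lib_present` and `TE_ROOT` are hypothetical names, and the relative path is taken from the error message in this report.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed if libmha_fwd.so exists under the given checkout root.
# The relative path comes from the "file COPY cannot find" error in this report.
aiter_lib_present() {
  local te_root="$1"
  [ -f "$te_root/3rdparty/aiter/op_tests/cpp/mha/libmha_fwd.so" ]
}

# Assumption: the checkout lives in ./TransformerEngine unless TE_ROOT says otherwise.
te_root="${TE_ROOT:-TransformerEngine}"
if aiter_lib_present "$te_root"; then
  echo "libmha_fwd.so found"
else
  echo "libmha_fwd.so missing under $te_root"
fi
```

If the file is missing, the aiter build step itself did not produce it, which is consistent with the "--aiter-dir, --aiter-test-dir, and --gpu-archs are required" message and the empty V3_ASM_ARCHS line in the log.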
Environment details
I am using rocm/6.4.2 on Frontier.
pip list
Package Version
------------------- ----------------------------
aiohappyeyeballs 2.6.1
aiohttp 3.13.3
aiosignal 1.4.0
amd-aiter 0.1.7.post6.dev50+ga64fa18e6
aniso8601 10.0.1
annotated-types 0.7.0
anyio 4.12.1
attrs 25.4.0
blinker 1.9.0
certifi 2026.1.4
charset-normalizer 3.4.4
click 8.3.1
contourpy 1.3.3
coverage 7.13.2
cycler 0.12.1
datasets 4.5.0
dill 0.4.0
donfig 0.8.1.post1
einops 0.8.2
filelock 3.20.0
Flask 3.1.2
Flask-RESTful 0.3.10
fonttools 4.61.1
frozenlist 1.8.0
fsspec 2025.10.0
gitdb 4.0.12
GitPython 3.1.46
google-crc32c 1.8.0
h11 0.16.0
hf-xet 1.2.0
httpcore 1.0.9
httpx 0.28.1
huggingface_hub 1.3.5
idna 3.11
iniconfig 2.3.0
iris 0.0.0.post286+g905ec1cea
itsdangerous 2.2.0
Jinja2 3.1.6
joblib 1.5.3
kiwisolver 1.4.9
MarkupSafe 2.1.5
matplotlib 3.10.8
mpmath 1.3.0
multidict 6.7.1
multiprocess 0.70.18
networkx 3.6.1
ninja 1.13.0
nltk 3.9.2
numcodecs 0.16.5
numpy 2.3.5
nvidia-ml-py 13.590.48
packaging 26.0
pandas 3.0.0
pillow 12.0.0
pip 26.0
platformdirs 4.5.1
pluggy 1.6.0
propcache 0.4.1
protobuf 6.33.5
psutil 7.2.2
pyarrow 23.0.0
pybind11 3.0.1
pydantic 2.12.5
pydantic_core 2.41.5
Pygments 2.19.2
pynvml 13.0.1
pyparsing 3.3.2
pytest 9.0.2
pytest-cov 7.0.0
pytest-csv 3.0.0
pytest-mock 3.15.1
pytest-random-order 1.2.0
python-dateutil 2.9.0.post0
pytorch-triton-rocm 3.5.1
pytz 2025.2
PyYAML 6.0.3
regex 2026.1.15
requests 2.32.5
ruff 0.14.14
safetensors 0.7.0
scipy 1.17.0
sentencepiece 0.2.1
sentry-sdk 2.51.0
setuptools 69.5.1
shellingham 1.5.4
six 1.17.0
smmap 5.0.2
sympy 1.14.0
tensorstore 0.1.45
tiktoken 0.12.0
tokenizers 0.22.2
torch 2.9.1+rocm6.4
torchvision 0.24.1+rocm6.4
tqdm 4.67.2
transformers 5.0.0
typer-slim 0.21.1
typing_extensions 4.15.0
typing-inspection 0.4.2
urllib3 2.6.3
wandb 0.24.1
Werkzeug 3.1.5
wheel 0.46.3
wrapt 2.1.0
xxhash 3.6.0
yarl 1.22.0
zarr 3.1.5
Device details
MI250
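To confirm which gfx target the node actually reports, the output of `rocminfo` can be filtered. This is a sketch: `extract_gfx_archs` is a hypothetical helper that reads text on stdin, so it can be tried on saved output as well.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read text on stdin and print the unique gfx targets it mentions.
extract_gfx_archs() {
  grep -oE 'gfx[0-9a-f]+' | sort -u
}

# Usage on a node with ROCm installed (rocminfo ships with ROCm):
#   rocminfo | extract_gfx_archs
```

The reported target should match the value exported in `NVTE_ROCM_ARCH` (gfx90a in this report).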