Build Failing for MI250 gfx90a #439

@zixianwang2022

Describe the bug

I have been trying to build rocm/Megatron-LM from source and ran into several issues. The critical one is that the build of 3rdparty/aiter's fused_attn fails for MI250 (gfx90a). transformer_engine/common/ck_fused_attn/CMakeLists.txt states that only gfx942 and gfx950 are supported.
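
To make the restriction concrete, here is a minimal sketch of the kind of check involved. It is a hypothetical helper, not code from TransformerEngine: the supported set is taken only from the CMakeLists.txt statement above, and the semicolon-separated format of the arch list is an assumption.

```python
# Hypothetical sketch: not part of TransformerEngine.
# Supported archs as stated above for ck_fused_attn/CMakeLists.txt.
CK_FUSED_ATTN_ARCHS = {"gfx942", "gfx950"}


def unsupported_archs(arch_list: str) -> list[str]:
    """Return the requested archs that fall outside the supported set.

    `arch_list` is assumed to be a semicolon-separated string, such as
    the value exported in NVTE_ROCM_ARCH.
    """
    requested = [a.strip() for a in arch_list.split(";") if a.strip()]
    return [a for a in requested if a not in CK_FUSED_ATTN_ARCHS]


if __name__ == "__main__":
    # With the arch used in this report, the list is non-empty.
    print(unsupported_archs("gfx90a"))
```

If every requested arch is filtered out this way, the downstream aiter build step would receive an empty arch list, which may be what the log below shows.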

Steps/Code to reproduce bug

git clone --recursive https://github.com/ROCm/TransformerEngine && cd TransformerEngine && pip install --no-build-isolation . && cd ..

Error messages:


  /lustre/orion/gen150/scratch/zixianw4/TransformerEngine/transformer_engine/common/include/transformer_engine/recipe.h -> /lustre/orion/gen150/scratch/zixianw4/TransformerEngine/transformer_engine/common/include/transformer_engine/recipe_hip.h [skipped, already hipified]
  /lustre/orion/gen150/scratch/zixianw4/TransformerEngine/transformer_engine/common/include/transformer_engine/permutation.h -> /lustre/orion/gen150/scratch/zixianw4/TransformerEngine/transformer_engine/common/include/transformer_engine/permutation_hip.h [skipped, already hipified]
  /lustre/orion/gen150/scratch/zixianw4/TransformerEngine/transformer_engine/common/include/transformer_engine/transpose.h -> /lustre/orion/gen150/scratch/zixianw4/TransformerEngine/transformer_engine/common/include/transformer_engine/transpose_hip.h [skipped, already hipified]
  /lustre/orion/gen150/scratch/zixianw4/TransformerEngine/transformer_engine/common/include/transformer_engine/padding.h -> /lustre/orion/gen150/scratch/zixianw4/TransformerEngine/transformer_engine/common/include/transformer_engine/padding_hip.h [skipped, already hipified]
  /lustre/orion/gen150/scratch/zixianw4/TransformerEngine/transformer_engine/common/include/transformer_engine/comm_gemm_overlap.h -> /lustre/orion/gen150/scratch/zixianw4/TransformerEngine/transformer_engine/common/include/transformer_engine/comm_gemm_overlap_hip.h [skipped, already hipified]
  /lustre/orion/gen150/scratch/zixianw4/TransformerEngine/transformer_engine/common/include/transformer_engine/softmax.h -> /lustre/orion/gen150/scratch/zixianw4/TransformerEngine/transformer_engine/common/include/transformer_engine/softmax_hip.h [skipped, already hipified]
  Successfully preprocessed all matching files.
  Total number of unsupported CUDA function calls: 0


  Total number of replaced kernel launches: 339
  -- Writing tmp_BPmmK9FVZXORyj9F into file - file_tmp_BPmmK9FVZXORyj9F.txt
  -------------------------------------------------------------
  -- nvte hipified sources: /lustre/orion/gen150/scratch/zixianw4/TransformerEngine/transformer_engine/common/transformer_engine_hip.cpp;/lustre/orion/gen150/scratch/zixianw4/TransformerEngine/transformer_engine/common/common.hip;/lustre/orion/gen150/scratch/zixianw4/TransformerEngine/transformer_engine/common/multi_tensor/adam.hip;/lustre/orion/gen150/scratch/zixianw4/TransformerEngine/transformer_engine/common/multi_tensor/compute_scale.hip;/lustre/orion/gen150/scratch/zixianw4/TransformerEngine/transformer_engine/common/multi_tensor/l2norm.hip;/lustre/orion/gen150/scratch/zixianw4/TransformerEngine/transformer_engine/common/multi_tensor/scale.hip;/lustre/orion/gen150/scratch/zixianw4/TransformerEngine/transformer_engine/common/multi_tensor/sgd.hip;/lustre/orion/gen150/scratch/zixianw4/TransformerEngine/transformer_engine/common/transpose/cast_transpose.hip;/lustre/orion/gen150/scratch/zixianw4/TransformerEngine/transformer_engine/common/transpose/transpose.hip;/lustre/orion/gen150/scratch/zixianw4/TransformerEngine/transformer_engine/common/transpose/cast_transpose_fusion.hip;/lustre/orion/gen150/scratch/zixianw4/TransformerEngine/transformer_engine/common/transpose/transpose_fusion.hip;/lustre/orion/gen150/scratch/zixianw4/TransformerEngine/transformer_engine/common/transpose/multi_cast_transpose.hip;/lustre/orion/gen150/scratch/zixianw4/TransformerEngine/transformer_engine/common/transpose/swap_first_dims.hip;/lustre/orion/gen150/scratch/zixianw4/TransformerEngine/transformer_engine/common/activation/gelu.hip;/lustre/orion/gen150/scratch/zixianw4/TransformerEngine/transformer_engine/common/dropout/dropout.hip;/lustre/orion/gen150/scratch/zixianw4/TransformerEngine/transformer_engine/common/fused_attn/flash_attn.hip;/lustre/orion/gen150/scratch/zixianw4/TransformerEngine/transformer_engine/common/fused_attn/context_parallel.hip;/lustre/orion/gen150/scratch/zixianw4/TransformerEngine/transformer_engine/common/fused_attn/kv_cache.hip;/lustre/orion/gen150/scratch/z
ixianw4/TransformerEngine/transformer_engine/common/activation/relu.hip;/lustre/orion/gen150/scratch/zixianw4/TransformerEngine/transformer_engine/common/activation/swiglu.hip;/lustre/orion/gen150/scratch/zixianw4/TransformerEngine/transformer_engine/common/gemm/cublaslt_gemm.hip;/lustre/orion/gen150/scratch/zixianw4/TransformerEngine/transformer_engine/common/normalization/common_hip.cpp;/lustre/orion/gen150/scratch/zixianw4/TransformerEngine/transformer_engine/common/normalization/layernorm/ln_api_hip.cpp;/lustre/orion/gen150/scratch/zixianw4/TransformerEngine/transformer_engine/common/normalization/layernorm/ln_bwd_semi_hip_kernel.hip;/lustre/orion/gen150/scratch/zixianw4/TransformerEngine/transformer_engine/common/normalization/layernorm/ln_fwd_hip_kernel.hip;/lustre/orion/gen150/scratch/zixianw4/TransformerEngine/transformer_engine/common/normalization/rmsnorm/rmsnorm_api_hip.cpp;/lustre/orion/gen150/scratch/zixianw4/TransformerEngine/transformer_engine/common/normalization/rmsnorm/rmsnorm_bwd_semi_hip_kernel.hip;/lustre/orion/gen150/scratch/zixianw4/TransformerEngine/transformer_engine/common/normalization/rmsnorm/rmsnorm_fwd_hip_kernel.hip;/lustre/orion/gen150/scratch/zixianw4/TransformerEngine/transformer_engine/common/permutation/permutation.hip;/lustre/orion/gen150/scratch/zixianw4/TransformerEngine/transformer_engine/common/util/cast.hip;/lustre/orion/gen150/scratch/zixianw4/TransformerEngine/transformer_engine/common/util/padding.hip;/lustre/orion/gen150/scratch/zixianw4/TransformerEngine/transformer_engine/common/util/hip_driver.cpp;/lustre/orion/gen150/scratch/zixianw4/TransformerEngine/transformer_engine/common/util/hip_runtime.cpp;/lustre/orion/gen150/scratch/zixianw4/TransformerEngine/transformer_engine/common/util/multi_stream_hip.cpp;/lustre/orion/gen150/scratch/zixianw4/TransformerEngine/transformer_engine/common/util/rtc_hip.cpp;/lustre/orion/gen150/scratch/zixianw4/TransformerEngine/transformer_engine/common/swizzle/swizzle.hip;/lustre/orion/ge
n150/scratch/zixianw4/TransformerEngine/transformer_engine/common/fused_softmax/scaled_masked_softmax.hip;/lustre/orion/gen150/scratch/zixianw4/TransformerEngine/transformer_engine/common/fused_softmax/scaled_upper_triang_masked_softmax.hip;/lustre/orion/gen150/scratch/zixianw4/TransformerEngine/transformer_engine/common/fused_softmax/scaled_aligned_causal_masked_softmax.hip;/lustre/orion/gen150/scratch/zixianw4/TransformerEngine/transformer_engine/common/fused_rope/fused_rope.hip;/lustre/orion/gen150/scratch/zixianw4/TransformerEngine/transformer_engine/common/fused_router/fused_moe_aux_loss.hip;/lustre/orion/gen150/scratch/zixianw4/TransformerEngine/transformer_engine/common/fused_router/fused_score_for_moe_aux_loss.hip;/lustre/orion/gen150/scratch/zixianw4/TransformerEngine/transformer_engine/common/fused_router/fused_topk_with_score_function.hip;/lustre/orion/gen150/scratch/zixianw4/TransformerEngine/transformer_engine/common/recipe/current_scaling.hip;/lustre/orion/gen150/scratch/zixianw4/TransformerEngine/transformer_engine/common/recipe/delayed_scaling.hip;/lustre/orion/gen150/scratch/zixianw4/TransformerEngine/transformer_engine/common/recipe/fp8_block_scaling.hip;/lustre/orion/gen150/scratch/zixianw4/TransformerEngine/transformer_engine/common/fused_attn_rocm/fused_attn_hip.cpp;/lustre/orion/gen150/scratch/zixianw4/TransformerEngine/transformer_engine/common/fused_attn_rocm/fused_attn_aotriton_hip.cpp;/lustre/orion/gen150/scratch/zixianw4/TransformerEngine/transformer_engine/common/fused_attn_rocm/fused_attn_ck_hip.cpp;/lustre/orion/gen150/scratch/zixianw4/TransformerEngine/transformer_engine/common/fused_attn_rocm/utils_hip.cpp;/lustre/orion/gen150/scratch/zixianw4/TransformerEngine/transformer_engine/common/gemm/rocm_gemm.hip;/lustre/orion/gen150/scratch/zixianw4/TransformerEngine/transformer_engine/common/amd_detail/system.cpp
  -- Building AOTriton from source.
  -- No-image mode: ON.
  -- Adding AOTriton library.
  -- Downloading AOTriton GPU Kernels.
  CMake Warning at ck_fused_attn/CMakeLists.txt:26 (message):
    gfx90a not supported with aiter v3 asm kernels


  -- AITER V3_ASM_ARCHS:
  --  [AITER-PREBUILT] Building aiter from source.
  [AITER-PREBUILT] --aiter-dir, --aiter-test-dir, and --gpu-archs are required.
  -- [AITER-PREBUILT] Caching locally built libs to /lustre/orion/gen150/scratch/zixianw4/TransformerEngine/transformer_engine/common/ck_fused_attn/../../../build/aiter-prebuilts/rocm-6.4_aiter-a64fa18e60235994e4cbfd7059cc2f60d06e743f
  CMake Error at ck_fused_attn/aiter_prebuilt.cmake:47 (file):
    file COPY cannot find
    "/lustre/orion/gen150/scratch/zixianw4/TransformerEngine/transformer_engine/common/ck_fused_attn/../../../3rdparty/aiter/op_tests/cpp/mha/libmha_fwd.so":
    No such file or directory.
  Call Stack (most recent call first):
    ck_fused_attn/CMakeLists.txt:61 (cache_local_aiter_build)


  -- [AITER-PREBUILT] Using __AITER_MHA_PATH=/lustre/orion/gen150/scratch/zixianw4/TransformerEngine/transformer_engine/common/ck_fused_attn/../../../build/aiter-prebuilts/rocm-6.4_aiter-a64fa18e60235994e4cbfd7059cc2f60d06e743f
  -- Found the following fused attention files:
  --  src/ck_fused_attn_fwd.cpp
  --  src/ck_fused_attn_bwd.cpp
  --  src/ck_fused_attn_utils.cpp
  -- ck_include_dir: /lustre/orion/gen150/scratch/zixianw4/TransformerEngine/transformer_engine/common/ck_fused_attn/../../../3rdparty/aiter/3rdparty/composable_kernel/include
  -- aiter_include_dir: /lustre/orion/gen150/scratch/zixianw4/TransformerEngine/transformer_engine/common/ck_fused_attn/../../../3rdparty/aiter/csrc/include
  -- Performing Test CMAKE_HAVE_LIBC_PTHREAD
  -- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Success
  -- Found Threads: TRUE
  CMake Deprecation Warning at /opt/rocm-6.4.2/lib/cmake/hiprtc/hiprtc-config.cmake:21 (cmake_minimum_required):
    Compatibility with CMake < 3.5 will be removed from a future version of
    CMake.

    Update the VERSION argument <min> value or use a ...<max> suffix to tell
    CMake that the project does not need compatibility with older versions.
  Call Stack (most recent call first):
    CMakeLists.txt:347 (find_package)


  -- Configuring incomplete, errors occurred!
  Building CMake extension transformer_engine
  Running command /usr/bin/cmake -S /lustre/orion/gen150/scratch/zixianw4/TransformerEngine/transformer_engine/common -B /lustre/orion/gen150/scratch/zixianw4/TransformerEngine/build/cmake -DPython_EXECUTABLE=/lustre/orion/world-shared/gen150/zixianw4/envs/megatron-build-rocm6.4.2-torch2.9/bin/python3.12 -DPython_INCLUDE_DIR=/lustre/orion/world-shared/gen150/zixianw4/envs/megatron-build-rocm6.4.2-torch2.9/include/python3.12 -DCMAKE_BUILD_TYPE=Release -DCMAKE_INSTALL_PREFIX=/lustre/orion/gen150/scratch/zixianw4/TransformerEngine/build/lib.linux-x86_64-cpython-312 -DUSE_ROCM=ON -DCK_FUSED_ATTN_FLOAT_TO_BFLOAT16_DEFAULT=3 -Dpybind11_DIR=/lustre/orion/gen150/world-shared/zixianw4/envs/megatron-build-rocm6.4.2-torch2.9/lib/python3.12/site-packages/pybind11/share/cmake/pybind11 -GNinja
  Traceback (most recent call last):
    File "/lustre/orion/gen150/scratch/zixianw4/TransformerEngine/build_tools/build_ext.py", line 94, in _build_cmake
      subprocess.run(command, cwd=build_dir, check=True)
    File "/lustre/orion/world-shared/gen150/zixianw4/envs/megatron-build-rocm6.4.2-torch2.9/lib/python3.12/subprocess.py", line 571, in run
      raise CalledProcessError(retcode, process.args,
  subprocess.CalledProcessError: Command '['/usr/bin/cmake', '-S', '/lustre/orion/gen150/scratch/zixianw4/TransformerEngine/transformer_engine/common', '-B', '/lustre/orion/gen150/scratch/zixianw4/TransformerEngine/build/cmake', '-DPython_EXECUTABLE=/lustre/orion/world-shared/gen150/zixianw4/envs/megatron-build-rocm6.4.2-torch2.9/bin/python3.12', '-DPython_INCLUDE_DIR=/lustre/orion/world-shared/gen150/zixianw4/envs/megatron-build-rocm6.4.2-torch2.9/include/python3.12', '-DCMAKE_BUILD_TYPE=Release', '-DCMAKE_INSTALL_PREFIX=/lustre/orion/gen150/scratch/zixianw4/TransformerEngine/build/lib.linux-x86_64-cpython-312', '-DUSE_ROCM=ON', '-DCK_FUSED_ATTN_FLOAT_TO_BFLOAT16_DEFAULT=3', '-Dpybind11_DIR=/lustre/orion/gen150/world-shared/zixianw4/envs/megatron-build-rocm6.4.2-torch2.9/lib/python3.12/site-packages/pybind11/share/cmake/pybind11', '-GNinja']' returned non-zero exit status 1.

  During handling of the above exception, another exception occurred:

  Traceback (most recent call last):
    File "/lustre/orion/world-shared/gen150/zixianw4/envs/megatron-build-rocm6.4.2-torch2.9/lib/python3.12/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 389, in <module>
      main()
    File "/lustre/orion/world-shared/gen150/zixianw4/envs/megatron-build-rocm6.4.2-torch2.9/lib/python3.12/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 373, in main
      json_out["return_val"] = hook(**hook_input["kwargs"])
                               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    File "/lustre/orion/world-shared/gen150/zixianw4/envs/megatron-build-rocm6.4.2-torch2.9/lib/python3.12/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 280, in build_wheel
      return _build_backend().build_wheel(
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    File "/lustre/orion/world-shared/gen150/zixianw4/envs/megatron-build-rocm6.4.2-torch2.9/lib/python3.12/site-packages/setuptools/build_meta.py", line 410, in build_wheel
      return self._build_with_temp_dir(
             ^^^^^^^^^^^^^^^^^^^^^^^^^^
    File "/lustre/orion/world-shared/gen150/zixianw4/envs/megatron-build-rocm6.4.2-torch2.9/lib/python3.12/site-packages/setuptools/build_meta.py", line 395, in _build_with_temp_dir
      self.run_setup()
    File "/lustre/orion/world-shared/gen150/zixianw4/envs/megatron-build-rocm6.4.2-torch2.9/lib/python3.12/site-packages/setuptools/build_meta.py", line 487, in run_setup
      super().run_setup(setup_script=setup_script)
    File "/lustre/orion/world-shared/gen150/zixianw4/envs/megatron-build-rocm6.4.2-torch2.9/lib/python3.12/site-packages/setuptools/build_meta.py", line 311, in run_setup
      exec(code, locals())
    File "<string>", line 226, in <module>
    File "/lustre/orion/world-shared/gen150/zixianw4/envs/megatron-build-rocm6.4.2-torch2.9/lib/python3.12/site-packages/setuptools/__init__.py", line 104, in setup
      return distutils.core.setup(**attrs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    File "/lustre/orion/world-shared/gen150/zixianw4/envs/megatron-build-rocm6.4.2-torch2.9/lib/python3.12/site-packages/setuptools/_distutils/core.py", line 184, in setup
      return run_commands(dist)
             ^^^^^^^^^^^^^^^^^^
    File "/lustre/orion/world-shared/gen150/zixianw4/envs/megatron-build-rocm6.4.2-torch2.9/lib/python3.12/site-packages/setuptools/_distutils/core.py", line 200, in run_commands
      dist.run_commands()
    File "/lustre/orion/world-shared/gen150/zixianw4/envs/megatron-build-rocm6.4.2-torch2.9/lib/python3.12/site-packages/setuptools/_distutils/dist.py", line 969, in run_commands
      self.run_command(cmd)
    File "/lustre/orion/world-shared/gen150/zixianw4/envs/megatron-build-rocm6.4.2-torch2.9/lib/python3.12/site-packages/setuptools/dist.py", line 967, in run_command
      super().run_command(command)
    File "/lustre/orion/world-shared/gen150/zixianw4/envs/megatron-build-rocm6.4.2-torch2.9/lib/python3.12/site-packages/setuptools/_distutils/dist.py", line 988, in run_command
      cmd_obj.run()
    File "<string>", line 64, in run
    File "/lustre/orion/world-shared/gen150/zixianw4/envs/megatron-build-rocm6.4.2-torch2.9/lib/python3.12/site-packages/wheel/_bdist_wheel.py", line 390, in run
      self.run_command("build")
    File "/lustre/orion/world-shared/gen150/zixianw4/envs/megatron-build-rocm6.4.2-torch2.9/lib/python3.12/site-packages/setuptools/_distutils/cmd.py", line 316, in run_command
      self.distribution.run_command(command)
    File "/lustre/orion/world-shared/gen150/zixianw4/envs/megatron-build-rocm6.4.2-torch2.9/lib/python3.12/site-packages/setuptools/dist.py", line 967, in run_command
      super().run_command(command)
    File "/lustre/orion/world-shared/gen150/zixianw4/envs/megatron-build-rocm6.4.2-torch2.9/lib/python3.12/site-packages/setuptools/_distutils/dist.py", line 988, in run_command
      cmd_obj.run()
    File "/lustre/orion/world-shared/gen150/zixianw4/envs/megatron-build-rocm6.4.2-torch2.9/lib/python3.12/site-packages/setuptools/_distutils/command/build.py", line 132, in run
      self.run_command(cmd_name)
    File "/lustre/orion/world-shared/gen150/zixianw4/envs/megatron-build-rocm6.4.2-torch2.9/lib/python3.12/site-packages/setuptools/_distutils/cmd.py", line 316, in run_command
      self.distribution.run_command(command)
    File "/lustre/orion/world-shared/gen150/zixianw4/envs/megatron-build-rocm6.4.2-torch2.9/lib/python3.12/site-packages/setuptools/dist.py", line 967, in run_command
      super().run_command(command)
    File "/lustre/orion/world-shared/gen150/zixianw4/envs/megatron-build-rocm6.4.2-torch2.9/lib/python3.12/site-packages/setuptools/_distutils/dist.py", line 988, in run_command
      cmd_obj.run()
    File "/lustre/orion/gen150/scratch/zixianw4/TransformerEngine/build_tools/build_ext.py", line 126, in run
      ext._build_cmake(
    File "/lustre/orion/gen150/scratch/zixianw4/TransformerEngine/build_tools/build_ext.py", line 96, in _build_cmake
      raise RuntimeError(f"Error when running CMake: {e}")
  RuntimeError: Error when running CMake: Command '['/usr/bin/cmake', '-S', '/lustre/orion/gen150/scratch/zixianw4/TransformerEngine/transformer_engine/common', '-B', '/lustre/orion/gen150/scratch/zixianw4/TransformerEngine/build/cmake', '-DPython_EXECUTABLE=/lustre/orion/world-shared/gen150/zixianw4/envs/megatron-build-rocm6.4.2-torch2.9/bin/python3.12', '-DPython_INCLUDE_DIR=/lustre/orion/world-shared/gen150/zixianw4/envs/megatron-build-rocm6.4.2-torch2.9/include/python3.12', '-DCMAKE_BUILD_TYPE=Release', '-DCMAKE_INSTALL_PREFIX=/lustre/orion/gen150/scratch/zixianw4/TransformerEngine/build/lib.linux-x86_64-cpython-312', '-DUSE_ROCM=ON', '-DCK_FUSED_ATTN_FLOAT_TO_BFLOAT16_DEFAULT=3', '-Dpybind11_DIR=/lustre/orion/gen150/world-shared/zixianw4/envs/megatron-build-rocm6.4.2-torch2.9/lib/python3.12/site-packages/pybind11/share/cmake/pybind11', '-GNinja']' returned non-zero exit status 1.
  error: subprocess-exited-with-error
  
  × Building wheel for transformer_engine (pyproject.toml) did not run successfully.
  │ exit code: 1
  ╰─> No available output.
  
  note: This error originates from a subprocess, and is likely not a problem with pip.
  full command: /lustre/orion/world-shared/gen150/zixianw4/envs/megatron-build-rocm6.4.2-torch2.9/bin/python3.12 /lustre/orion/world-shared/gen150/zixianw4/envs/megatron-build-rocm6.4.2-torch2.9/lib/python3.12/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py build_wheel /lustre/orion/gen150/scratch/zixianw4/tmp/tmpvvp6bo57
  cwd: /lustre/orion/gen150/scratch/zixianw4/TransformerEngine
  Building wheel for transformer_engine (pyproject.toml): finished with status 'error'
  ERROR: Failed building wheel for transformer_engine
Failed to build transformer_engine
error: failed-wheel-build-for-install

× Failed to build installable wheels for some pyproject.toml based projects
╰─> transformer_engine
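
The actual failure in the output above is the `CMake Error at ck_fused_attn/aiter_prebuilt.cmake:47` block; everything after it is pip and setuptools reporting the non-zero exit. For long logs like this, a small filter can pull out just the CMake error blocks. This is a minimal sketch: `cmake_error_blocks` is a hypothetical helper, and it assumes each error block ends at the first blank line.

```python
def cmake_error_blocks(log_text: str) -> list[str]:
    """Collect each 'CMake Error ...' header plus the lines that follow it.

    A block is assumed to end at the first blank line after the header,
    so an attached 'Call Stack' section is kept with its error.
    """
    blocks: list[list[str]] = []
    current = None
    for line in log_text.splitlines():
        text = line.strip()
        if text.startswith("CMake Error"):
            current = [text]          # start a new error block
            blocks.append(current)
        elif current is not None:
            if text:
                current.append(text)  # still inside the block
            else:
                current = None        # blank line closes the block
    return ["\n".join(block) for block in blocks]


if __name__ == "__main__":
    import sys

    # Usage: python extract_cmake_errors.py build.log
    with open(sys.argv[1], encoding="utf-8", errors="replace") as fh:
        for block in cmake_error_blocks(fh.read()):
            print(block, end="\n\n")
```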



Expected behavior

The build should complete successfully for MI250 (gfx90a). Currently it does not.

Environment overview (please complete the following information)

I converted Megatron's Dockerfile into the equivalent bash commands and ran them:

conda create -p /lustre/orion/world-shared/gen150/zixianw4/envs/megatron-build-rocm6.4.2-torch2.9 python=3.12 -y

conda activate /lustre/orion/world-shared/gen150/zixianw4/envs/megatron-build-rocm6.4.2-torch2.9

# --- 2. INSTALL PYTORCH & DEPENDENCIES ---
echo "--- Installing PyTorch and PIP packages ---"
pip3 install torch torchvision --index-url https://download.pytorch.org/whl/rocm6.4

pip3 install scipy transformers einops flask-restful nltk pytest pytest-cov pytest_mock pytest-csv pytest-random-order sentencepiece wrapt zarr wandb tensorstore==0.1.45 pybind11 setuptools==69.5.1 datasets tiktoken pynvml "huggingface_hub[cli]"
python3 -m nltk.downloader punkt_tab


# Transformer Engine
echo "--- Building Transformer Engine ---"
export NVTE_FRAMEWORK=pytorch
export NVTE_ROCM_ARCH=gfx90a
export NVTE_USE_HIPBLASLT=1
git clone --recursive https://github.com/ROCm/TransformerEngine && cd TransformerEngine && pip install --no-build-isolation . && cd ..

Environment details

I am using rocm/6.4.2 on Frontier.

pip list
Package             Version
------------------- ----------------------------
aiohappyeyeballs    2.6.1
aiohttp             3.13.3
aiosignal           1.4.0
amd-aiter           0.1.7.post6.dev50+ga64fa18e6
aniso8601           10.0.1
annotated-types     0.7.0
anyio               4.12.1
attrs               25.4.0
blinker             1.9.0
certifi             2026.1.4
charset-normalizer  3.4.4
click               8.3.1
contourpy           1.3.3
coverage            7.13.2
cycler              0.12.1
datasets            4.5.0
dill                0.4.0
donfig              0.8.1.post1
einops              0.8.2
filelock            3.20.0
Flask               3.1.2
Flask-RESTful       0.3.10
fonttools           4.61.1
frozenlist          1.8.0
fsspec              2025.10.0
gitdb               4.0.12
GitPython           3.1.46
google-crc32c       1.8.0
h11                 0.16.0
hf-xet              1.2.0
httpcore            1.0.9
httpx               0.28.1
huggingface_hub     1.3.5
idna                3.11
iniconfig           2.3.0
iris                0.0.0.post286+g905ec1cea
itsdangerous        2.2.0
Jinja2              3.1.6
joblib              1.5.3
kiwisolver          1.4.9
MarkupSafe          2.1.5
matplotlib          3.10.8
mpmath              1.3.0
multidict           6.7.1
multiprocess        0.70.18
networkx            3.6.1
ninja               1.13.0
nltk                3.9.2
numcodecs           0.16.5
numpy               2.3.5
nvidia-ml-py        13.590.48
packaging           26.0
pandas              3.0.0
pillow              12.0.0
pip                 26.0
platformdirs        4.5.1
pluggy              1.6.0
propcache           0.4.1
protobuf            6.33.5
psutil              7.2.2
pyarrow             23.0.0
pybind11            3.0.1
pydantic            2.12.5
pydantic_core       2.41.5
Pygments            2.19.2
pynvml              13.0.1
pyparsing           3.3.2
pytest              9.0.2
pytest-cov          7.0.0
pytest-csv          3.0.0
pytest-mock         3.15.1
pytest-random-order 1.2.0
python-dateutil     2.9.0.post0
pytorch-triton-rocm 3.5.1
pytz                2025.2
PyYAML              6.0.3
regex               2026.1.15
requests            2.32.5
ruff                0.14.14
safetensors         0.7.0
scipy               1.17.0
sentencepiece       0.2.1
sentry-sdk          2.51.0
setuptools          69.5.1
shellingham         1.5.4
six                 1.17.0
smmap               5.0.2
sympy               1.14.0
tensorstore         0.1.45
tiktoken            0.12.0
tokenizers          0.22.2
torch               2.9.1+rocm6.4
torchvision         0.24.1+rocm6.4
tqdm                4.67.2
transformers        5.0.0
typer-slim          0.21.1
typing_extensions   4.15.0
typing-inspection   0.4.2
urllib3             2.6.3
wandb               0.24.1
Werkzeug            3.1.5
wheel               0.46.3
wrapt               2.1.0
xxhash              3.6.0
yarl                1.22.0
zarr                3.1.5

Device details
MI250
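
To confirm that the device reports the gfx90a target used in NVTE_ROCM_ARCH, the arch string can be reduced to its base name. A minimal sketch: `base_gfx_arch` is a hypothetical helper, and the example input format (base name followed by colon-separated feature flags) is an assumption about how the ROCm stack reports the arch.

```python
def base_gfx_arch(arch_name: str) -> str:
    """Strip feature flags from an arch string, keeping only the gfx target.

    Assumed input format: 'gfx90a:sramecc+:xnack-' (base name first,
    then optional colon-separated flags).
    """
    return arch_name.split(":", 1)[0].strip()


if __name__ == "__main__":
    print(base_gfx_arch("gfx90a:sramecc+:xnack-"))
```

On a ROCm build of PyTorch, the input could come from `torch.cuda.get_device_properties(0).gcnArchName`; whether that attribute is available depends on the PyTorch build, so treat it as an assumption.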

