feat(rocm): Add AMD GPU support via rocm#133

steamwings wants to merge 3 commits into davidamacey:master
Conversation
Add Docker build infrastructure and shell script support for running OpenTranscribe on AMD GPUs via ROCm/HIP.

New files:
- `Dockerfile.rocm`: Multi-stage build with PyTorch ROCm 6.4, the CTranslate2 ROCm wheel, and MIOpen JIT compilation headers
- `requirements-rocm.txt`: Python dependencies with ROCm-specific PyTorch
- `docker-compose.rocm-build.yml`: Build overlay for the ROCm backend image
- `docker-compose.gpu-rocm.yml`: Runtime overlay with GPU device passthrough, render group mapping, and ROCm environment variables

Modified files:
- `opentr.sh`: Auto-detect ROCm GPUs and inject compose overlays
- `.env.example`: Document `HSA_OVERRIDE_GFX_VERSION` and `RENDER_GROUP_GID`

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
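The commit message above describes `opentr.sh` auto-detecting ROCm GPUs and injecting compose overlays. As a rough sketch of that idea (the function name `detect_gpu`, the `COMPOSE_FILES` variable, and the NVIDIA overlay filename are illustrative assumptions, not the PR's actual script):

```shell
#!/bin/sh
# Hedged sketch of ROCm auto-detection plus compose overlay injection.
# Names here are illustrative; the real opentr.sh may differ.

detect_gpu() {
    # /dev/kfd is the ROCm kernel driver interface; rocm-smi ships with ROCm
    if [ -e /dev/kfd ] || command -v rocm-smi >/dev/null 2>&1; then
        echo "rocm"
    elif command -v nvidia-smi >/dev/null 2>&1; then
        echo "nvidia"
    else
        echo "none"
    fi
}

COMPOSE_FILES="-f docker-compose.yml"
case "$(detect_gpu)" in
    rocm)   COMPOSE_FILES="$COMPOSE_FILES -f docker-compose.gpu-rocm.yml" ;;
    nvidia) COMPOSE_FILES="$COMPOSE_FILES -f docker-compose.gpu.yml" ;;  # filename assumed
esac
echo "Using compose files: $COMPOSE_FILES"
```

The overlay approach keeps the base `docker-compose.yml` GPU-agnostic: passing multiple `-f` files to Docker Compose merges the ROCm device passthrough settings on top only when an AMD GPU is actually present.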
Update hardware detection and GPU monitoring to handle AMD ROCm/HIP alongside NVIDIA CUDA:
- `hardware_detection.py`: Add an `is_rocm` property to detect the HIP backend, skip NVIDIA-specific env vars (`TORCH_CUDA_ARCH_LIST`) on ROCm, report `gpu_backend` and `hip_version` in the hardware summary, and skip NVIDIA driver config in the Docker runtime helper
- `utility.py`: Use `rocm-smi` for GPU stats on ROCm (temperature, VRAM, utilization), with a fallback to the PyTorch CUDA API for memory stats

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
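As an illustrative sketch of the `utility.py` strategy described above (prefer `rocm-smi` when present, fall back to PyTorch otherwise): the function names and the CSV column headers below are assumptions for the example, not the PR's actual code, and real `rocm-smi` output varies by version.

```python
# Hedged sketch of a rocm-smi-first GPU stats strategy.
import csv
import io
import shutil


def stats_source() -> str:
    """Prefer rocm-smi when it is on PATH, else fall back to torch's CUDA API."""
    return "rocm-smi" if shutil.which("rocm-smi") else "torch"


def parse_rocm_smi_csv(text: str) -> dict:
    """Parse single-GPU CSV output into a plain dict (column names assumed)."""
    row = next(csv.DictReader(io.StringIO(text)))
    return {k.strip(): v.strip() for k, v in row.items()}


# Made-up CSV shape purely for illustration:
sample = "device,Temperature (C),GPU use (%)\ncard0,54.0,17"
stats = parse_rocm_smi_csv(sample)
print(stats["GPU use (%)"])  # prints 17 (as a string)
```

Shelling out to `rocm-smi` avoids depending on CUDA-only telemetry APIs, while the PyTorch fallback keeps memory stats available even when the tool is missing from the container.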
pyannote.audio v3 uses the deprecated `use_auth_token` parameter, which was removed in huggingface-hub 1.0.0. Pin to <1.0.0 to prevent runtime errors during speaker diarization model loading. This affects the CUDA build as well (requirements.txt on master) but is kept separate here for easy cherry-pick reference.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
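A minimal illustration of the pin described above (this is an excerpt-style sketch, not the actual contents of the branch's requirements files):

```
# Illustrative requirements excerpt: huggingface-hub 1.0.0 removed the
# use_auth_token parameter that pyannote.audio v3 still passes, so stay below it.
huggingface-hub<1.0.0
```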
@steamwings thank you for the PR; ROCm is on the long-term plan. While I am not a regular ROCm user, I am looking forward to learning more. Thank you for doing the initial testing to get this started! This will expand the user base! I will explore this more after the version 0.4.0 release.
ROCm PR Review & Implementation Plan

@steamwings, thank you so much for putting this together! This is a really solid foundation for AMD GPU support. The work you've done navigating the ROCm library compatibility maze (PyTorch ROCm 6.4 bundled libs + CTranslate2 ROCm 7.0 system libs, MIOpen JIT headers, HSA runtime symlinks) is impressive and clearly shows real hands-on experience with ROCm.

We've done a thorough review of the code and researched ROCm best practices for each component in our ML stack (WhisperX, PyAnnote, CTranslate2, Sentence Transformers). Based on your work and our research, we've put together a detailed phased implementation plan that we'll use to build on your foundation after the v0.4.0 release.

Full Implementation Plan: 📋 OpenTranscribe ROCm Implementation Plan
Key Takeaways from Research

The good news is that all of our ML stack components work on ROCm:
We noted a few things we'll want to address when we pick this up:
We'll work through the plan after v0.4.0 and will keep this PR as the reference point. Really appreciate you getting the ball rolling on this — it's going to open up OpenTranscribe to a much wider audience of GPU users!
Pull Request
Description
Add ROCm support for AMD GPUs.
I don't really expect this to be merged, but I'm putting the PR up for visibility anyway.
Limitations
Type of change
Changes made
Commit 1: feat(rocm): Add AMD ROCm GPU support infrastructure
- Adds Dockerfile.rocm, requirements-rocm.txt, and the ROCm compose overlays; CTranslate2 is installed separately via the Dockerfile.
- Updates opentr.sh to detect ROCm GPUs and inject the overlays for the relevant commands (start, reset, rebuild-backend, build).
Commit 2: feat(rocm): Add ROCm awareness to Python backend
- hardware_detection.py detects the HIP backend and reports ROCm details in the hardware summary.
- utility.py uses rocm-smi for GPU stats, falling back to the PyTorch CUDA API for memory stats when rocm-smi is unavailable.
Commit 3: fix(deps): Pin huggingface-hub<1.0.0 for pyannote.audio compatibility
Testing
Frontend changes (if applicable)
Backend changes (if applicable)
Documentation
Screenshots (if applicable)
Additional notes