feat: integrate RDMA support with MLX backend#1

Draft
localai-bot wants to merge 1 commit into master from feature/issue-8505

Conversation

@localai-bot (Owner)

This PR adds RDMA support to the MLX backend in LocalAI, enabling distributed inference across Apple Silicon machines.

Summary of Changes

Backend Enhancements (backend/python/mlx/backend.py)

  • Added _initialize_rdma() method to initialize mlx.distributed when MLX_GRPC_SERVERS is set
  • RDMA workers are identified via an environment variable (no hostfile needed; worker coordination is handled manually over P2P)
  • Model loading and generation logic now supports collective operations when RDMA is active
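
A minimal sketch of this initialization path, assuming the backend gates distributed setup on the presence of MLX_GRPC_SERVERS (the helper names parse_grpc_servers/initialize_rdma and the exact mx.distributed.init() call are illustrative assumptions, not the PR's verbatim code):

```python
import os

def parse_grpc_servers(value):
    """Split a "host1:port1,host2:port2" string into worker addresses."""
    return [addr.strip() for addr in value.split(",") if addr.strip()]

def initialize_rdma():
    """Enter distributed mode only when MLX_GRPC_SERVERS is set;
    otherwise stay in single-node mode and skip collective ops."""
    servers = os.environ.get("MLX_GRPC_SERVERS", "")
    if not servers:
        return None  # single-node mode
    workers = parse_grpc_servers(servers)
    try:
        import mlx.core as mx
        # mx.distributed.init() joins the process group; we assume the
        # JACCL backend picks up the worker list from the environment.
        return mx.distributed.init()
    except Exception:
        # mlx (or its distributed backend) unavailable in this environment
        return workers
```

Model loading and generation would then branch on whether a distributed group was returned before issuing collective operations.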

CLI Worker Support (core/cli/worker/worker_mlx.go)

  • New mlx-rdma worker type for launching MLX backends in distributed mode
  • Follows the same pattern as llama.cpp workers (no P2P registration, uses MLX_GRPC_SERVERS)

P2P Integration (core/cli/run.go)

  • Sets MLX_GRPC_SERVERS environment variable alongside LLAMACPP_GRPC_SERVERS in TunnelCallback
  • Enables automatic worker discovery and IP collection for MLX RDMA workers
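
The address list is presumably serialized the same way as LLAMACPP_GRPC_SERVERS. A sketch of how the main instance might assemble the variable from discovered worker IPs (the function name and default port are hypothetical):

```python
def build_mlx_grpc_servers(worker_ips, port=50051):
    """Join discovered worker IPs into the comma-separated host:port
    format, assumed to mirror LLAMACPP_GRPC_SERVERS."""
    return ",".join(f"{ip}:{port}" for ip in worker_ips)
```

The main instance would export the result as MLX_GRPC_SERVERS before launching the MLX backend process.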

Design Decisions

  1. No hostfile needed: Since workers are launched manually via local-ai worker mlx-rdma, we don't need automatic process spawning via mlx.launch --hostfile
  2. Environment variable-based: Uses MLX_GRPC_SERVERS (same as LLAMACPP_GRPC_SERVERS) to pass worker IPs to the backend
  3. P2P reuse: Leverages existing LocalAI P2P infrastructure for worker discovery, but backend handles RDMA coordination

Testing

  1. Launch main instance with P2P enabled: local-ai run --p2p --token <token>
  2. Launch workers on each node: local-ai worker mlx-rdma
  3. Workers register via P2P; main instance sets MLX_GRPC_SERVERS env var
  4. MLX backend initializes RDMA when MLX_GRPC_SERVERS is set

Notes

  • Requires mlx with JACCL backend support (mlx-jaccl-cluster integration)
  • Current implementation assumes all workers have identical model paths
  • Future work: Add model sharding via model.shard(mx.distributed.world_size())
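
As a rough illustration of the sharding mentioned in the last bullet, one simple scheme gives each rank a contiguous slice of layer indices (a sketch only; the eventual model.shard() API may partition weights differently):

```python
def layer_slice(num_layers, world_size, rank):
    """Assign each rank a contiguous range of layer indices,
    spreading any remainder across the earlier ranks."""
    base, extra = divmod(num_layers, world_size)
    start = rank * base + min(rank, extra)
    end = start + base + (1 if rank < extra else 0)
    return range(start, end)
```

With 10 layers across 3 workers, ranks 0, 1, and 2 would own layers 0-3, 4-6, and 7-9 respectively.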

- Add mlx_rdma_enabled environment variable check in MLX backend
- Initialize mlx.distributed when MLX_GRPC_SERVERS is set
- Add mlx-rdma worker type to CLI worker commands
- Set MLX_GRPC_SERVERS env var alongside LLAMACPP_GRPC_SERVERS