xFormers memory-efficient attention not supported on CPU  #310

@Wasiq1123

Description

I’m trying to run Depth-Anything-V2 with xFormers on my system (CPU only), and I get the following error:

NotImplementedError: No operator found for memory_efficient_attention_forward with inputs:
query : shape=(1, 1531, 6, 64) (torch.float32)
key : shape=(1, 1531, 6, 64) (torch.float32)
value : shape=(1, 1531, 6, 64) (torch.float32)
attn_bias : <class 'NoneType'>
p : 0.0
fa3F@2.8.3-133-gde1584b is not supported because:
device=cpu (supported: {'cuda'})
dtype=torch.float32 (supported: {torch.bfloat16, torch.float16})

It seems that memory-efficient attention in xFormers requires CUDA.
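The same attention computation runs fine on CPU in float32 through plain PyTorch. A minimal sketch with the exact shapes from the error message, using `torch.nn.functional.scaled_dot_product_attention` (available since PyTorch 2.0); note that xFormers uses a (batch, seq, heads, dim) layout while torch's SDPA expects (batch, heads, seq, dim), hence the transposes:

```python
import torch
import torch.nn.functional as F

# Shapes from the error message: (batch, seq_len, heads, head_dim)
q = torch.randn(1, 1531, 6, 64)
k = torch.randn(1, 1531, 6, 64)
v = torch.randn(1, 1531, 6, 64)

# xFormers' memory_efficient_attention takes (B, M, H, K); torch's SDPA
# expects (B, H, M, K), so swap the head and sequence axes around the call.
out = F.scaled_dot_product_attention(
    q.transpose(1, 2), k.transpose(1, 2), v.transpose(1, 2)
).transpose(1, 2)

print(out.shape)  # torch.Size([1, 1531, 6, 64])
```

So the limitation is in the xFormers kernels (CUDA-only, fp16/bf16-only), not in the attention math itself.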

My environment:

  • PyTorch 2.5.7 / 2.8.3
  • xFormers installed from pip
  • Running on CPU only (no GPU available)
  • Python 3.10, Ubuntu 22.04

Question:
Is there a way to run Depth-Anything-V2 on CPU without a GPU, or do I have to disable memory-efficient attention? How can I fix this error on CPU?
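One hedged workaround, assuming the repository's vendored DINOv2 layers follow the upstream pattern (worth verifying in `depth_anything_v2/dinov2_layers/attention.py` in your checkout): upstream DINOv2 guards its xFormers import with a try/except and, in newer releases, an `XFORMERS_DISABLED` environment variable. If your copy honours the variable, setting it before the model import forces the plain-attention path:

```python
import os

# Assumption: the vendored DINOv2 attention layers check XFORMERS_DISABLED
# at import time, as upstream DINOv2 does; verify this in
# depth_anything_v2/dinov2_layers/attention.py before relying on it.
os.environ["XFORMERS_DISABLED"] = "1"

# Import the model only AFTER setting the variable, because the
# xFormers availability check runs when the module is first imported.
# from depth_anything_v2.dpt import DepthAnythingV2

print(os.environ["XFORMERS_DISABLED"])
```

If your copy of the attention module ignores the variable, uninstalling xFormers (`pip uninstall xformers`) should have the same effect, since the guarded import falls back to standard attention on ImportError.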

Below is my code:

# If this file gives an import error, run it from this directory: /testing_model/depth_models/src/Depth-Anything-V2

import cv2
import torch
import sys
sys.path.append('/home/wasiq/testing_model/depth_models/src/Depth-Anything-V2')
from depth_anything_v2.dpt import DepthAnythingV2

model_configs = {
    'vits': {'encoder': 'vits', 'features': 64, 'out_channels': [48, 96, 192, 384]},
    'vitb': {'encoder': 'vitb', 'features': 128, 'out_channels': [96, 192, 384, 768]},
    'vitl': {'encoder': 'vitl', 'features': 256, 'out_channels': [256, 512, 1024, 1024]}
}

encoder = 'vitb' # or 'vits', 'vitl'
dataset = 'hypersim' # 'hypersim' for indoor model, 'vkitti' for outdoor model
max_depth = 20 # 20 for indoor model, 80 for outdoor model

# Construct the model once, keeping max_depth (the second construction in the
# original script overwrote the first and silently dropped max_depth)
model = DepthAnythingV2(**{**model_configs[encoder], 'max_depth': max_depth})
model.load_state_dict(torch.load(f'/home/wasiq/testing_model/depth_models/src/Depth-Anything-V2/metric_depth/checkpoints/depth_anything_v2_metric_{dataset}_{encoder}.pth', map_location='cpu'))
model.eval()

raw_img = cv2.imread('your/image/path')
depth = model.infer_image(raw_img) # HxW depth map in meters in numpy
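Once `infer_image` returns, a quick visual sanity check is to normalize the metric map to 8-bit grayscale. A minimal sketch, using a random array as a hypothetical stand-in for the real output:

```python
import numpy as np

# Hypothetical stand-in for model.infer_image() output: an HxW float32
# depth map in meters (518x518 used here purely for illustration).
depth = np.random.rand(518, 518).astype(np.float32) * 20.0

# Scale to 0-255 for an 8-bit preview; the epsilon guards against
# division by zero on a constant-depth map.
span = float(depth.max() - depth.min())
depth_vis = ((depth - depth.min()) / max(span, 1e-6) * 255.0).astype(np.uint8)

print(depth_vis.dtype, depth_vis.shape)
```

`cv2.imwrite('depth_preview.png', depth_vis)` would then write the preview, while the raw `depth` array keeps the metric values.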
