Skip triton sdpa in gemma3 #190

Gasoonjia · 2025-12-04T21:28:18Z

We skip the Triton SDPA kernel when deploying the Gemma3 model to the ExecuTorch CUDA backend, to prevent a performance regression. We will revert this config once our Triton/CUDA SDPA kernel uniformly beats the decomposed SDPA kernel.
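
For illustration only, the per-model skip described above could look roughly like the sketch below. Every name in it (`TRITON_SDPA_SKIP_LIST`, `use_triton_sdpa`, the `"llama"` and `"xnnpack"` strings in the usage note) is hypothetical and is not taken from this PR or from the ExecuTorch API; it only shows the idea of gating the kernel replacement on model type.

```python
# Hypothetical sketch: these names are illustrative and do not come from
# this PR or from ExecuTorch.
TRITON_SDPA_SKIP_LIST = frozenset({"gemma3"})


def use_triton_sdpa(model_type: str, backend: str) -> bool:
    """Return True if the Triton SDPA kernel should replace decomposed SDPA."""
    if backend != "cuda":
        # In this sketch the Triton kernel is only considered for CUDA.
        return False
    # Models on the skip list keep the decomposed SDPA path.
    return model_type.lower() not in TRITON_SDPA_SKIP_LIST
```

Under these assumptions, `use_triton_sdpa("gemma3", "cuda")` returns `False`, while a model that is not on the skip list, such as `use_triton_sdpa("llama", "cuda")`, returns `True`. Removing `"gemma3"` from the set would be the equivalent of reverting the config later.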

Also remove the unnecessary conv1d_to_conv2d decomposition; it is already handled inside ExecuTorch.

This reverts commit 6938f59.

This reverts commit 10f8a6e.

Gasoonjia added 2 commits December 4, 2025 13:25

init

ab36935

init

6938f59

larryliu0820 approved these changes Dec 5, 2025


Gasoonjia added 4 commits December 5, 2025 16:00

lint fix

10f8a6e

Revert "init"

3966d21

This reverts commit 6938f59.

Revert "lint fix"

0fe0aa4

This reverts commit 10f8a6e.

lint fix - 2

55da6fa

larryliu0820 merged commit eeafd42 into huggingface:main Dec 8, 2025
63 of 83 checks passed
