Add interleaving to sgemm and dgemm. Disentangle trmm/symm from gemm. #5573

almayne · 2025-12-17T09:10:07Z

This change adds interleaving to sgemm and dgemm copies and kernels for ARMV8SVE.
This required a degree of disentangling symm and trmm kernels from gemm. It should now be much easier to apply further optimisations to gemm.

The addition of interleaving provides a ~1.1-1.4% speedup on c7g (V1), with negligible changes (at most ~0.5%) on c8g (V2).

Taken over square matrix operations with size 2->2014, stepsize = 1:
Geometric mean for interleave/c7g_dgemm.txt: 0.9859023206257058
Geometric mean for interleave/c7g_sgemm.txt: 0.9887890902680289
Geometric mean for interleave/c8g_dgemm.txt: 0.9970050554316875
Geometric mean for interleave/c8g_sgemm.txt: 0.9948135816755502

We see an increase in the sgemm speedup (~2.4%) on c7g for larger matrix sizes.

Taken over square matrix operations with size 2,000->10,000, stepsize = 1,000:
Geometric mean for 64thread_interleave/c7g_dgemm.txt: 0.9865252964543917
Geometric mean for 64thread_interleave/c7g_sgemm.txt: 0.9762227312411808
Geometric mean for 64thread_interleave/c8g_dgemm.txt: 0.9997186302044462
Geometric mean for 64thread_interleave/c8g_sgemm.txt: 0.9996022927667269
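For readers mapping the geometric means above onto the quoted percentages, here is a minimal sketch of the arithmetic. It assumes each value is a ratio of new time to old time, so values below 1.0 mean the interleaved build is faster. The PR does not state how the ratios were formed, and the helper names `geometric_mean` and `percent_faster` are hypothetical, not part of OpenBLAS or its benchmark scripts.

```python
import math


def geometric_mean(ratios):
    """Geometric mean computed as exp(mean(log(r))); all ratios must be > 0."""
    return math.exp(sum(math.log(r) for r in ratios) / len(ratios))


def percent_faster(mean_ratio):
    """Read a new/old time ratio (assumed interpretation) as a % reduction."""
    return (1.0 - mean_ratio) * 100.0


# Geometric means copied from the results listed above.
reported = {
    "interleave/c7g_dgemm": 0.9859023206257058,
    "interleave/c7g_sgemm": 0.9887890902680289,
    "64thread_interleave/c7g_dgemm": 0.9865252964543917,
    "64thread_interleave/c7g_sgemm": 0.9762227312411808,
}

for name, ratio in reported.items():
    print(f"{name}: {percent_faster(ratio):.2f}% less time")
```

Under that assumption, the c7g dgemm figure of 0.9859 corresponds to the ~1.4% quoted in the description, and the 64-thread c7g sgemm figure of 0.9762 to the ~2.4% quoted for larger sizes.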

aditew01 · 2025-12-17T13:18:24Z

@Mousius can you please have a look?

almayne and others added 4 commits December 18, 2025 14:42

Add interleaving to sgemm and dgemm. Disentangle trmm and symm from gemm. Co-authored-by: Chris Sidebottom <chris.sidebottom@arm.com>

f3c78c9

Fixed builds and added missing copyright notices. Also fixed formatting of copyright notices added in last commit.

651578e

Accommodate ex and quad precision builds.

959d3b3

Add new copy functions to ex and quad precision builds.

0a205ee

almayne force-pushed the sgemm_interleave branch from 8316dc1 to 0a205ee on December 18, 2025 15:26

Fix CMake build.

570adff
