Fix deformation gradient performance in 3D #1220

Open
efaulhaber wants to merge 4 commits into trixi-framework:main from efaulhaber:deformation-grad-3d

Conversation

@efaulhaber
Member

@svchb and @copilot This should be mathematically identical to the version on main, but please check carefully that I did not introduce an error here.

3D

| Machine | main | This PR (speedup) |
| --- | --- | --- |
| A4500 | 26.396 ms | 3.457 ms (7.64x) |
| Intel Xeon w9-3475X (x36) | 67.368 ms | 39.861 ms (1.69x) |
| H100 FP64 | 5.774 ms | 1.358 ms (4.25x) |
| H100 FP32 | 3.030 ms | 855.816 μs (3.54x) |

2D

| Machine | main | This PR (speedup) |
| --- | --- | --- |
| A4500 | 402.072 μs | 395.481 μs (1.02x) |
| Intel Xeon w9-3475X (x36) | 4.900 ms | 4.855 ms (1.01x) |
| H100 FP64 | 261.475 μs | 256.930 μs (1.02x) |
| H100 FP32 | 157.665 μs | 155.842 μs (1.01x) |

Contributor

Copilot AI left a comment

Pull request overview

Optimizes the Total Lagrangian SPH (TLSPH) 3D deformation gradient accumulation by algebraically reordering the tensor-product/matrix multiplication to reduce computational cost while preserving the mathematical result.

Changes:

  • Rewrites the deformation gradient neighbor contribution from `pos_diff * grad_kernel' * L_a'` into an equivalent transposed form `(L_a * grad_kernel * pos_diff')'`.
  • Introduces a temporary `F_T` (the transposed contribution) to make the faster evaluation order explicit and readable.
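
The equivalence claimed above follows from the transpose rule (ABC)' = C'B'A'. The following is a minimal Python sketch, standing in for the Julia code in the PR, that checks the identity numerically and counts scalar multiplications for both evaluation orders. It assumes `L_a` is a 3x3 matrix, `pos_diff` and `grad_kernel` are 3-vectors, and the original expression is evaluated left to right; the helper functions and sample values are hypothetical and not taken from TrixiParticles.jl.

```python
def outer(u, v):
    """Outer product u * v' as a nested list."""
    return [[ui * vj for vj in v] for ui in u]

def matmul(A, B):
    """Dense matrix product A * B."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def matvec(A, x):
    """Matrix-vector product A * x."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def contribution_original(pos_diff, grad_kernel, L_a):
    # (pos_diff * grad_kernel') * L_a': outer product, then 3x3 times 3x3
    return matmul(outer(pos_diff, grad_kernel), transpose(L_a))

def contribution_reordered(pos_diff, grad_kernel, L_a):
    # F_T = (L_a * grad_kernel) * pos_diff': matrix-vector product, then outer product
    F_T = outer(matvec(L_a, grad_kernel), pos_diff)
    return transpose(F_T)

# Arbitrary sample values (small integers so the comparison is exact)
pos_diff = [1, -2, 3]
grad_kernel = [4, 0, -1]
L_a = [[2, 1, 0], [0, 3, -1], [1, 0, 5]]

same = (contribution_original(pos_diff, grad_kernel, L_a)
        == contribution_reordered(pos_diff, grad_kernel, L_a))

n = 3
mults_original = n * n + n ** 3   # outer product + matrix-matrix product
mults_reordered = n * n + n * n   # matrix-vector product + outer product
```

Under these assumptions the two forms agree, and the multiplication count for n = 3 drops from 36 to 18. The count ignores additions, memory traffic, and compiler optimizations, so it is only a rough indication of why the reordered form can be cheaper, not an explanation of the measured speedups.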

@efaulhaber efaulhaber marked this pull request as ready for review May 27, 2026 11:55
@efaulhaber efaulhaber requested a review from svchb May 27, 2026 11:55
@codecov

codecov Bot commented May 27, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 90.04%. Comparing base (6bb102a) to head (e963431).

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1220      +/-   ##
==========================================
- Coverage   90.05%   90.04%   -0.01%     
==========================================
  Files         136      136              
  Lines       10594    10596       +2     
==========================================
+ Hits         9540     9541       +1     
- Misses       1054     1055       +1     
| Flag | Coverage Δ |
| --- | --- |
| total | 90.05% <100.00%> (+<0.01%) ⬆️ |
| unit | 70.77% <100.00%> (?) |

Comment thread src/schemes/structure/total_lagrangian_sph/system.jl Outdated
@efaulhaber efaulhaber requested a review from svchb May 29, 2026 14:23
3 participants