jkoritzinsky (Member) commented Jul 17, 2025

Use the shared managed wait subsystem for CoreCLR's managed code instead of the Win32 PAL.

Also, remove named mutex support, and Mutex support in general, from the CoreCLR PAL.

Unblocks #115685

dotnet-policy-service (Contributor) commented

Tagging subscribers to this area: @mangod9
See info in area-owners.md if you want to be subscribed.

jkoritzinsky added this to the 11.0.0 milestone Jul 17, 2025
jkoritzinsky added the blocked label (Issue/PR is blocked on something - see comments) Jul 17, 2025
jkoritzinsky force-pushed the coreclr-managed-wait branch from 25446d0 to 63dc9af Jul 21, 2025 19:44
jkoritzinsky force-pushed the coreclr-managed-wait branch 3 times, most recently from 06de3b2 to 63a1aa6 Jul 31, 2025 16:44

jkotas (Member) commented Dec 5, 2025

I think the main remaining piece is perf testing.

jkoritzinsky (Member, Author) commented

@MihuBot benchmark System.Collections.Concurrent

jkoritzinsky (Member, Author) commented

@MihuBot benchmark System.Threading

MihuBot commented Dec 5, 2025

System.Collections.Concurrent.IsEmpty_String_
BenchmarkDotNet v0.14.1-nightly.20250107.205, Linux Ubuntu 22.04.5 LTS (Jammy Jellyfish)
AMD EPYC 9V74, 1 CPU, 8 logical and 4 physical cores
  Job-UATSTI : .NET 10.0.0 (42.42.42.42424), X64 RyuJIT AVX-512F+CD+BW+DQ+VL+VBMI
  Job-YKZQKY : .NET 10.0.0 (42.42.42.42424), X64 RyuJIT AVX-512F+CD+BW+DQ+VL+VBMI
OutlierMode=Default  PowerPlanMode=00000000-0000-0000-0000-000000000000  IterationTime=250ms
MaxIterationCount=20  MemoryRandomization=Default  MinIterationCount=15
WarmupCount=1
Method Toolchain Size Mean Error Ratio Allocated Alloc Ratio
Dictionary Main 0 65.9281 ns 0.0684 ns 1.00 - NA
Dictionary PR 0 66.1361 ns 0.1147 ns 1.00 - NA
Queue Main 0 1.7718 ns 0.0108 ns 1.00 - NA
Queue PR 0 1.7450 ns 0.0111 ns 0.98 - NA
Stack Main 0 0.0000 ns 0.0000 ns ? - ?
Stack PR 0 0.0016 ns 0.0023 ns ? - ?
Bag Main 0 6.4287 ns 0.0098 ns 1.00 - NA
Bag PR 0 6.3692 ns 0.0087 ns 0.99 - NA
Dictionary Main 512 2.9051 ns 0.0023 ns 1.00 - NA
Dictionary PR 512 2.9092 ns 0.0102 ns 1.00 - NA
Queue Main 512 1.2462 ns 0.0063 ns 1.00 - NA
Queue PR 512 1.2480 ns 0.0070 ns 1.00 - NA
Stack Main 512 0.0000 ns 0.0000 ns ? - ?
Stack PR 512 0.0005 ns 0.0004 ns ? - ?
Bag Main 512 5.7079 ns 0.0130 ns 1.00 - NA
Bag PR 512 5.8231 ns 0.0089 ns 1.02 - NA
System.Collections.Concurrent.IsEmpty_Int32_
BenchmarkDotNet v0.14.1-nightly.20250107.205, Linux Ubuntu 22.04.5 LTS (Jammy Jellyfish)
AMD EPYC 9V74, 1 CPU, 8 logical and 4 physical cores
  Job-UATSTI : .NET 10.0.0 (42.42.42.42424), X64 RyuJIT AVX-512F+CD+BW+DQ+VL+VBMI
  Job-YKZQKY : .NET 10.0.0 (42.42.42.42424), X64 RyuJIT AVX-512F+CD+BW+DQ+VL+VBMI
OutlierMode=Default  PowerPlanMode=00000000-0000-0000-0000-000000000000  IterationTime=250ms
MaxIterationCount=20  MemoryRandomization=Default  MinIterationCount=15
WarmupCount=1
Method Toolchain Size Mean Error Ratio Allocated Alloc Ratio
Dictionary Main 0 66.5415 ns 0.1328 ns 1.00 - NA
Dictionary PR 0 65.9239 ns 0.1428 ns 0.99 - NA
Queue Main 0 1.7547 ns 0.0114 ns 1.00 - NA
Queue PR 0 1.7366 ns 0.0111 ns 0.99 - NA
Stack Main 0 0.0016 ns 0.0006 ns 1.15 - NA
Stack PR 0 0.0006 ns 0.0007 ns 0.44 - NA
Bag Main 0 3.7171 ns 0.0085 ns 1.00 - NA
Bag PR 0 3.9912 ns 0.0134 ns 1.07 - NA
Dictionary Main 512 2.9416 ns 0.0026 ns 1.00 - NA
Dictionary PR 512 2.9062 ns 0.0057 ns 0.99 - NA
Queue Main 512 1.2843 ns 0.0057 ns 1.00 - NA
Queue PR 512 1.2502 ns 0.0045 ns 0.97 - NA
Stack Main 512 0.0000 ns 0.0000 ns ? - ?
Stack PR 512 0.0004 ns 0.0004 ns ? - ?
Bag Main 512 3.0545 ns 0.0107 ns 1.00 - NA
Bag PR 512 3.0413 ns 0.0096 ns 1.00 - NA
System.Collections.Concurrent.Count_String_
BenchmarkDotNet v0.14.1-nightly.20250107.205, Linux Ubuntu 22.04.5 LTS (Jammy Jellyfish)
AMD EPYC 9V74, 1 CPU, 8 logical and 4 physical cores
  Job-UATSTI : .NET 10.0.0 (42.42.42.42424), X64 RyuJIT AVX-512F+CD+BW+DQ+VL+VBMI
  Job-YKZQKY : .NET 10.0.0 (42.42.42.42424), X64 RyuJIT AVX-512F+CD+BW+DQ+VL+VBMI
OutlierMode=Default  PowerPlanMode=00000000-0000-0000-0000-000000000000  IterationTime=250ms
MaxIterationCount=20  MemoryRandomization=Default  MinIterationCount=15
WarmupCount=1
Method Toolchain Size Mean Error Ratio Allocated Alloc Ratio
Dictionary Main 512 65.953 ns 0.1158 ns 1.00 - NA
Dictionary PR 512 64.920 ns 0.1756 ns 0.98 - NA
Queue Main 512 2.467 ns 0.0237 ns 1.00 - NA
Queue PR 512 2.464 ns 0.0075 ns 1.00 - NA
Queue_EnqueueCountDequeue Main 512 13.757 ns 0.0478 ns 1.00 - NA
Queue_EnqueueCountDequeue PR 512 14.674 ns 0.0275 ns 1.07 - NA
Stack Main 512 566.391 ns 0.1360 ns 1.00 - NA
Stack PR 512 566.208 ns 0.0894 ns 1.00 - NA
Bag Main 512 17.084 ns 0.0182 ns 1.00 - NA
Bag PR 512 18.998 ns 0.0321 ns 1.11 - NA
System.Collections.Concurrent.Count_Int32_
BenchmarkDotNet v0.14.1-nightly.20250107.205, Linux Ubuntu 22.04.5 LTS (Jammy Jellyfish)
AMD EPYC 9V74, 1 CPU, 8 logical and 4 physical cores
  Job-UATSTI : .NET 10.0.0 (42.42.42.42424), X64 RyuJIT AVX-512F+CD+BW+DQ+VL+VBMI
  Job-YKZQKY : .NET 10.0.0 (42.42.42.42424), X64 RyuJIT AVX-512F+CD+BW+DQ+VL+VBMI
OutlierMode=Default  PowerPlanMode=00000000-0000-0000-0000-000000000000  IterationTime=250ms
MaxIterationCount=20  MemoryRandomization=Default  MinIterationCount=15
WarmupCount=1
Method Toolchain Size Mean Error Ratio Allocated Alloc Ratio
Dictionary Main 512 65.527 ns 0.1690 ns 1.00 - NA
Dictionary PR 512 65.280 ns 0.0769 ns 1.00 - NA
Queue Main 512 2.365 ns 0.0143 ns 1.00 - NA
Queue PR 512 2.368 ns 0.0112 ns 1.00 - NA
Queue_EnqueueCountDequeue Main 512 11.204 ns 0.0417 ns 1.00 - NA
Queue_EnqueueCountDequeue PR 512 12.085 ns 0.0496 ns 1.08 - NA
Stack Main 512 565.480 ns 0.0676 ns 1.00 - NA
Stack PR 512 566.410 ns 0.2543 ns 1.00 - NA
Bag Main 512 17.552 ns 0.0492 ns 1.00 - NA
Bag PR 512 17.112 ns 0.0326 ns 0.97 - NA
System.Collections.Concurrent.AddRemoveFromSameThreads_String_
BenchmarkDotNet v0.14.1-nightly.20250107.205, Linux Ubuntu 22.04.5 LTS (Jammy Jellyfish)
AMD EPYC 9V74, 1 CPU, 8 logical and 4 physical cores
  Job-ZRSSBP : .NET 10.0.0 (42.42.42.42424), X64 RyuJIT AVX-512F+CD+BW+DQ+VL+VBMI
  Job-DVTPLJ : .NET 10.0.0 (42.42.42.42424), X64 RyuJIT AVX-512F+CD+BW+DQ+VL+VBMI
OutlierMode=Default  PowerPlanMode=00000000-0000-0000-0000-000000000000  InvocationCount=1
IterationTime=250ms  MaxIterationCount=20  MaxWarmupIterationCount=10
MemoryRandomization=Default  MinIterationCount=15  MinWarmupIterationCount=6
UnrollFactor=1  WarmupCount=-1
Method Toolchain Size Mean Error Ratio Allocated Alloc Ratio
ConcurrentBag Main 2000000 233.8 ms 8.11 ms 1.00 1.23 KB 1.00
ConcurrentBag PR 2000000 231.8 ms 9.77 ms 0.99 1.49 KB 1.22
ConcurrentStack Main 2000000 108.3 ms 5.33 ms 1.00 125000.76 KB 1.00
ConcurrentStack PR 2000000 108.0 ms 3.28 ms 1.00 125000.66 KB 1.00
ConcurrentQueue Main 2000000 358.5 ms 8.95 ms 1.00 129.37 KB 1.00
ConcurrentQueue PR 2000000 353.1 ms 13.82 ms 0.99 129.74 KB 1.00
System.Collections.Concurrent.AddRemoveFromSameThreads_Int32_
BenchmarkDotNet v0.14.1-nightly.20250107.205, Linux Ubuntu 22.04.5 LTS (Jammy Jellyfish)
AMD EPYC 9V74, 1 CPU, 8 logical and 4 physical cores
  Job-ZRSSBP : .NET 10.0.0 (42.42.42.42424), X64 RyuJIT AVX-512F+CD+BW+DQ+VL+VBMI
  Job-DVTPLJ : .NET 10.0.0 (42.42.42.42424), X64 RyuJIT AVX-512F+CD+BW+DQ+VL+VBMI
OutlierMode=Default  PowerPlanMode=00000000-0000-0000-0000-000000000000  InvocationCount=1
IterationTime=250ms  MaxIterationCount=20  MaxWarmupIterationCount=10
MemoryRandomization=Default  MinIterationCount=15  MinWarmupIterationCount=6
UnrollFactor=1  WarmupCount=-1
Method Toolchain Size Mean Error Ratio Allocated Alloc Ratio
ConcurrentBag Main 2000000 167.48 ms 15.191 ms 1.01 1000 B 1.00
ConcurrentBag PR 2000000 171.39 ms 9.409 ms 1.03 1240 B 1.24
ConcurrentStack Main 2000000 73.30 ms 4.012 ms 1.00 128000440 B 1.00
ConcurrentStack PR 2000000 78.24 ms 6.173 ms 1.07 128000992 B 1.00
ConcurrentQueue Main 2000000 344.87 ms 12.842 ms 1.00 34144 B 1.00
ConcurrentQueue PR 2000000 338.91 ms 14.231 ms 0.98 34352 B 1.01
System.Collections.Concurrent.AddRemoveFromDifferentThreads_String_
BenchmarkDotNet v0.14.1-nightly.20250107.205, Linux Ubuntu 22.04.5 LTS (Jammy Jellyfish)
AMD EPYC 9V74, 1 CPU, 8 logical and 4 physical cores
  Job-ZRSSBP : .NET 10.0.0 (42.42.42.42424), X64 RyuJIT AVX-512F+CD+BW+DQ+VL+VBMI
  Job-DVTPLJ : .NET 10.0.0 (42.42.42.42424), X64 RyuJIT AVX-512F+CD+BW+DQ+VL+VBMI
OutlierMode=Default  PowerPlanMode=00000000-0000-0000-0000-000000000000  InvocationCount=1
IterationTime=250ms  MaxIterationCount=20  MaxWarmupIterationCount=10
MemoryRandomization=Default  MinIterationCount=15  MinWarmupIterationCount=6
UnrollFactor=1  WarmupCount=-1
Method Toolchain Size Mean Error Ratio Allocated Alloc Ratio
ConcurrentBag Main 2000000 178.17 ms 21.943 ms 1.02 32 MB 1.00
ConcurrentBag PR 2000000 182.69 ms 20.464 ms 1.05 32 MB 1.00
ConcurrentStack Main 2000000 60.75 ms 9.392 ms 1.03 61.04 MB 1.00
ConcurrentStack PR 2000000 56.43 ms 10.640 ms 0.96 61.04 MB 1.00
ConcurrentQueue Main 2000000 40.87 ms 18.326 ms 1.35 8 MB 1.00
ConcurrentQueue PR 2000000 37.52 ms 14.245 ms 1.24 8 MB 1.00
System.Collections.Concurrent.AddRemoveFromDifferentThreads_Int32_
BenchmarkDotNet v0.14.1-nightly.20250107.205, Linux Ubuntu 22.04.5 LTS (Jammy Jellyfish)
AMD EPYC 9V74, 1 CPU, 8 logical and 4 physical cores
  Job-ZRSSBP : .NET 10.0.0 (42.42.42.42424), X64 RyuJIT AVX-512F+CD+BW+DQ+VL+VBMI
  Job-DVTPLJ : .NET 10.0.0 (42.42.42.42424), X64 RyuJIT AVX-512F+CD+BW+DQ+VL+VBMI
OutlierMode=Default  PowerPlanMode=00000000-0000-0000-0000-000000000000  InvocationCount=1
IterationTime=250ms  MaxIterationCount=20  MaxWarmupIterationCount=10
MemoryRandomization=Default  MinIterationCount=15  MinWarmupIterationCount=6
UnrollFactor=1  WarmupCount=-1
Method Toolchain Size Mean Error Ratio Allocated Alloc Ratio
ConcurrentBag Main 2000000 169.86 ms 27.710 ms 1.04 16 MB 1.00
ConcurrentBag PR 2000000 168.26 ms 28.765 ms 1.03 16 MB 1.00
ConcurrentStack Main 2000000 55.38 ms 8.712 ms 1.03 61.04 MB 1.00
ConcurrentStack PR 2000000 51.83 ms 6.964 ms 0.96 61.04 MB 1.00
ConcurrentQueue Main 2000000 33.04 ms 12.851 ms 1.23 1 MB 1.00
ConcurrentQueue PR 2000000 34.53 ms 13.761 ms 1.29 1 MB 1.00

MihuBot commented Dec 5, 2025

System.Threading.Tests.Perf_Volatile
BenchmarkDotNet v0.14.1-nightly.20250107.205, Linux Ubuntu 22.04.5 LTS (Jammy Jellyfish)
AMD EPYC 9V74, 1 CPU, 8 logical and 4 physical cores
  Job-IXTPWL : .NET 10.0.0 (42.42.42.42424), X64 RyuJIT AVX-512F+CD+BW+DQ+VL+VBMI
  Job-XGVMLJ : .NET 10.0.0 (42.42.42.42424), X64 RyuJIT AVX-512F+CD+BW+DQ+VL+VBMI
  Job-LIYULD : .NET 10.0.0 (42.42.42.42424), X64 RyuJIT AVX-512F+CD+BW+DQ+VL+VBMI
  Job-BIOQYX : .NET 10.0.0 (42.42.42.42424), X64 RyuJIT AVX-512F+CD+BW+DQ+VL+VBMI
PowerPlanMode=00000000-0000-0000-0000-000000000000  IterationTime=250ms  MaxIterationCount=20
MinIterationCount=15  WarmupCount=1
Method Toolchain Mean Error Ratio Allocated Alloc Ratio
Write_double Main 0.0011 ns 0.0004 ns 1.14 - NA
Write_double PR 0.0170 ns 0.0002 ns 17.78 - NA
Read_double Main 0.0000 ns 0.0000 ns ? - ?
Read_double PR 0.0000 ns 0.0000 ns ? - ?
System.Threading.Tests.Perf_Timer
BenchmarkDotNet v0.14.1-nightly.20250107.205, Linux Ubuntu 22.04.5 LTS (Jammy Jellyfish)
AMD EPYC 9V74, 1 CPU, 8 logical and 4 physical cores
  Job-LIYULD : .NET 10.0.0 (42.42.42.42424), X64 RyuJIT AVX-512F+CD+BW+DQ+VL+VBMI
  Job-BIOQYX : .NET 10.0.0 (42.42.42.42424), X64 RyuJIT AVX-512F+CD+BW+DQ+VL+VBMI
  Job-IXTPWL : .NET 10.0.0 (42.42.42.42424), X64 RyuJIT AVX-512F+CD+BW+DQ+VL+VBMI
  Job-XGVMLJ : .NET 10.0.0 (42.42.42.42424), X64 RyuJIT AVX-512F+CD+BW+DQ+VL+VBMI
PowerPlanMode=00000000-0000-0000-0000-000000000000  IterationTime=250ms  MaxIterationCount=20
MinIterationCount=15  WarmupCount=1
Method Toolchain Mean Error Ratio Allocated Alloc Ratio
ShortScheduleAndDispose Main 81.69 ns 1.196 ns 1.00 120 B 1.00
ShortScheduleAndDispose PR 79.88 ns 1.156 ns 0.98 120 B 1.00
LongScheduleAndDispose Main 82.33 ns 0.966 ns 1.00 120 B 1.00
LongScheduleAndDispose PR 77.80 ns 0.852 ns 0.95 120 B 1.00
ScheduleManyThenDisposeMany Main 251,127,725.50 ns 4,960,544.997 ns 1.00 144000792 B 1.00
ScheduleManyThenDisposeMany PR 250,680,018.25 ns 4,756,751.471 ns 1.00 144001080 B 1.00
ShortScheduleAndDisposeWithFiringTimers Main 92.33 ns 3.218 ns 1.00 144 B 1.00
ShortScheduleAndDisposeWithFiringTimers PR 91.46 ns 3.550 ns 0.99 144 B 1.00
SynchronousContention Main 1,207,031,754.20 ns 9,747,908.067 ns 1.00 1152000840 B 1.00
SynchronousContention PR 1,199,816,590.07 ns 13,010,584.181 ns 0.99 1152001464 B 1.00
AsynchronousContention Main 1,274,214,338.07 ns 10,389,910.435 ns 1.00 1152002936 B 1.00
AsynchronousContention PR 1,269,298,031.00 ns 3,989,103.185 ns 1.00 1152035336 B 1.00
System.Threading.Tests.Perf_ThreadStatic
BenchmarkDotNet v0.14.1-nightly.20250107.205, Linux Ubuntu 22.04.5 LTS (Jammy Jellyfish)
AMD EPYC 9V74, 1 CPU, 8 logical and 4 physical cores
  Job-IXTPWL : .NET 10.0.0 (42.42.42.42424), X64 RyuJIT AVX-512F+CD+BW+DQ+VL+VBMI
  Job-XGVMLJ : .NET 10.0.0 (42.42.42.42424), X64 RyuJIT AVX-512F+CD+BW+DQ+VL+VBMI
OutlierMode=Default  PowerPlanMode=00000000-0000-0000-0000-000000000000  IterationTime=250ms
MaxIterationCount=20  MemoryRandomization=Default  MinIterationCount=15
WarmupCount=1
Method Toolchain Mean Error Ratio Allocated Alloc Ratio
GetThreadStatic Main 1.380 ns 0.0007 ns 1.00 - NA
GetThreadStatic PR 1.309 ns 0.0818 ns 0.95 - NA
SetThreadStatic Main 3.022 ns 0.0023 ns 1.00 - NA
SetThreadStatic PR 3.010 ns 0.0116 ns 1.00 - NA
System.Threading.Tests.Perf_ThreadPool
BenchmarkDotNet v0.14.1-nightly.20250107.205, Linux Ubuntu 22.04.5 LTS (Jammy Jellyfish)
AMD EPYC 9V74, 1 CPU, 8 logical and 4 physical cores
  Job-IXTPWL : .NET 10.0.0 (42.42.42.42424), X64 RyuJIT AVX-512F+CD+BW+DQ+VL+VBMI
  Job-XGVMLJ : .NET 10.0.0 (42.42.42.42424), X64 RyuJIT AVX-512F+CD+BW+DQ+VL+VBMI
OutlierMode=Default  PowerPlanMode=00000000-0000-0000-0000-000000000000  IterationTime=250ms
MaxIterationCount=20  MemoryRandomization=Default  MinIterationCount=15
WarmupCount=1  RatioSD=0.02  Gen0=38000.0000
Method Toolchain WorkItemsPerCore Mean Error Ratio Allocated Alloc Ratio
QueueUserWorkItem_WaitCallback_Throughput Main 20000000 2.072 s 0.0270 s 1.00 614.35 MB 1.00
QueueUserWorkItem_WaitCallback_Throughput PR 20000000 2.067 s 0.0413 s 1.00 610.35 MB 0.99
System.Threading.Tests.Perf_Thread
BenchmarkDotNet v0.14.1-nightly.20250107.205, Linux Ubuntu 22.04.5 LTS (Jammy Jellyfish)
AMD EPYC 9V74, 1 CPU, 8 logical and 4 physical cores
  Job-IXTPWL : .NET 10.0.0 (42.42.42.42424), X64 RyuJIT AVX-512F+CD+BW+DQ+VL+VBMI
  Job-XGVMLJ : .NET 10.0.0 (42.42.42.42424), X64 RyuJIT AVX-512F+CD+BW+DQ+VL+VBMI
  Job-LIYULD : .NET 10.0.0 (42.42.42.42424), X64 RyuJIT AVX-512F+CD+BW+DQ+VL+VBMI
  Job-BIOQYX : .NET 10.0.0 (42.42.42.42424), X64 RyuJIT AVX-512F+CD+BW+DQ+VL+VBMI
PowerPlanMode=00000000-0000-0000-0000-000000000000  IterationTime=250ms  MaxIterationCount=20
MinIterationCount=15  WarmupCount=1
Method Toolchain Mean Error Ratio Allocated Alloc Ratio
CurrentThread Main 1.624 ns 0.0821 ns 1.00 - NA
CurrentThread PR 1.925 ns 0.1394 ns 1.19 - NA
GetCurrentProcessorId Main 1.910 ns 0.0023 ns 1.00 - NA
GetCurrentProcessorId PR 1.908 ns 0.0005 ns 1.00 - NA
System.Threading.Tests.Perf_SpinLock
BenchmarkDotNet v0.14.1-nightly.20250107.205, Linux Ubuntu 22.04.5 LTS (Jammy Jellyfish)
AMD EPYC 9V74, 1 CPU, 8 logical and 4 physical cores
  Job-IXTPWL : .NET 10.0.0 (42.42.42.42424), X64 RyuJIT AVX-512F+CD+BW+DQ+VL+VBMI
  Job-XGVMLJ : .NET 10.0.0 (42.42.42.42424), X64 RyuJIT AVX-512F+CD+BW+DQ+VL+VBMI
OutlierMode=Default  PowerPlanMode=00000000-0000-0000-0000-000000000000  IterationTime=250ms
MaxIterationCount=20  MemoryRandomization=Default  MinIterationCount=15
WarmupCount=1
Method Toolchain Mean Error Ratio Allocated Alloc Ratio
EnterExit Main 2.988 ns 0.0020 ns 1.00 - NA
EnterExit PR 2.991 ns 0.0034 ns 1.00 - NA
TryEnterExit Main 2.987 ns 0.0030 ns 1.00 - NA
TryEnterExit PR 2.993 ns 0.0049 ns 1.00 - NA
TryEnter_Fail Main 1.008 ns 0.0006 ns 1.00 - NA
TryEnter_Fail PR 1.012 ns 0.0029 ns 1.00 - NA
System.Threading.Tests.Perf_SemaphoreSlim
BenchmarkDotNet v0.14.1-nightly.20250107.205, Linux Ubuntu 22.04.5 LTS (Jammy Jellyfish)
AMD EPYC 9V74, 1 CPU, 8 logical and 4 physical cores
  Job-IXTPWL : .NET 10.0.0 (42.42.42.42424), X64 RyuJIT AVX-512F+CD+BW+DQ+VL+VBMI
  Job-XGVMLJ : .NET 10.0.0 (42.42.42.42424), X64 RyuJIT AVX-512F+CD+BW+DQ+VL+VBMI
OutlierMode=Default  PowerPlanMode=00000000-0000-0000-0000-000000000000  IterationTime=250ms
MaxIterationCount=20  MemoryRandomization=Default  MinIterationCount=15
WarmupCount=1
Method Toolchain Mean Error Ratio Allocated Alloc Ratio
ReleaseWait Main 22.59 ns 0.063 ns 1.00 - NA
ReleaseWait PR 26.95 ns 0.073 ns 1.19 - NA
ReleaseWaitAsync Main 21.71 ns 0.065 ns 1.00 - NA
ReleaseWaitAsync PR 23.09 ns 0.024 ns 1.06 - NA
ReleaseWaitAsync_WithCancellationToken Main 885.80 ns 17.162 ns 1.00 376 B 1.00
ReleaseWaitAsync_WithCancellationToken PR 841.77 ns 16.039 ns 0.95 376 B 1.00
ReleaseWaitAsync_WithTimeout Main 894.43 ns 19.792 ns 1.00 472 B 1.00
ReleaseWaitAsync_WithTimeout PR 865.09 ns 10.040 ns 0.97 472 B 1.00
ReleaseWaitAsync_WithCancellationTokenAndTimeout Main 928.95 ns 18.425 ns 1.00 472 B 1.00
ReleaseWaitAsync_WithCancellationTokenAndTimeout PR 926.33 ns 15.993 ns 1.00 472 B 1.00
System.Threading.Tests.Perf_Monitor
BenchmarkDotNet v0.14.1-nightly.20250107.205, Linux Ubuntu 22.04.5 LTS (Jammy Jellyfish)
AMD EPYC 9V74, 1 CPU, 8 logical and 4 physical cores
  Job-IXTPWL : .NET 10.0.0 (42.42.42.42424), X64 RyuJIT AVX-512F+CD+BW+DQ+VL+VBMI
  Job-XGVMLJ : .NET 10.0.0 (42.42.42.42424), X64 RyuJIT AVX-512F+CD+BW+DQ+VL+VBMI
OutlierMode=Default  PowerPlanMode=00000000-0000-0000-0000-000000000000  IterationTime=250ms
MaxIterationCount=20  MemoryRandomization=Default  MinIterationCount=15
WarmupCount=1
Method Toolchain Mean Error Ratio Allocated Alloc Ratio
EnterExit Main 6.637 ns 0.0100 ns 1.00 - NA
EnterExit PR 6.664 ns 0.0174 ns 1.00 - NA
TryEnterExit Main 6.684 ns 0.0197 ns 1.00 - NA
TryEnterExit PR 6.830 ns 0.0209 ns 1.02 - NA
System.Threading.Tests.Perf_Lock
BenchmarkDotNet v0.14.1-nightly.20250107.205, Linux Ubuntu 22.04.5 LTS (Jammy Jellyfish)
AMD EPYC 9V74, 1 CPU, 8 logical and 4 physical cores
  Job-IXTPWL : .NET 10.0.0 (42.42.42.42424), X64 RyuJIT AVX-512F+CD+BW+DQ+VL+VBMI
  Job-XGVMLJ : .NET 10.0.0 (42.42.42.42424), X64 RyuJIT AVX-512F+CD+BW+DQ+VL+VBMI
OutlierMode=Default  PowerPlanMode=00000000-0000-0000-0000-000000000000  IterationTime=250ms
MaxIterationCount=20  MemoryRandomization=Default  MinIterationCount=15
WarmupCount=1
Method Toolchain Mean Error Ratio Allocated Alloc Ratio
ReaderWriterLockSlimPerf Main 10.83 ns 0.025 ns 1.00 - NA
ReaderWriterLockSlimPerf PR 10.58 ns 0.033 ns 0.98 - NA
System.Threading.Tests.Perf_Interlocked
BenchmarkDotNet v0.14.1-nightly.20250107.205, Linux Ubuntu 22.04.5 LTS (Jammy Jellyfish)
AMD EPYC 9V74, 1 CPU, 8 logical and 4 physical cores
  Job-IXTPWL : .NET 10.0.0 (42.42.42.42424), X64 RyuJIT AVX-512F+CD+BW+DQ+VL+VBMI
  Job-XGVMLJ : .NET 10.0.0 (42.42.42.42424), X64 RyuJIT AVX-512F+CD+BW+DQ+VL+VBMI
OutlierMode=Default  PowerPlanMode=00000000-0000-0000-0000-000000000000  IterationTime=250ms
MaxIterationCount=20  MemoryRandomization=Default  MinIterationCount=15
WarmupCount=1
Method Toolchain Mean Error Ratio Allocated Alloc Ratio
Increment_int Main 0.6309 ns 0.0022 ns 1.00 - NA
Increment_int PR 0.6316 ns 0.0018 ns 1.00 - NA
Decrement_int Main 0.6326 ns 0.0030 ns 1.00 - NA
Decrement_int PR 0.6318 ns 0.0013 ns 1.00 - NA
Increment_long Main 0.6371 ns 0.0012 ns 1.00 - NA
Increment_long PR 0.6361 ns 0.0011 ns 1.00 - NA
Decrement_long Main 0.6318 ns 0.0017 ns 1.00 - NA
Decrement_long PR 0.6312 ns 0.0019 ns 1.00 - NA
Add_int Main 0.6332 ns 0.0012 ns 1.00 - NA
Add_int PR 0.6337 ns 0.0019 ns 1.00 - NA
Add_long Main 0.6316 ns 0.0016 ns 1.00 - NA
Add_long PR 0.6337 ns 0.0023 ns 1.00 - NA
Exchange_int Main 0.7084 ns 0.0005 ns 1.00 - NA
Exchange_int PR 0.7069 ns 0.0011 ns 1.00 - NA
Exchange_long Main 0.7117 ns 0.0027 ns 1.00 - NA
Exchange_long PR 0.7127 ns 0.0033 ns 1.00 - NA
CompareExchange_int Main 0.9242 ns 0.0009 ns 1.00 - NA
CompareExchange_int PR 0.9242 ns 0.0006 ns 1.00 - NA
CompareExchange_long Main 0.9246 ns 0.0007 ns 1.00 - NA
CompareExchange_long PR 0.9259 ns 0.0006 ns 1.00 - NA
CompareExchange_object_Match Main 0.8082 ns 0.0159 ns 1.00 - NA
CompareExchange_object_Match PR 0.5963 ns 0.0213 ns 0.74 - NA
CompareExchange_object_NoMatch Main 0.6798 ns 0.1542 ns 1.06 - NA
CompareExchange_object_NoMatch PR 0.6951 ns 0.1080 ns 1.08 - NA
System.Threading.Tests.Perf_EventWaitHandle
BenchmarkDotNet v0.14.1-nightly.20250107.205, Linux Ubuntu 22.04.5 LTS (Jammy Jellyfish)
AMD EPYC 9V74, 1 CPU, 8 logical and 4 physical cores
  Job-IXTPWL : .NET 10.0.0 (42.42.42.42424), X64 RyuJIT AVX-512F+CD+BW+DQ+VL+VBMI
  Job-XGVMLJ : .NET 10.0.0 (42.42.42.42424), X64 RyuJIT AVX-512F+CD+BW+DQ+VL+VBMI
OutlierMode=Default  PowerPlanMode=00000000-0000-0000-0000-000000000000  IterationTime=250ms
MaxIterationCount=20  MemoryRandomization=Default  MinIterationCount=15
WarmupCount=1
Method Toolchain Mean Error Ratio Allocated Alloc Ratio
Set_Reset Main 150.90 ns 0.419 ns 1.00 - NA
Set_Reset PR 18.53 ns 0.028 ns 0.12 - NA
System.Threading.Tests.Perf_CancellationToken
BenchmarkDotNet v0.14.1-nightly.20250107.205, Linux Ubuntu 22.04.5 LTS (Jammy Jellyfish)
AMD EPYC 9V74, 1 CPU, 8 logical and 4 physical cores
  Job-IXTPWL : .NET 10.0.0 (42.42.42.42424), X64 RyuJIT AVX-512F+CD+BW+DQ+VL+VBMI
  Job-XGVMLJ : .NET 10.0.0 (42.42.42.42424), X64 RyuJIT AVX-512F+CD+BW+DQ+VL+VBMI
  Job-LIYULD : .NET 10.0.0 (42.42.42.42424), X64 RyuJIT AVX-512F+CD+BW+DQ+VL+VBMI
  Job-BIOQYX : .NET 10.0.0 (42.42.42.42424), X64 RyuJIT AVX-512F+CD+BW+DQ+VL+VBMI
PowerPlanMode=00000000-0000-0000-0000-000000000000  IterationTime=250ms  MaxIterationCount=20
MinIterationCount=15  WarmupCount=1
Method Toolchain Mean Error Ratio Allocated Alloc Ratio
RegisterAndUnregister_Serial Main 21.938 ns 0.5815 ns 1.00 - NA
RegisterAndUnregister_Serial PR 22.223 ns 0.4732 ns 1.01 - NA
Cancel Main 43.773 ns 0.1376 ns 1.00 192 B 1.00
Cancel PR 43.745 ns 0.2647 ns 1.00 192 B 1.00
CreateLinkedTokenSource1 Main 25.538 ns 0.5531 ns 1.00 64 B 1.00
CreateLinkedTokenSource1 PR 26.917 ns 0.6143 ns 1.05 64 B 1.00
CreateLinkedTokenSource2 Main 43.592 ns 0.9936 ns 1.00 80 B 1.00
CreateLinkedTokenSource2 PR 43.460 ns 0.1365 ns 1.00 80 B 1.00
CreateLinkedTokenSource3 Main 68.244 ns 0.5427 ns 1.00 128 B 1.00
CreateLinkedTokenSource3 PR 73.360 ns 0.9278 ns 1.08 128 B 1.00
CreateTokenDispose Main 6.482 ns 0.1010 ns 1.00 48 B 1.00
CreateTokenDispose PR 6.064 ns 0.1107 ns 0.94 48 B 1.00
CreateRegisterDispose Main 38.460 ns 0.3876 ns 1.00 192 B 1.00
CreateRegisterDispose PR 38.213 ns 0.3140 ns 0.99 192 B 1.00
CreateManyRegisterDispose Main 13.181 ns 0.2233 ns 1.00 - NA
CreateManyRegisterDispose PR 12.777 ns 0.2279 ns 0.97 - NA
CreateManyRegisterMultipleDispose Main 94.063 ns 3.3137 ns 1.00 - NA
CreateManyRegisterMultipleDispose PR 95.822 ns 5.9894 ns 1.02 - NA
CancelAfter Main 56.551 ns 0.7401 ns 1.00 144 B 1.00
CancelAfter PR 58.721 ns 1.1196 ns 1.04 144 B 1.00
System.Threading.Tasks.Tests.Perf_AsyncMethods
BenchmarkDotNet v0.14.1-nightly.20250107.205, Linux Ubuntu 22.04.5 LTS (Jammy Jellyfish)
AMD EPYC 9V74, 1 CPU, 8 logical and 4 physical cores
  Job-IXTPWL : .NET 10.0.0 (42.42.42.42424), X64 RyuJIT AVX-512F+CD+BW+DQ+VL+VBMI
  Job-XGVMLJ : .NET 10.0.0 (42.42.42.42424), X64 RyuJIT AVX-512F+CD+BW+DQ+VL+VBMI
OutlierMode=Default  PowerPlanMode=00000000-0000-0000-0000-000000000000  IterationTime=250ms
MaxIterationCount=20  MemoryRandomization=Default  MinIterationCount=15
WarmupCount=1
Method Toolchain Mean Error Ratio Allocated Alloc Ratio
EmptyAsyncMethodInvocation Main 5.176 ns 0.0156 ns 1.00 - NA
EmptyAsyncMethodInvocation PR 5.064 ns 0.0046 ns 0.98 - NA
SingleYieldMethodInvocation Main 427.007 ns 0.9847 ns 1.00 96 B 1.00
SingleYieldMethodInvocation PR 442.520 ns 8.5759 ns 1.04 96 B 1.00
Yield Main 229.001 ns 13.2847 ns 1.00 - NA
Yield PR 250.996 ns 11.8351 ns 1.10 - NA
System.Threading.Tasks.ValueTaskPerfTest
BenchmarkDotNet v0.14.1-nightly.20250107.205, Linux Ubuntu 22.04.5 LTS (Jammy Jellyfish)
AMD EPYC 9V74, 1 CPU, 8 logical and 4 physical cores
  Job-GWZVEA : .NET 10.0.0 (42.42.42.42424), X64 RyuJIT AVX-512F+CD+BW+DQ+VL+VBMI
  Job-JUROJA : .NET 10.0.0 (42.42.42.42424), X64 RyuJIT AVX-512F+CD+BW+DQ+VL+VBMI
  Job-IBSGRJ : .NET 10.0.0 (42.42.42.42424), X64 RyuJIT AVX-512F+CD+BW+DQ+VL+VBMI
  Job-EDFFRX : .NET 10.0.0 (42.42.42.42424), X64 RyuJIT AVX-512F+CD+BW+DQ+VL+VBMI
PowerPlanMode=00000000-0000-0000-0000-000000000000  IterationTime=250ms  MaxIterationCount=20
MaxWarmupIterationCount=10  MinIterationCount=15  MinWarmupIterationCount=2
WarmupCount=-1
Method Toolchain Mean Error Ratio Allocated Alloc Ratio
Await_FromResult Main 5.403 ns 0.0128 ns 1.00 - NA
Await_FromResult PR 5.345 ns 0.0171 ns 0.99 - NA
Await_FromCompletedTask Main 11.896 ns 0.2817 ns 1.00 72 B 1.00
Await_FromCompletedTask PR 11.020 ns 0.0829 ns 0.93 72 B 1.00
Await_FromCompletedValueTaskSource Main 17.035 ns 0.0827 ns 1.00 72 B 1.00
Await_FromCompletedValueTaskSource PR 16.764 ns 0.1131 ns 0.98 72 B 1.00
CreateAndAwait_FromResult Main 5.390 ns 0.0084 ns 1.00 - NA
CreateAndAwait_FromResult PR 5.325 ns 0.0084 ns 0.99 - NA
CreateAndAwait_FromResult_ConfigureAwait Main 5.441 ns 0.0046 ns 1.00 - NA
CreateAndAwait_FromResult_ConfigureAwait PR 5.341 ns 0.0051 ns 0.98 - NA
CreateAndAwait_FromCompletedTask Main 6.995 ns 0.0141 ns 1.00 - NA
CreateAndAwait_FromCompletedTask PR 6.972 ns 0.0153 ns 1.00 - NA
CreateAndAwait_FromCompletedTask_ConfigureAwait Main 7.013 ns 0.0122 ns 1.00 - NA
CreateAndAwait_FromCompletedTask_ConfigureAwait PR 7.014 ns 0.0142 ns 1.00 - NA
CreateAndAwait_FromCompletedValueTaskSource Main 8.360 ns 0.0079 ns 1.00 - NA
CreateAndAwait_FromCompletedValueTaskSource PR 8.461 ns 0.0142 ns 1.01 - NA
CreateAndAwait_FromYieldingAsyncMethod Main 686.236 ns 2.6644 ns 1.00 205 B 1.00
CreateAndAwait_FromYieldingAsyncMethod PR 701.221 ns 5.4459 ns 1.02 204 B 1.00
CreateAndAwait_FromDelayedTCS Main 88.674 ns 0.5013 ns 1.00 216 B 1.00
CreateAndAwait_FromDelayedTCS PR 85.375 ns 0.4450 ns 0.96 216 B 1.00
Copy_PassAsArgumentAndReturn_FromResult Main 1.771 ns 0.0023 ns 1.00 - NA
Copy_PassAsArgumentAndReturn_FromResult PR 1.774 ns 0.0034 ns 1.00 - NA
Copy_PassAsArgumentAndReturn_FromTask Main 2.621 ns 0.0113 ns 1.00 - NA
Copy_PassAsArgumentAndReturn_FromTask PR 2.617 ns 0.0040 ns 1.00 - NA
Copy_PassAsArgumentAndReturn_FromValueTaskSource Main 3.816 ns 0.0050 ns 1.00 - NA
Copy_PassAsArgumentAndReturn_FromValueTaskSource PR 3.807 ns 0.0064 ns 1.00 - NA
CreateAndAwait_FromCompletedValueTaskSource_ConfigureAwait Main 8.407 ns 0.0109 ns 1.00 - NA
CreateAndAwait_FromCompletedValueTaskSource_ConfigureAwait PR 8.339 ns 0.0581 ns 0.99 - NA
System.Threading.Channels.Tests.UnboundedChannelPerfTests
BenchmarkDotNet v0.14.1-nightly.20250107.205, Linux Ubuntu 22.04.5 LTS (Jammy Jellyfish)
AMD EPYC 9V74, 1 CPU, 8 logical and 4 physical cores
  Job-IXTPWL : .NET 10.0.0 (42.42.42.42424), X64 RyuJIT AVX-512F+CD+BW+DQ+VL+VBMI
  Job-XGVMLJ : .NET 10.0.0 (42.42.42.42424), X64 RyuJIT AVX-512F+CD+BW+DQ+VL+VBMI
OutlierMode=Default  PowerPlanMode=00000000-0000-0000-0000-000000000000  IterationTime=250ms
MaxIterationCount=20  MemoryRandomization=Default  MinIterationCount=15
WarmupCount=1
Method Toolchain Mean Error Ratio Allocated Alloc Ratio
TryWriteThenTryRead Main 20.20 ns 0.032 ns 1.00 - NA
TryWriteThenTryRead PR 20.37 ns 0.040 ns 1.01 - NA
WriteAsyncThenReadAsync Main 24.98 ns 0.080 ns 1.00 - NA
WriteAsyncThenReadAsync PR 25.16 ns 0.041 ns 1.01 - NA
ReadAsyncThenWriteAsync Main 44.69 ns 0.110 ns 1.00 - NA
ReadAsyncThenWriteAsync PR 45.06 ns 0.031 ns 1.01 - NA
PingPong Main 10,937,232.08 ns 341,728.096 ns 1.00 1124 B 1.00
PingPong PR 10,928,424.51 ns 210,097.492 ns 1.00 1111 B 0.99
System.Threading.Channels.Tests.SpscUnboundedChannelPerfTests
BenchmarkDotNet v0.14.1-nightly.20250107.205, Linux Ubuntu 22.04.5 LTS (Jammy Jellyfish)
AMD EPYC 9V74, 1 CPU, 8 logical and 4 physical cores
  Job-IXTPWL : .NET 10.0.0 (42.42.42.42424), X64 RyuJIT AVX-512F+CD+BW+DQ+VL+VBMI
  Job-XGVMLJ : .NET 10.0.0 (42.42.42.42424), X64 RyuJIT AVX-512F+CD+BW+DQ+VL+VBMI
OutlierMode=Default  PowerPlanMode=00000000-0000-0000-0000-000000000000  IterationTime=250ms
MaxIterationCount=20  MemoryRandomization=Default  MinIterationCount=15
WarmupCount=1
Method Toolchain Mean Error Ratio Allocated Alloc Ratio
TryWriteThenTryRead Main 21.20 ns 0.027 ns 1.00 - NA
TryWriteThenTryRead PR 20.68 ns 0.049 ns 0.98 - NA
WriteAsyncThenReadAsync Main 33.72 ns 0.030 ns 1.00 - NA
WriteAsyncThenReadAsync PR 31.13 ns 0.204 ns 0.92 - NA
ReadAsyncThenWriteAsync Main 41.79 ns 0.066 ns 1.00 - NA
ReadAsyncThenWriteAsync PR 41.96 ns 0.231 ns 1.00 - NA
PingPong Main 10,424,822.56 ns 310,016.007 ns 1.00 1131 B 1.00
PingPong PR 10,890,043.09 ns 498,719.080 ns 1.05 1199 B 1.06
System.Threading.Channels.Tests.BoundedChannelPerfTests
BenchmarkDotNet v0.14.1-nightly.20250107.205, Linux Ubuntu 22.04.5 LTS (Jammy Jellyfish)
AMD EPYC 9V74, 1 CPU, 8 logical and 4 physical cores
  Job-IXTPWL : .NET 10.0.0 (42.42.42.42424), X64 RyuJIT AVX-512F+CD+BW+DQ+VL+VBMI
  Job-XGVMLJ : .NET 10.0.0 (42.42.42.42424), X64 RyuJIT AVX-512F+CD+BW+DQ+VL+VBMI
OutlierMode=Default  PowerPlanMode=00000000-0000-0000-0000-000000000000  IterationTime=250ms
MaxIterationCount=20  MemoryRandomization=Default  MinIterationCount=15
WarmupCount=1
Method Toolchain Mean Error Ratio Allocated Alloc Ratio
TryWriteThenTryRead Main 29.56 ns 0.301 ns 1.00 - NA
TryWriteThenTryRead PR 29.31 ns 0.079 ns 0.99 - NA
WriteAsyncThenReadAsync Main 35.78 ns 0.085 ns 1.00 - NA
WriteAsyncThenReadAsync PR 35.67 ns 0.395 ns 1.00 - NA
ReadAsyncThenWriteAsync Main 40.58 ns 0.101 ns 1.00 - NA
ReadAsyncThenWriteAsync PR 40.19 ns 0.061 ns 0.99 - NA
PingPong Main 10,719,802.47 ns 336,447.929 ns 1.00 1108 B 1.00
PingPong PR 10,755,276.44 ns 90,025.478 ns 1.00 1108 B 1.00

jkotas (Member) commented Dec 6, 2025

Can we run TechEmpower too?

jkoritzinsky (Member, Author) commented

Do you know if we have a bot to run it? Otherwise I can run crank locally to validate on Monday.

Co-authored-by: Jan Kotas <jkotas@microsoft.com>

internal static void EnsureDetachedThreadCleanupThreadExists()
{
// We should only need to use a separate cleanup thread if we're on the finalizer thread.

Review comment (Member):

Do we still have the deadlock problem if user code is waiting on the finalizer thread using hand-rolled synchronization (not one of our primitives) and the action it is waiting for is blocked on observing an abandoned mutex?

Review comment (Member):

I am wondering whether the dedicated cleanup thread is worth the trouble. I do not think we consider a blocked finalizer thread to be something to be robust against in general. Blocking on the finalizer thread is a bug in user code that can cause all sorts of problems.

Maybe we should keep it simple, wait to see whether anybody ever reports this, and only consider fixing it then if the scenario looks plausible?
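
For context, a minimal sketch of what "observing an abandoned mutex" means in managed code. It is not code from this PR: the class name `AbandonedMutexSketch` is hypothetical, and the sketch assumes an unnamed `System.Threading.Mutex`. A thread that exits while still owning a mutex abandons it, and the documented behavior is that the next thread to acquire it gets an `AbandonedMutexException` while still becoming the owner. Whether abandonment is already visible as soon as `Join` returns depends on when the runtime runs its thread-exit cleanup, so no particular output is claimed here.

```csharp
using System;
using System.Threading;

class AbandonedMutexSketch
{
    static void Main()
    {
        using var mutex = new Mutex();

        // This thread acquires the mutex and exits without calling ReleaseMutex,
        // which leaves the mutex abandoned once the runtime processes the thread exit.
        var owner = new Thread(() => { mutex.WaitOne(); });
        owner.Start();
        owner.Join();

        bool abandoned = false;
        try
        {
            // Blocks until the runtime has released the exited thread's ownership.
            mutex.WaitOne();
        }
        catch (AbandonedMutexException)
        {
            // The wait still succeeded: the current thread now owns the mutex.
            abandoned = true;
        }

        Console.WriteLine(abandoned
            ? "Acquired the mutex after it was abandoned."
            : "Acquired the mutex normally.");

        // Safe in both cases because the current thread owns the mutex here.
        mutex.ReleaseMutex();
    }
}
```

The second `WaitOne` is the kind of wait the question above is about: it can only complete after something has cleaned up the exited thread's ownership.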

MihaZupan (Member) commented Dec 6, 2025

/benchmark plaintext,json aspnet-citrine-lin kestrel

That used to work :/

jkotas (Member) commented Dec 6, 2025

/benchmark plaintext,json,fortunes aspnet-citrine-lin runtime,libs

jkotas (Member) commented Dec 6, 2025

It used to work 2 weeks ago #121887 (comment)

pr-benchmarks bot commented Dec 6, 2025

Benchmark started for plaintext, json, fortunes on aspnet-citrine-lin with runtime, libs. Logs: link

pr-benchmarks bot commented Dec 6, 2025

An error occurred, please check the logs

Labels: area-System.Threading, runtime-coreclr (specific to the CoreCLR runtime)
