Skip to content

rrr attempt#5

Draft
adienes wants to merge 26 commits intomb_rrr_origfrom
mb_rrr
Draft

rrr attempt#5
adienes wants to merge 26 commits intomb_rrr_origfrom
mb_rrr

Conversation

@adienes
Copy link
Owner

@adienes adienes commented Aug 26, 2025

No description provided.

@adienes adienes force-pushed the mb_rrr branch 2 times, most recently from 9e0c9da to 9839ce6 Compare August 27, 2025 18:53
- Move reduce.jl include to after show modules for better bootstrap ordering
- Simplify nested @nexprs macro in multidimensional.jl
- Simplify error message for missing reduced_index implementation
- Add support for general UnitRange in reduced_index for OffsetArrays
- Remove redundant test for empty array dimensional reductions
- Add 16x unrolled kernel for small types (<=32 bits) in mapreduce_kernel_commutative
- Add RFMap type for better type inference in reduction kernels
- Add type-aware commutativity checks for multiplication operations
- Optimize pairwise blocksize for compute-heavy functions (sqrt, sin, cos, exp, log)
- Add MIN_BLOCK constant for minimum reduction block size
- Introduce _MRAllocSink stub for future sink API (minimal bootstrap support)
This is a major architectural refactor that replaces the _MapReduceAllocator
and _MapReduceInPlace system with a unified sink API.

Key changes:
- Replace _MapReduceAllocator with _MRSink abstract type and _MRAllocSink
- Add _MRInPlaceSink for in-place reductions with update support
- Implement sink API methods: mapreduce_allocate, mapreduce_set!, mapreduce_accum!
- Add streaming kernel _mapreducedim_stream_firstdim for "keep dim 1" case
- Refactor mapreducedim functions to use sink API throughout
- Update mapreduce! and reduce! to support update parameter
- Improve pairwise reduction with better chunking and type dispatch
- Add comprehensive handling of empty arrays and edge cases

Benefits:
- Unified API for both allocating and in-place reductions
- Better performance for streaming reductions
- More maintainable and extensible architecture
- Proper support for update semantics in reduce!
adienes added 22 commits August 27, 2025 15:44
- Remove unnecessary reduced_index(i::UnitRange) method (OffsetArrays handles its own types)
- Remove unused layout_trait function and non-existent type references
- Remove unused RFMap struct (was added but never constructed)
- Remove unused _to_sink helper functions
adienes pushed a commit that referenced this pull request Sep 15, 2025
Fixes
https://buildkite.com/julialang/julia-master/builds/46446#0195f712-1844-4e81-8b16-27b953fedcd3/899-1778

```
  | Error in testset Profile:
  | Test Failed at /cache/build/tester-amdci5-10/julialang/julia-master/julia-7c9af464cc/share/julia/stdlib/v1.13/Profile/test/runtests.jl:231
  | Expression: occursin("@Compiler" * slash, str)
  | Evaluated: occursin("@Compiler/", "Overhead ╎ [+additional indent] Count File:Line  Function\n=========================================================\n   ╎9   @juliasrc/task.c:1249  start_task\n   ╎ 9   @juliasrc/julia.h:2353  jl_apply\n   ╎  9   @juliasrc/gf.c:3693  ijl_apply_generic\n   ╎   9   @juliasrc/gf.c:3493  _jl_invoke\n   ╎    9   [unknown stackframe]\n   ╎     9   @Distributed/src/process_messages.jl:287  (::Distributed.var\"#handle_msg##2#handle_msg##3\"{Distributed.CallMsg{:call_fetch}, Distributed.MsgHeader, Sockets.TCPSocket})()\n   ╎    ╎ 9   @Distributed/src/process_messages.jl:70  run_work_thunk(thunk::Distributed.var\"#handle_msg##4#handle_msg##5\"{Distributed.CallMsg{:call_fetch}}, print_error::Bool)\n   ╎    ╎  9   @Distributed/src/process_messages.jl:287  (::Distributed.var\"#handle_msg##4#handle_msg##5\"{Distributed.CallMsg{:call_fetch}})()\n   ╎    ╎   9   @juliasrc/builtins.c:841  jl_f__apply_iterate\n   ╎    ╎    9   @juliasrc/julia.h:2353  jl_apply\n   ╎    ╎     9   @juliasrc/gf.c:3693  ijl_apply_generic\n   ╎    ╎    ╎ 9   @juliasrc/gf.c:3493  _jl_invoke\n   ╎    ╎    ╎  9   @Base/Base_compiler.jl:223  kwcall(::@NamedTuple{seed::UInt128}, ::typeof(invokelatest), ::Function, ::String, ::Vararg{String})\n   ╎    ╎    ╎   9   @juliasrc/builtins.c:841  jl_f__apply_iterate\n   ╎    ╎    ╎    9   @juliasrc/julia.h:2353  jl_apply\n   ╎    ╎    ╎     9   @juliasrc/gf.c:3693  ijl_apply_generic\n   ╎    ╎    ╎    ╎ 9   @juliasrc/gf.c:3493  _jl_invoke\n   ╎    ╎    ╎    ╎  9   @juliasrc/builtins.c:853  jl_f_invokelatest\n   ╎    ╎    ╎    ╎   9   @juliasrc/julia.h:2353  jl_apply\n   ╎    ╎    ╎    ╎    9   @juliasrc/gf.c:3693  ijl_apply_generic\n   ╎    ╎    ╎    ╎     9   @juliasrc/gf.c:3493  _jl_invoke\n   ╎    ╎    ╎    ╎    ╎ 9   [unknown stackframe]\n   ╎    ╎    ╎    ╎    ╎  9   /cache/build/tester-amdci5-10/julialang/julia-master/julia-7c9af464cc/share/julia/test/testdefs.jl:7  kwcall(::@NamedTuple{seed::UInt128}, ::typeof(runtests), name::String, path::String)\n   ╎    ╎    ╎    ╎    ╎   9   /cache/build/tester-amdci5-10/julialang/julia-master/julia-7c9af464cc/share/julia/test/testdefs.jl:7  runtests\n   ╎    ╎    ╎    ╎    ╎    9   /cache/build/tester-amdci5-10/julialang/julia-master/julia-7c9af464cc/share/julia/test/testdefs.jl:13  runtests(name::String, path::String, isolate::Bool; seed::UInt128)\n   ╎    ╎    ╎    ╎    ╎     9   @Base/env.jl:265  withenv(f::var\"#4#5\"{UInt128, String, String, Bool, Bool}, keyvals::Pair{String, Bool})\n   ╎    ╎    ╎    ╎    ╎    ╎ 9   /cache/build/tester-amdci5-10/julialang/julia-master/julia-7c9af464cc/share/julia/test/testdefs.jl:27  (::var\"#4#5\"{UInt128, String, String, Bool, Bool})()\n   ╎    ╎    ╎    ╎    ╎    ╎  9   @Base/timing.jl:621  macro expansion\n   ╎    ╎    ╎    ╎    ╎    ╎   9   /cache/build/tester-amdci5-10/julialang/julia-master/julia-7c9af464cc/share/julia/test/testdefs.jl:29  macro expansion\n   ╎    ╎    ╎    ╎    ╎    ╎    9   @Test/src/Test.jl:1835  macro expansion\n   ╎    ╎    ╎    ╎    ╎    ╎     9   /cache/build/tester-amdci5-10/julialang/julia-master/julia-7c9af464cc/share/julia/test/testdefs.jl:37  macro expansion\n   ╎    ╎    ╎    ╎    ╎    ╎    ╎ 9   @Base/Base.jl:303  include(mod::Module, _path::String)\n   ╎    ╎    ╎    ╎    ╎    ╎    ╎  9   @Base/loading.jl:2925  _include(mapexpr::Function, mod::Module, _path::String)\n   ╎    ╎    ╎    ╎    ╎    ╎    ╎   9   @juliasrc/gf.c:3693  ijl_apply_generic\n   ╎    ╎    ╎    ╎    ╎    ╎    ╎    9   @juliasrc/gf.c:3493  _jl_invoke\n   ╎    ╎    ╎    ╎    ╎    ╎    ╎     9   @Base/loading.jl:2865  include_string(mapexpr::typeof(identity), mod::Module, code::String, filename::String)\n   ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎ 9   @Base/boot.jl:489  eval(m::Module, e::Any)\n   ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎  9   @juliasrc/toplevel.c:1095  ijl_toplevel_eval_in\n   ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎   9   @juliasrc/toplevel.c:1050  ijl_toplevel_eval\n   ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    9   @juliasrc/toplevel.c:978  jl_toplevel_eval_flex\n   ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎     9   @juliasrc/toplevel.c:1038  jl_toplevel_eval_flex\n   ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎ 9   @juliasrc/interpreter.c:897  jl_interpret_toplevel_thunk\n   ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎  9   @juliasrc/interpreter.c:557  eval_body\n   ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎   9   @juliasrc/interpreter.c:557  eval_body\n   ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    9   @juliasrc/interpreter.c:557  eval_body\n   ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎     9   @juliasrc/interpreter.c:692  eval_body\n   ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎ 9   @juliasrc/interpreter.c:193  eval_stmt_value\n   ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎  9   @juliasrc/interpreter.c:242  eval_value\n   ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎    ╎   9   @juliasrc/interpreter.c:124  do_call\n   ╎    ╎    ╎    ╎
...
```
adienes pushed a commit that referenced this pull request Sep 15, 2025
Use an atomic fetch and add to fix a data race in `Module()` identified
by tsan:

```
./usr/bin/julia -t4,0 --gcthreads=1 -e 'Threads.@threads for i=1:100 Module() end'
==================
WARNING: ThreadSanitizer: data race (pid=5575)
  Write of size 4 at 0xffff9bf9bd28 by thread T9:
    #0 jl_new_module__ /home/user/c/julia/src/module.c:487:22 (libjulia-internal.so.1.13+0x897d4)
    #1 jl_new_module_ /home/user/c/julia/src/module.c:527:22 (libjulia-internal.so.1.13+0x897d4)
    #2 jl_f_new_module /home/user/c/julia/src/module.c:649:22 (libjulia-internal.so.1.13+0x8a968)
    #3 <null> <null> (0xffff76a21164)
    #4 <null> <null> (0xffff76a1f074)
    #5 <null> <null> (0xffff76a1f0c4)
    #6 _jl_invoke /home/user/c/julia/src/gf.c (libjulia-internal.so.1.13+0x5ea04)
    #7 ijl_apply_generic /home/user/c/julia/src/gf.c:3892:12 (libjulia-internal.so.1.13+0x5ea04)
    #8 jl_apply /home/user/c/julia/src/julia.h:2343:12 (libjulia-internal.so.1.13+0x9e4c4)
    #9 start_task /home/user/c/julia/src/task.c:1249:19 (libjulia-internal.so.1.13+0x9e4c4)

  Previous write of size 4 at 0xffff9bf9bd28 by thread T10:
    #0 jl_new_module__ /home/user/c/julia/src/module.c:487:22 (libjulia-internal.so.1.13+0x897d4)
    #1 jl_new_module_ /home/user/c/julia/src/module.c:527:22 (libjulia-internal.so.1.13+0x897d4)
    #2 jl_f_new_module /home/user/c/julia/src/module.c:649:22 (libjulia-internal.so.1.13+0x8a968)
    #3 <null> <null> (0xffff76a21164)
    #4 <null> <null> (0xffff76a1f074)
    #5 <null> <null> (0xffff76a1f0c4)
    #6 _jl_invoke /home/user/c/julia/src/gf.c (libjulia-internal.so.1.13+0x5ea04)
    #7 ijl_apply_generic /home/user/c/julia/src/gf.c:3892:12 (libjulia-internal.so.1.13+0x5ea04)
    #8 jl_apply /home/user/c/julia/src/julia.h:2343:12 (libjulia-internal.so.1.13+0x9e4c4)
    #9 start_task /home/user/c/julia/src/task.c:1249:19 (libjulia-internal.so.1.13+0x9e4c4)

  Location is global 'jl_new_module__.mcounter' of size 4 at 0xffff9bf9bd28 (libjulia-internal.so.1.13+0x3dbd28)
```
adienes pushed a commit that referenced this pull request Sep 15, 2025
Simplify `workqueue_for`. While not strictly necessary, the acquire load
in `getindex(once::OncePerThread{T,F}, tid::Integer)` makes
ThreadSanitizer happy. With the existing implementation, we get false
positives whenever a thread other than the one that originally allocated
the array reads it:

```
==================
WARNING: ThreadSanitizer: data race (pid=6819)
  Atomic read of size 8 at 0xffff86bec058 by main thread:
    #0 getproperty Base_compiler.jl:57 (sys.so+0x113b478)
    #1 julia_pushNOT._1925 task.jl:868 (sys.so+0x113b478)
    #2 julia_enq_work_1896 task.jl:969 (sys.so+0x5cd218)
    #3 schedule task.jl:983 (sys.so+0x892294)
    #4 macro expansion threadingconstructs.jl:522 (sys.so+0x892294)
    #5 julia_start_profile_listener_60681 Base.jl:355 (sys.so+0x892294)
    #6 julia___init___60641 Base.jl:392 (sys.so+0x1178dc)
    #7 jfptr___init___60642 <null> (sys.so+0x118134)
    #8 _jl_invoke /home/user/c/julia/src/gf.c (libjulia-internal.so.1.13+0x5e9a4)
    #9 ijl_apply_generic /home/user/c/julia/src/gf.c:3892:12 (libjulia-internal.so.1.13+0x5e9a4)
    #10 jl_apply /home/user/c/julia/src/julia.h:2343:12 (libjulia-internal.so.1.13+0xbba74)
    #11 jl_module_run_initializer /home/user/c/julia/src/toplevel.c:68:13 (libjulia-internal.so.1.13+0xbba74)
    JuliaLang#12 _finish_jl_init_ /home/user/c/julia/src/init.c:632:13 (libjulia-internal.so.1.13+0x9c0fc)
    JuliaLang#13 ijl_init_ /home/user/c/julia/src/init.c:783:5 (libjulia-internal.so.1.13+0x9bcf4)
    JuliaLang#14 jl_repl_entrypoint /home/user/c/julia/src/jlapi.c:1125:5 (libjulia-internal.so.1.13+0xf7ec8)
    JuliaLang#15 jl_load_repl /home/user/c/julia/cli/loader_lib.c:601:12 (libjulia.so.1.13+0x11934)
    JuliaLang#16 main /home/user/c/julia/cli/loader_exe.c:58:15 (julia+0x10dc20)

  Previous write of size 8 at 0xffff86bec058 by thread T2:
    #0 IntrusiveLinkedListSynchronized task.jl:863 (sys.so+0x78d220)
    #1 macro expansion task.jl:932 (sys.so+0x78d220)
    #2 macro expansion lock.jl:376 (sys.so+0x78d220)
    #3 julia_workqueue_for_1933 task.jl:924 (sys.so+0x78d220)
    #4 julia_wait_2048 task.jl:1204 (sys.so+0x6255ac)
    #5 julia_task_done_hook_49205 task.jl:839 (sys.so+0x128fdc0)
    #6 jfptr_task_done_hook_49206 <null> (sys.so+0x902218)
    #7 _jl_invoke /home/user/c/julia/src/gf.c (libjulia-internal.so.1.13+0x5e9a4)
    #8 ijl_apply_generic /home/user/c/julia/src/gf.c:3892:12 (libjulia-internal.so.1.13+0x5e9a4)
    #9 jl_apply /home/user/c/julia/src/julia.h:2343:12 (libjulia-internal.so.1.13+0x9c79c)
    #10 jl_finish_task /home/user/c/julia/src/task.c:345:13 (libjulia-internal.so.1.13+0x9c79c)
    #11 jl_threadfun /home/user/c/julia/src/scheduler.c:122:5 (libjulia-internal.so.1.13+0xe7db8)

  Thread T2 (tid=6824, running) created by main thread at:
    #0 pthread_create <null> (julia+0x85f88)
    #1 uv_thread_create_ex /workspace/srcdir/libuv/src/unix/thread.c:172 (libjulia-internal.so.1.13+0x1a8d70)
    #2 _finish_jl_init_ /home/user/c/julia/src/init.c:618:5 (libjulia-internal.so.1.13+0x9c010)
    #3 ijl_init_ /home/user/c/julia/src/init.c:783:5 (libjulia-internal.so.1.13+0x9bcf4)
    #4 jl_repl_entrypoint /home/user/c/julia/src/jlapi.c:1125:5 (libjulia-internal.so.1.13+0xf7ec8)
    #5 jl_load_repl /home/user/c/julia/cli/loader_lib.c:601:12 (libjulia.so.1.13+0x11934)
    #6 main /home/user/c/julia/cli/loader_exe.c:58:15 (julia+0x10dc20)

SUMMARY: ThreadSanitizer: data race Base_compiler.jl:57 in getproperty
==================
```
adienes pushed a commit that referenced this pull request Oct 10, 2025
Use an atomic fetch and add to fix a data race in `Module()` identified
by tsan:

```
./usr/bin/julia -t4,0 --gcthreads=1 -e 'Threads.@threads for i=1:100 Module() end'
==================
WARNING: ThreadSanitizer: data race (pid=5575)
  Write of size 4 at 0xffff9bf9bd28 by thread T9:
    #0 jl_new_module__ /home/user/c/julia/src/module.c:487:22 (libjulia-internal.so.1.13+0x897d4)
    #1 jl_new_module_ /home/user/c/julia/src/module.c:527:22 (libjulia-internal.so.1.13+0x897d4)
    #2 jl_f_new_module /home/user/c/julia/src/module.c:649:22 (libjulia-internal.so.1.13+0x8a968)
    #3 <null> <null> (0xffff76a21164)
    #4 <null> <null> (0xffff76a1f074)
    #5 <null> <null> (0xffff76a1f0c4)
    #6 _jl_invoke /home/user/c/julia/src/gf.c (libjulia-internal.so.1.13+0x5ea04)
    #7 ijl_apply_generic /home/user/c/julia/src/gf.c:3892:12 (libjulia-internal.so.1.13+0x5ea04)
    #8 jl_apply /home/user/c/julia/src/julia.h:2343:12 (libjulia-internal.so.1.13+0x9e4c4)
    #9 start_task /home/user/c/julia/src/task.c:1249:19 (libjulia-internal.so.1.13+0x9e4c4)

  Previous write of size 4 at 0xffff9bf9bd28 by thread T10:
    #0 jl_new_module__ /home/user/c/julia/src/module.c:487:22 (libjulia-internal.so.1.13+0x897d4)
    #1 jl_new_module_ /home/user/c/julia/src/module.c:527:22 (libjulia-internal.so.1.13+0x897d4)
    #2 jl_f_new_module /home/user/c/julia/src/module.c:649:22 (libjulia-internal.so.1.13+0x8a968)
    #3 <null> <null> (0xffff76a21164)
    #4 <null> <null> (0xffff76a1f074)
    #5 <null> <null> (0xffff76a1f0c4)
    #6 _jl_invoke /home/user/c/julia/src/gf.c (libjulia-internal.so.1.13+0x5ea04)
    #7 ijl_apply_generic /home/user/c/julia/src/gf.c:3892:12 (libjulia-internal.so.1.13+0x5ea04)
    #8 jl_apply /home/user/c/julia/src/julia.h:2343:12 (libjulia-internal.so.1.13+0x9e4c4)
    #9 start_task /home/user/c/julia/src/task.c:1249:19 (libjulia-internal.so.1.13+0x9e4c4)

  Location is global 'jl_new_module__.mcounter' of size 4 at 0xffff9bf9bd28 (libjulia-internal.so.1.13+0x3dbd28)
```

(cherry picked from commit 9039555)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant