Written by Claude Opus 4.7
Summary
When MKLSparse is loaded, the type-pirated * for SparseMatrixCSC{Float64,Int64} returns a result whose row indices are not sorted within each column. This violates the stdlib invariant for SparseMatrixCSC (rowvals must be strictly sorted within each column) and silently breaks any downstream stdlib operation that relies on it — +, getindex, findnz-based equality, dropzeros!, etc.
The result C = A * B is a SparseMatrixCSC, so user code has no signal that it should be treated specially. Subsequent C[i,j] returns 0.0 for entries that are structurally stored, and C + D mis-merges entries and produces numerically wrong sums.
Environment
- Julia 1.12.5 (x86_64-linux-gnu)
MKLSparse v3.0.0
MKL_jll v2025.2.0+0
- CPU: Intel Sapphire Rapids
- OS: Linux 5.14.0 (RHEL 9)
MWE
using SparseArrays, Random, MKLSparse
rng = MersenneTwister(0xdeadbeed)
A = sprand(rng, 50, 50, 0.1)
B = sprand(rng, 50, 50, 0.1)
C = A * B # dispatched to MKLSparse via type piracy
# Invariant: SparseMatrixCSC requires row indices within each column to be
# strictly sorted. Stdlib `A * B` produces sorted output. MKLSparse does not.
sorted_per_col = all(issorted(view(C.rowval, C.colptr[j]:C.colptr[j+1]-1))
for j in 1:size(C, 2))
println("rowvals sorted per column? ", sorted_per_col)
# => false
# Downstream consequence #1 — getindex via binary search returns 0 for entries
# that are actually stored (binary search assumes sorted rowvals).
j = findfirst(j -> !issorted(view(C.rowval, C.colptr[j]:C.colptr[j+1]-1)),
1:size(C, 2))
rng_j = C.colptr[j]:C.colptr[j+1]-1
i = first(i for i in C.rowval[rng_j] if C[i, j] == 0.0)
k = findfirst(==(i), C.rowval[rng_j])
println("C[$i, $j] is structurally stored with value ", C.nzval[rng_j[k]])
println("but C[$i, $j] (via getindex) returns ", C[i, j])
# => stored value ≈ 0.27, but getindex returns 0.0
# Downstream consequence #2 — sparse `A + B` mis-merges entries when one
# operand has unsorted rowvals, producing numerically wrong sums.
C0 = sprand(rng, 50, 50, 0.05)
sum_sparse = (2.0 .* C) .+ (3.0 .* C0)
sum_dense = (2.0 .* Matrix(C)) .+ (3.0 .* Matrix(C0))
println("Matrix(2C + 3C0) ≈ 2·Matrix(C) + 3·Matrix(C0)? ",
Matrix(sum_sparse) ≈ sum_dense)
println(" max abs entry-wise diff: ",
maximum(abs, Matrix(sum_sparse) .- sum_dense))
# => false; max abs diff ≈ 2.95
Observed output
rowvals sorted per column? false
C[22, 1] is structurally stored with value 0.27390292139245087,
but C[22, 1] (via getindex) returns 0.0
Matrix(2C + 3C0) ≈ 2·Matrix(C) + 3·Matrix(C0)? false
max abs entry-wise diff: 2.9544181517859176
Expected behaviour
A * B for SparseMatrixCSC should return a SparseMatrixCSC that satisfies the stdlib invariant: row indices within each column are strictly increasing. Without using MKLSparse, that is what stdlib's A * B does, and all of the assertions above pass.
Suggested fix
After computing the SpGEMM result, the wrapper that constructs the SparseMatrixCSC from MKL's output (likely mkl_sparse_d_export_csr / mkl_sparse_d_export_csc followed by allocation of colptr / rowval / nzval) should either:
- Sort each column's
rowval slice (and the corresponding nzval) — e.g. via SparseArrays.sortSparseMatrixCSC! (an internal but well-tested helper), or a manual per-column sortperm! over the row-index slice.
- Or, configure MKL's export to produce sorted output if such an option exists (
mkl_sparse_order may be relevant).
Locally I verified that re-sorting each column post-multiplication makes both downstream assertions pass.
Written by Claude Opus 4.7
Summary
When
MKLSparseis loaded, the type-pirated*forSparseMatrixCSC{Float64,Int64}returns a result whose row indices are not sorted within each column. This violates the stdlib invariant forSparseMatrixCSC(rowvals must be strictly sorted within each column) and silently breaks any downstream stdlib operation that relies on it —+,getindex,findnz-based equality,dropzeros!, etc.The result
C = A * Bis aSparseMatrixCSC, so user code has no signal that it should be treated specially. SubsequentC[i,j]returns0.0for entries that are structurally stored, andC + Dmis-merges entries and produces numerically wrong sums.Environment
MKLSparsev3.0.0MKL_jllv2025.2.0+0MWE
Observed output
Expected behaviour
A * BforSparseMatrixCSCshould return aSparseMatrixCSCthat satisfies the stdlib invariant: row indices within each column are strictly increasing. Withoutusing MKLSparse, that is what stdlib'sA * Bdoes, and all of the assertions above pass.Suggested fix
After computing the SpGEMM result, the wrapper that constructs the
SparseMatrixCSCfrom MKL's output (likelymkl_sparse_d_export_csr/mkl_sparse_d_export_cscfollowed by allocation ofcolptr/rowval/nzval) should either:rowvalslice (and the correspondingnzval) — e.g. viaSparseArrays.sortSparseMatrixCSC!(an internal but well-tested helper), or a manual per-columnsortperm!over the row-index slice.mkl_sparse_ordermay be relevant).Locally I verified that re-sorting each column post-multiplication makes both downstream assertions pass.