Pirated `*` returns `SparseMatrixCSC` with unsorted rowvals per column

Written by Claude Opus 4.7

## Summary

When `MKLSparse` is loaded, the type-pirated `*` for `SparseMatrixCSC{Float64,Int64}` returns a result whose row indices are **not** sorted within each column. This violates the [stdlib invariant for `SparseMatrixCSC`][csc-invariant] (rowvals must be strictly sorted within each column) and silently breaks any downstream stdlib operation that relies on it — `+`, `getindex`, `findnz`-based equality, `dropzeros!`, etc.

The result `C = A * B` is a `SparseMatrixCSC`, so user code has no signal that it should be treated specially. Subsequent `C[i,j]` returns `0.0` for entries that are structurally stored, and `C + D` mis-merges entries and produces numerically wrong sums.

[csc-invariant]: https://github.com/JuliaSparse/SparseArrays.jl/blob/main/src/sparsematrix.jl

## Environment

- Julia 1.12.5 (x86_64-linux-gnu)
- `MKLSparse` v3.0.0
- `MKL_jll` v2025.2.0+0
- CPU: Intel Sapphire Rapids
- OS: Linux 5.14.0 (RHEL 9)

## MWE

```julia
using SparseArrays, Random, MKLSparse

rng = MersenneTwister(0xdeadbeed)
A = sprand(rng, 50, 50, 0.1)
B = sprand(rng, 50, 50, 0.1)

C = A * B   # dispatched to MKLSparse via type piracy

# Invariant: SparseMatrixCSC requires row indices within each column to be
# strictly sorted. Stdlib `A * B` produces sorted output. MKLSparse does not.
sorted_per_col = all(issorted(view(C.rowval, C.colptr[j]:C.colptr[j+1]-1))
                     for j in 1:size(C, 2))
println("rowvals sorted per column? ", sorted_per_col)
# => false

# Downstream consequence #1 — getindex via binary search returns 0 for entries
# that are actually stored (binary search assumes sorted rowvals).
j = findfirst(j -> !issorted(view(C.rowval, C.colptr[j]:C.colptr[j+1]-1)),
              1:size(C, 2))
rng_j = C.colptr[j]:C.colptr[j+1]-1
i = first(i for i in C.rowval[rng_j] if C[i, j] == 0.0)
k = findfirst(==(i), C.rowval[rng_j])
println("C[$i, $j] is structurally stored with value ", C.nzval[rng_j[k]])
println("but C[$i, $j] (via getindex) returns ", C[i, j])
# => stored value ≈ 0.27, but getindex returns 0.0

# Downstream consequence #2 — sparse `A + B` mis-merges entries when one
# operand has unsorted rowvals, producing numerically wrong sums.
C0 = sprand(rng, 50, 50, 0.05)
sum_sparse = (2.0 .* C) .+ (3.0 .* C0)
sum_dense  = (2.0 .* Matrix(C)) .+ (3.0 .* Matrix(C0))
println("Matrix(2C + 3C0) ≈ 2·Matrix(C) + 3·Matrix(C0)? ",
        Matrix(sum_sparse) ≈ sum_dense)
println("  max abs entry-wise diff: ",
        maximum(abs, Matrix(sum_sparse) .- sum_dense))
# => false; max abs diff ≈ 2.95
```

## Observed output

```
rowvals sorted per column? false
C[22, 1] is structurally stored with value 0.27390292139245087,
but C[22, 1] (via getindex) returns 0.0
Matrix(2C + 3C0) ≈ 2·Matrix(C) + 3·Matrix(C0)? false
  max abs entry-wise diff: 2.9544181517859176
```

## Expected behaviour

`A * B` for `SparseMatrixCSC` should return a `SparseMatrixCSC` that satisfies the stdlib invariant: row indices within each column are strictly increasing. Without `using MKLSparse`, that is what stdlib's `A * B` does, and all of the assertions above pass.

## Suggested fix

After computing the SpGEMM result, the wrapper that constructs the `SparseMatrixCSC` from MKL's output (likely `mkl_sparse_d_export_csr` / `mkl_sparse_d_export_csc` followed by allocation of `colptr` / `rowval` / `nzval`) should either:

1. Sort each column's `rowval` slice (and the corresponding `nzval`) — e.g. via `SparseArrays.sortSparseMatrixCSC!` (an internal but well-tested helper), or a manual per-column `sortperm!` over the row-index slice.
2. Or, configure MKL's export to produce sorted output if such an option exists (`mkl_sparse_order` may be relevant).

Locally I verified that re-sorting each column post-multiplication makes both downstream assertions pass.


Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Pirated `*` returns `SparseMatrixCSC` with unsorted rowvals per column #61

Summary

Environment

MWE

Observed output

Expected behaviour

Suggested fix

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Pirated * returns SparseMatrixCSC with unsorted rowvals per column #61

Description

Summary

Environment

MWE

Observed output

Expected behaviour

Suggested fix

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions

Pirated `*` returns `SparseMatrixCSC` with unsorted rowvals per column #61