Skip to content

Enable libcint 4-center 1-electron overlap integrals (WITH_4C1E) #82

@joshkamm

Description

@joshkamm

Context

For DFT inversion work, we need 4-center 1-electron overlap integrals:

∫ φ_i(r) φ_j(r) φ_k(r) φ_l(r) dr

These are distinct from the 2-electron ERIs (ij|1/r₁₂|kl) that SlaterGPU already wraps via gen_eri. Same 4-index structure, but a 1-electron integral with no Coulomb operator.

libcint provides this via cint4c1e_sph / cint4c1e_cart (available since libcint v2.8.16, implemented in src/cint4c1e.c and src/g4c1e.c), but these functions are not compiled by default. Paul has a working prototype in his personal (unversioned) code — including both a libcint cache bug fix and a SlaterGPU-style wrapper — that produces reasonable DFT inversion results.

Priority 1: Enable -DWITH_4C1E=1 in the build

The CMake flag -DWITH_4C1E=1 must be passed when building libcint for these functions to be available. Update SlaterGPU's pixi.toml and/or CMake configuration so that libcint is built with this flag. This lets us maintain a unified build rather than requiring a separate manual libcint compilation.

Future work: Integrate Paul's wrapper

Paul already has a working cint4c1e wrapper in his local code that follows the gen_eri pattern in cintwrapper.cpp. This needs to be brought into the repo. The wrapper:

  • Loops over all 4 shell quartets
  • Calls cint4c1e_cart or cint4c1e_sph depending on BT::DO_CART
  • Unpacks buffer with standard column-major indexing
  • Full permutational symmetry over all 4 indices

Future work: Known upstream libcint bug (cache buffer sizing)

libcint's own CMakeLists.txt contains the warning: "Note there are bugs in 4c1e integral functions".

The likely bug is a cache buffer sizing issue in src/cint4c1e.c and src/g4c1e.c:

  • CINTg4c1e_ovlp (in g4c1e.c) needs 3 * b_size doubles of scratch space from the cache pointer, where b_size = db * (MAX(nmax, mmax) + 1) and db = nmax + mmax + 1 (with nmax = li + lj, mmax = lk + ll).
  • The cache size calculations in CINT4c1e_drv (both the size-query path when out == NULL at ~line 270, and the self-allocation path when cache == NULL at ~line 278) do not include this 3 * b_size term.
  • CINT4c1e_loop_nopt (~line 108) has the same omission in its len calculation.
  • For low angular momentum shells, there's enough slack in the leng allocation that it works by accident. For higher angular momentum (roughly li+lj+lk+ll >= 6), bufx/bufy/bufz can write past the end of the cache — a heap buffer overflow.

Proposed fix (applied by Paul in his prototype, not yet validated with tests): add ovlp_scratch = 3 * db * (MAX(nmax, mmax) + 1) to the len/cache_size calculations in all 4 affected locations in cint4c1e.c. The values nmax, mmax are available from envs->li_ceil + envs->lj_ceil and envs->lk_ceil + envs->ll_ceil.

This fix has not been tested — libcint has a test (testsuite/test_cint4c1e.py) that validates cint4c1e_sph against a Laplacian-based identity using cint2e_ipip1_sph and cint2e_ipvip1_sph. That should be run to confirm correctness. We should also consider filing the fix upstream on sunqm/libcint.

References

  • libcint 4c1e source: src/cint4c1e.c, src/g4c1e.c in sunqm/libcint
  • libcint test: testsuite/test_cint4c1e.py
  • SlaterGPU ERI wrapper (pattern to follow): src/libcintw/cintwrapper.cpp, gen_eri function
  • libcint build flag: CMakeLists.txt option WITH_4C1E

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions