TDA-ris capable of 30000 nbf (3000 atoms) #615

Open
John-zzh wants to merge 44 commits into pyscf:master from John-zzh:clean2

@John-zzh
Contributor

@John-zzh John-zzh commented Dec 28, 2025

Major updates compared to 1.4.1:

  1. 3c2e engine updated with 3c2e_bdiv
  2. Coulomb iajb computed on the fly instead of storing the tensor
  3. Davidson trial vectors stored in CPU memory
  4. many other optimizations
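
As an illustration of item 2, the sketch below contrasts a stored four-index Coulomb tensor with an on-the-fly contraction. It assumes the Coulomb term factorizes through a three-index tensor `T[P, i, a]` (a resolution-of-identity form). That factorization, the function names, and the toy dimensions are assumptions made for illustration, not the code in this PR.

```python
import numpy as np


def coulomb_stored(T, X):
    """Build the full (ia|jb) tensor first, then contract it with X[j, b]."""
    iajb = np.einsum("Pia,Pjb->iajb", T, T)  # memory grows as (nocc*nvir)**2
    return np.einsum("iajb,jb->ia", iajb, X)


def coulomb_on_the_fly(T, X):
    """Contract T with X first, so the four-index tensor is never formed."""
    tmp = np.einsum("Pjb,jb->P", T, X)       # intermediate of length naux
    return np.einsum("Pia,P->ia", T, tmp)


# Toy sizes, chosen only so the two routes can be compared quickly.
rng = np.random.default_rng(0)
naux, nocc, nvir = 7, 3, 4
T = rng.standard_normal((naux, nocc, nvir))
X = rng.standard_normal((nocc, nvir))
```

Both routes give the same product; the second only ever holds `T` and a length-`naux` intermediate, which is the point of not storing the tensor.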

As of this date, some versions of MKL-numpy are not compatible with cupy 13.4.1 and cutensor 2.2.0.0 (the contract engine will complain about STATUS),
while openblas-numpy-scipy hits a segmentation fault when solving a large FC=SCE.

To use TDA-ris on large systems, use

mamba install -c conda-forge \
  cupy=13.4.1 \
  cutensor=2.2.0 \
  numpy scipy \
  blas=*=*mkl
(cupy_mkl_py311) root@volcengine:~/jojo/software/gpu4pyscf_ris/gpu4pyscf/tdscf# ls /root/jojo/miniconda3/envs/cupy_mkl_py311/lib | grep blas
libblas.so
libblas.so.3
libcblas.so
libcblas.so.3
libcublasLt.so.12
libcublasLt.so.12.9.1.4
libcublas.so.12
libcublas.so.12.9.1.4
libnvblas.so.12
libnvblas.so.12.9.1.4
 python -c "import numpy as np; np.show_config()"
... 
  "Build Dependencies": {
    "blas": {
      "name": "blas",
      "found": true,
      "version": "3.9.0",
      "detection method": "pkgconfig",
      "include directory": "/root/jojo/miniconda3/envs/cupy_mkl_py311/include",
      "lib directory": "/root/jojo/miniconda3/envs/cupy_mkl_py311/lib",
      "openblas configuration": "unknown",
      "pc file directory": "/root/jojo/miniconda3/envs/cupy_mkl_py311/lib/pkgconfig"
    },
    "lapack": {
      "name": "lapack",
      "found": true,
      "version": "3.9.0",
      "detection method": "pkgconfig",
      "include directory": "/root/jojo/miniconda3/envs/cupy_mkl_py311/include",
      "lib directory": "/root/jojo/miniconda3/envs/cupy_mkl_py311/lib",
      "openblas configuration": "unknown",
      "pc file directory": "/root/jojo/miniconda3/envs/cupy_mkl_py311/lib/pkgconfig"
    }
...

@jeanwsr

jeanwsr commented Dec 28, 2025

Warning, this will not install a mkl-numpy-scipy, the backend is an unknown version of BLAS,

What if you replace ls with ls -l? If libblas.so is a link to libmkl_rt.so, then numpy is really using MKL.
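
The same check can be scripted. This is a minimal sketch, assuming a conda-style environment in which `libblas.so` is a symlink inside the environment's `lib` directory. `resolve_blas` is a hypothetical helper name, and the "mkl in the file name" test is only a heuristic.

```python
import os
import sys


def resolve_blas(lib_dir, name="libblas.so"):
    """Follow lib_dir/name through any symlinks.

    Returns (real_path, looks_like_mkl), where looks_like_mkl is True when
    the resolved file name contains "mkl".
    """
    target = os.path.realpath(os.path.join(lib_dir, name))
    return target, "mkl" in os.path.basename(target).lower()


if __name__ == "__main__":
    # Assumes the active environment keeps its shared libraries in <prefix>/lib.
    print(resolve_blas(os.path.join(sys.prefix, "lib")))
```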

@John-zzh
Contributor Author

John-zzh commented Dec 28, 2025

> Warning, this will not install a mkl-numpy-scipy, the backend is an unknown version of BLAS,
>
> What if you replace ls with ls -l? If libblas.so is a link to libmkl_rt.so, then numpy is really using MKL.

You are right, it seems it is really using MKL.

(cupy_mkl_py311) root@volcengine:~/jojo/software/gpu4pyscf_ris/gpu4pyscf/tdscf# ls -l /root/jojo/miniconda3/envs/cupy_mkl_py311/lib | grep mkl
lrwxrwxrwx  1 root root        14 Dec 25 16:49 libblas.so -> libmkl_rt.so.2
lrwxrwxrwx  1 root root        14 Dec 25 16:49 libblas.so.3 -> libmkl_rt.so.2
lrwxrwxrwx  1 root root        14 Dec 25 16:49 libcblas.so -> libmkl_rt.so.2
lrwxrwxrwx  1 root root        14 Dec 25 16:49 libcblas.so.3 -> libmkl_rt.so.2
lrwxrwxrwx  1 root root        14 Dec 25 16:49 liblapacke.so -> libmkl_rt.so.2
lrwxrwxrwx  1 root root        14 Dec 25 16:49 liblapacke.so.3 -> libmkl_rt.so.2
lrwxrwxrwx  1 root root        14 Dec 25 16:49 liblapack.so -> libmkl_rt.so.2
lrwxrwxrwx  1 root root        14 Dec 25 16:49 liblapack.so.3 -> libmkl_rt.so.2
-rwxrwxr-x  3 root root  49788392 Oct 11 02:32 libmkl_avx2.so.2
-rwxrwxr-x  3 root root  73555016 Oct 11 02:32 libmkl_avx512.so.2
lrwxrwxrwx  1 root root        32 Dec 25 16:49 libmkl_blacs_intelmpi_ilp64.so -> libmkl_blacs_intelmpi_ilp64.so.2
-rwxrwxr-x  3 root root    581872 Oct 11 02:32 libmkl_blacs_intelmpi_ilp64.so.2
lrwxrwxrwx  1 root root        31 Dec 25 16:49 libmkl_blacs_intelmpi_lp64.so -> libmkl_blacs_intelmpi_lp64.so.2
-rwxrwxr-x  3 root root    305744 Oct 11 02:32 libmkl_blacs_intelmpi_lp64.so.2
lrwxrwxrwx  1 root root        31 Dec 25 16:49 libmkl_blacs_openmpi_ilp64.so -> libmkl_blacs_openmpi_ilp64.so.2
-rwxrwxr-x  3 root root    586312 Oct 11 02:32 libmkl_blacs_openmpi_ilp64.so.2
lrwxrwxrwx  1 root root        30 Dec 25 16:49 libmkl_blacs_openmpi_lp64.so -> libmkl_blacs_openmpi_lp64.so.2
-rwxrwxr-x  3 root root    301992 Oct 11 02:32 libmkl_blacs_openmpi_lp64.so.2
lrwxrwxrwx  1 root root        21 Dec 25 16:49 libmkl_cdft_core.so -> libmkl_cdft_core.so.2
-rwxrwxr-x  3 root root    195056 Oct 11 02:32 libmkl_cdft_core.so.2
lrwxrwxrwx  1 root root        16 Dec 25 16:49 libmkl_core.so -> libmkl_core.so.2
-rwxrwxr-x  3 root root  72157040 Oct 11 02:32 libmkl_core.so.2
-rwxrwxr-x  3 root root  41036168 Oct 11 02:32 libmkl_def.so.2
lrwxrwxrwx  1 root root        20 Dec 25 16:49 libmkl_gf_ilp64.so -> libmkl_gf_ilp64.so.2
-rwxrwxr-x  3 root root  15185440 Oct 11 02:32 libmkl_gf_ilp64.so.2
lrwxrwxrwx  1 root root        19 Dec 25 16:49 libmkl_gf_lp64.so -> libmkl_gf_lp64.so.2
-rwxrwxr-x  3 root root  17181208 Oct 11 02:32 libmkl_gf_lp64.so.2
lrwxrwxrwx  1 root root        22 Dec 25 16:49 libmkl_gnu_thread.so -> libmkl_gnu_thread.so.2
-rwxrwxr-x  3 root root  28829056 Oct 11 02:32 libmkl_gnu_thread.so.2
lrwxrwxrwx  1 root root        23 Dec 25 16:49 libmkl_intel_ilp64.so -> libmkl_intel_ilp64.so.2
-rwxrwxr-x  3 root root  18290680 Oct 11 02:32 libmkl_intel_ilp64.so.2
lrwxrwxrwx  1 root root        22 Dec 25 16:49 libmkl_intel_lp64.so -> libmkl_intel_lp64.so.2
-rwxrwxr-x  3 root root  20294504 Oct 11 02:32 libmkl_intel_lp64.so.2
lrwxrwxrwx  1 root root        24 Dec 25 16:49 libmkl_intel_thread.so -> libmkl_intel_thread.so.2
-rwxrwxr-x  3 root root  42925416 Oct 11 02:32 libmkl_intel_thread.so.2
-rwxrwxr-x  3 root root  49999992 Oct 11 02:32 libmkl_mc3.so.2
lrwxrwxrwx  1 root root        14 Dec 25 16:49 libmkl_rt.so -> libmkl_rt.so.2
-rwxrwxr-x  3 root root  18291704 Oct 11 02:32 libmkl_rt.so.2
lrwxrwxrwx  1 root root        27 Dec 25 16:49 libmkl_scalapack_ilp64.so -> libmkl_scalapack_ilp64.so.2
-rwxrwxr-x  3 root root   7675808 Oct 11 02:32 libmkl_scalapack_ilp64.so.2
lrwxrwxrwx  1 root root        26 Dec 25 16:49 libmkl_scalapack_lp64.so -> libmkl_scalapack_lp64.so.2
-rwxrwxr-x  3 root root   7695368 Oct 11 02:32 libmkl_scalapack_lp64.so.2
lrwxrwxrwx  1 root root        22 Dec 25 16:49 libmkl_sequential.so -> libmkl_sequential.so.2
-rwxrwxr-x  3 root root  25645064 Oct 11 02:32 libmkl_sequential.so.2
lrwxrwxrwx  1 root root        22 Dec 25 16:49 libmkl_tbb_thread.so -> libmkl_tbb_thread.so.2
-rwxrwxr-x  3 root root  36164656 Oct 11 02:32 libmkl_tbb_thread.so.2
-rwxrwxr-x  3 root root  15804448 Oct 11 02:32 libmkl_vml_avx2.so.2
-rwxrwxr-x  3 root root  16413888 Oct 11 02:32 libmkl_vml_avx512.so.2
-rwxrwxr-x  3 root root   8124792 Oct 11 02:32 libmkl_vml_cmpt.so.2
-rwxrwxr-x  3 root root   9461344 Oct 11 02:32 libmkl_vml_def.so.2
-rwxrwxr-x  3 root root  13247632 Oct 11 02:32 libmkl_vml_mc3.so.2
(cupy_mkl_py311) root@volcengine:~/jojo/software/gpu4pyscf_ris/gpu4pyscf/tdscf# python - << 'EOF'
import ctypes

mkl = ctypes.CDLL('libmkl_rt.so')
buf = ctypes.create_string_buffer(200)
mkl.MKL_Get_Version_String(buf, 200)
print(buf.value.decode())
EOF
Intel(R) oneAPI Math Kernel Library Version 2025.3-Product Build 20251007 for Intel(R) 64 architecture applications

In short, use conda-forge:

mamba install -c conda-forge cupy=13.4.1  cutensor=2.2.0 numpy scipy blas=*=*mkl

and do not use the default channel:

mamba install -c default  cupy cutensor numpy scipy blas=*=*mkl   # contract engine will complain about CUTENSOR STATUS
mamba install -c default  cupy=13.4.1 cutensor=2.2.0 numpy scipy blas=*=*mkl  # no such version combination

Maybe the conda default channel has a version mismatch with the contract engine.
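
When comparing environments like these, it can help to record which Python package versions are actually installed. This is a small sketch using only the standard library. `installed_versions` is a hypothetical helper, and the distribution names are assumptions (pip wheels of cupy may be registered under a different name, such as `cupy-cuda12x`). cuTENSOR itself is a shared library, so this does not report its version.

```python
from importlib import metadata


def installed_versions(names=("numpy", "scipy", "cupy")):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    print(installed_versions())
```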

@John-zzh
Contributor Author

Weird, test_ris.py and test_krylov.py work fine on my side.

verbose (optional): Verbosity level of the logger. If None, it will use the verbosity of `mf`.
nto_state (None or int, optional): Which state to calculate natural transition orbitals,
require install MOKIT https://jeanwsr.gitlab.io/mokit-doc-mdbook/

please use the gitlab url when suggesting installing MOKIT.

def get_nto(self,state_id):

''' need to install MOKIT to dump .fch format orbital file
https://jeanwsr.gitlab.io/mokit-doc-mdbook/

and here.

fchk(nto_mf, fchfilename)
from mokit.lib.rwwfn import del_dm_in_fch
del_dm_in_fch(fchname=fchfilename,itype=1)
return nto_mf

@jeanwsr jeanwsr Jan 10, 2026

I think the get_nto function should only return NTO coeffs, etc., and dumping the fchk can be done in another function, to keep the code more organized. Also, this way it still works when mokit is missing.

Contributor Author

Meaning dump the fchk in an independent py file, not in the GPU4PySCF repo?

@jeanwsr jeanwsr Jan 11, 2026

I mean one function, get_nto, for calculating and returning NTOs only, like the API provided by tddft, and another, save_nto, for dumping the fchk, which can also be placed in your ris class. You could also have an all-in-one function get_nto_and_save which wraps those two.

If you don't want to split them, making save_fch an option in your current function is also OK I guess, which at least allows people to get the NTOs when they don't have mokit.
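
A rough sketch of the split suggested above, written as plain functions so it runs without PySCF or MOKIT. The function names follow the suggestion; the signatures, the SVD-based construction, and the `writer` callback are assumptions for illustration. In the PR the writer would wrap the MOKIT-based fchk dump, which is deliberately not reproduced here.

```python
import numpy as np


def get_nto(x_ia, c_occ, c_vir):
    """Compute NTOs for one state and return them; no file I/O, no MOKIT.

    x_ia  : (nocc, nvir) transition amplitudes (assumed real here)
    c_occ : (nao, nocc) occupied MO coefficients
    c_vir : (nao, nvir) virtual MO coefficients
    """
    u, s, vt = np.linalg.svd(x_ia, full_matrices=False)
    weights = s**2 / np.sum(s**2)  # normalized weight of each NTO pair
    nto_occ = c_occ @ u            # hole orbitals in the AO basis
    nto_vir = c_vir @ vt.T         # particle orbitals in the AO basis
    return weights, nto_occ, nto_vir


def save_nto(nto, fchfilename, writer):
    """Dump NTOs through a caller-supplied writer(nto, fchfilename)."""
    writer(nto, fchfilename)


def get_nto_and_save(x_ia, c_occ, c_vir, fchfilename, writer):
    """All-in-one wrapper: compute the NTOs, dump them, return them."""
    nto = get_nto(x_ia, c_occ, c_vir)
    save_nto(nto, fchfilename, writer)
    return nto
```

With this layout, `get_nto` stays usable when the optional dependency is missing, and only the writer passed to `save_nto` needs it.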

@John-zzh John-zzh closed this Jan 29, 2026
@John-zzh John-zzh reopened this Jan 31, 2026
@John-zzh
Contributor Author

It seems installing from the defaults channel also works, with cupy==13.6.0 and cutensor==2.4.1.4

mamba create -n mkl -c defaults python=3.11
mamba activate mkl

mamba install -c defaults \
 cuda-version=12.8 \
 cupy=13 \
 cutensor=2 \
 numpy=2.2 \
 scipy \
 blas=*=*mkl

H -3.22959 2.35981 -0.24953
'''
mol = gto.M(atom=atom, basis='def2-svp',
# output = '/dev/null', # Suppress excessive log output
Collaborator

Add output='/dev/null'. stdout.close() causes an IO error.
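
To see why closing the stream is fragile, here is a small standard-library sketch (not PySCF-specific; the helper names are hypothetical). Writing to a closed stream raises an error, whereas routing output to `os.devnull` discards it and leaves stdout usable afterwards.

```python
import contextlib
import io
import os


def write_after_close():
    """Return the error message raised by writing to a closed stream."""
    stream = io.StringIO()
    stream.close()
    try:
        stream.write("log line")
    except ValueError as exc:
        return str(exc)
    return None


def run_quietly(func):
    """Call func() with stdout routed to os.devnull, then restore stdout."""
    with open(os.devnull, "w") as sink, contextlib.redirect_stdout(sink):
        return func()
```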
