|
| 1 | +.. toctree:: |
| 2 | + :maxdepth: 2 |
| 3 | + |
| 4 | + |
| 5 | +Backends |
| 6 | +======== |
| 7 | + |
| 8 | +Kernel Tuner implements multiple backends for CUDA, one for OpenCL, one for HIP, and a generic |
| 9 | +Compiler backend. |
| 10 | + |
| 11 | +Selecting a backend is in most cases automatic and is done based on the kernel's programming |
| 12 | +language, but sometimes you'll want to specifically choose a backend. |
| 13 | + |
| 14 | + |
| 15 | +CUDA Backends |
| 16 | +------------- |
| 17 | + |
| 18 | +PyCUDA is the default CUDA backend in Kernel Tuner. It is comparable in feature completeness to CuPy. |
| 19 | +Because the HIP kernel language is identical to the CUDA kernel language, HIP is included here as well. |
| 20 | +To use HIP on NVIDIA GPUs, see https://github.com/jatinx/hip-on-nv. |
| 21 | + |
| 22 | +While the PyCUDA backend expects all inputs and outputs to be NumPy arrays, the CuPy backend also |
| 23 | +supports CuPy arrays as input and output arguments for the kernels. This gives the user more control |
| 24 | +over how memory is handled by Kernel Tuner. In addition, checks during output verification can happen |
| 25 | +entirely on the GPU when using only CuPy arrays. |
| 26 | + |
| 27 | +Texture memory is only supported by the PyCUDA backend, while the CuPy backend is the only one that |
| 28 | +supports C++ signatures for the kernels. With the other backends, the kernel is required to have |
| 29 | +extern "C" linkage. If it does not, the entire code is wrapped in an extern "C" block, which may cause issues |
| 30 | +if the code also contains C++ code that cannot have extern "C" linkage, including code that may be |
| 31 | +present in header files. |
| 32 | + |
| 33 | +As detailed further in :ref:`templates`, templated kernels are fully supported by the CuPy backend, and |
| 34 | +Kernel Tuner implements limited support for templated kernels in the PyCUDA and |
| 35 | +CUDA-Python backends. |
| 36 | + |
| 37 | + |
| 38 | +.. csv-table:: Backend feature support |
| 39 | + :header: Feature, PyCUDA, CuPy, CUDA-Python, HIP |
| 40 | + :widths: auto |
| 41 | + |
| 42 | + Compile kernels, ✓, ✓, ✓, ✓ |
| 43 | + Benchmark kernels, ✓, ✓, ✓, ✓ |
| 44 | + Observers, ✓, ✓, ✓, ✓ |
| 45 | + Constant memory, ✓, ✓, ✓, ✓ |
| 46 | + Dynamic shared memory, ✓, ✓, ✓, ✓ |
| 47 | + Texture memory, ✓, ✗, ✗, ✗ |
| 48 | + C++ kernel signature, ✗, ✓, ✗, ✗ |
| 49 | + Templated kernels, ✓, ✓, ✓, ✗ |
| 50 | + |
| 51 | + |
| 52 | +Another important difference between the backends is the compiler that is used. The table |
| 53 | +below lists which Python package is required, how the backend can be selected, and which compiler is |
| 54 | +used to compile the kernels. |
| 55 | + |
| 56 | + |
| 57 | +.. csv-table:: Backend usage and compiler |
| 58 | + :header: Feature, PyCUDA, CuPy, CUDA-Python, HIP |
| 59 | + :widths: auto |
| 60 | + |
| 61 | + Python package, "pycuda", "cupy", "cuda-python", "pyhip-interface" |
| 62 | + Selected with lang=, "CUDA", "CUPY", "NVCUDA", "HIP" |
| 63 | + Compiler used, "nvcc", "nvrtc", "nvrtc", "hiprtc" |
| 64 | + |
| 65 | + |
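The backend selection described in the added documentation can be illustrated with a short sketch. It assumes that the `lang` values from the "Backend usage and compiler" table are passed to `kernel_tuner.tune_kernel`; the `vector_add` kernel, the problem size, the `block_size_x` values, and the helper name `tune_vector_add` are hypothetical and only serve the example. The kernel is declared with extern "C" linkage so that it is accepted by every backend in the table, not only CuPy.

```python
# Sketch: explicitly choosing a backend via lang= instead of relying on
# automatic selection. Kernel, sizes, and helper names are hypothetical.

KERNEL_SOURCE = r"""
extern "C" __global__ void vector_add(float *c, const float *a, const float *b, int n) {
    // block_size_x is a tunable parameter, supplied at compile time by Kernel Tuner
    int i = blockIdx.x * block_size_x + threadIdx.x;
    if (i < n) {
        c[i] = a[i] + b[i];
    }
}
"""

# lang= value -> (required Python package, compiler), copied from the table.
BACKENDS = {
    "CUDA": ("pycuda", "nvcc"),
    "CUPY": ("cupy", "nvrtc"),
    "NVCUDA": ("cuda-python", "nvrtc"),
    "HIP": ("pyhip-interface", "hiprtc"),
}


def tune_vector_add(lang="CUPY", size=1_000_000):
    """Tune the hypothetical vector_add kernel with an explicitly chosen backend."""
    if lang not in BACKENDS:
        raise ValueError(f"unknown lang {lang!r}, expected one of {sorted(BACKENDS)}")

    # Imported lazily so the module can be loaded without a GPU software stack.
    import numpy as np
    from kernel_tuner import tune_kernel

    a = np.random.randn(size).astype(np.float32)
    b = np.random.randn(size).astype(np.float32)
    c = np.zeros_like(a)
    n = np.int32(size)

    tune_params = {"block_size_x": [32, 64, 128, 256, 512]}

    # lang= overrides the automatic backend selection.
    return tune_kernel("vector_add", KERNEL_SOURCE, size, [c, a, b, n],
                       tune_params, lang=lang)
```

Calling `tune_vector_add(lang="NVCUDA")` would, under these assumptions, compile the same source with nvrtc through the CUDA-Python backend, which makes it straightforward to compare backends on one kernel. Running it requires a GPU and the corresponding Python package from the table.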