
Commit 4f0aa8f

Merge pull request #199 from MiloLurati/HIPbackend
HIP Backend
2 parents b3ff4cd + 0d96807 commit 4f0aa8f

File tree

17 files changed: +753 −57 lines


INSTALL.rst

Lines changed: 26 additions & 0 deletions

@@ -111,6 +111,31 @@ Or you could install Kernel Tuner and PyOpenCL together if you haven't done so already
 
 If this fails, please see the PyOpenCL installation guide (https://wiki.tiker.net/PyOpenCL/Installation)
 
+HIP and PyHIP
+-------------
+
+Before we can install PyHIP, you'll need the HIP runtime and compiler installed on your system.
+The HIP compiler is included as part of the ROCm software stack. Here is AMD's installation guide:
+
+* `ROCm Documentation: HIP Installation Guide <https://docs.amd.com/bundle/HIP-Installation-Guide-v5.3/page/Introduction_to_HIP_Installation_Guide.html>`__
+
+After you've installed HIP, install PyHIP by running the following command in your terminal:
+
+.. code-block:: bash
+
+    pip install pyhip-interface
+
+Alternatively, you can install PyHIP from source. First, clone the repository from GitHub:
+
+.. code-block:: bash
+
+    git clone https://github.com/jatinx/PyHIP
+
+Then navigate to the repository directory and run the following command to install:
+
+.. code-block:: bash
+
+    python setup.py install
+
 Installing the git version
 --------------------------

@@ -128,6 +153,7 @@ You can install Kernel Tuner with several optional dependencies, the full list is:
 
 - `cuda`: install pycuda along with kernel_tuner
 - `opencl`: install pyopencl along with kernel_tuner
+- `hip`: install pyhip along with kernel_tuner
 - `doc`: installs packages required to build the documentation
 - `tutorial`: install packages required to run the guides
 - `dev`: install everything you need to start development on Kernel Tuner
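Before tuning HIP kernels, it can help to confirm that PyHIP is actually importable in the current environment. A minimal sketch using only the standard library (the function name is illustrative, not part of Kernel Tuner):

```python
import importlib.util

def pyhip_available():
    """Return True if the pyhip package can be imported."""
    return importlib.util.find_spec("pyhip") is not None

if pyhip_available():
    print("PyHIP found; the HIP backend should be usable.")
else:
    print("PyHIP not found; install it with: pip install pyhip-interface")
```

Note that this only checks the Python package; a working HIP runtime and a HIP-capable device are still required to actually run kernels.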

README.rst

Lines changed: 7 additions & 2 deletions

@@ -28,9 +28,14 @@ To tune OpenCL kernels:
 
 - First, make sure you have an OpenCL compiler for your intended OpenCL platform
 - Then type: ``pip install kernel_tuner[opencl]``
 
-Or both:
+To tune HIP kernels:
 
-- ``pip install kernel_tuner[cuda,opencl]``
+- First, make sure you have a HIP runtime and compiler installed
+- Then type: ``pip install kernel_tuner[hip]``
+
+Or all:
+
+- ``pip install kernel_tuner[cuda,opencl,hip]``
 
 More information about how to install Kernel Tuner and its
 dependencies can be found in the `installation guide

doc/source/architecture.png

32.3 KB

doc/source/design.rst

Lines changed: 8 additions & 2 deletions

@@ -12,7 +12,7 @@ The Kernel Tuner is designed to be extensible and support
 different search and execution strategies. The current architecture of
 the Kernel Tuner can be seen as:
 
-.. image:: architecture_0.4.3.png
+.. image:: architecture.png
    :width: 500pt
 
 At the top we have the kernel code and the Python script that tunes it,

@@ -48,7 +48,7 @@ building blocks for implementing runners.
 The observers are explained in :ref:`observers`.
 
 At the bottom, the backends are shown.
-PyCUDA, CuPy, cuda-python and PyOpenCL are for tuning either CUDA or OpenCL kernels.
+PyCUDA, CuPy, cuda-python, PyOpenCL and PyHIP are for tuning CUDA, OpenCL, or HIP kernels.
 The C Functions implementation can actually call any compiler; typically NVCC
 or GCC is used. There is limited support for tuning Fortran kernels.

@@ -128,6 +128,12 @@ kernel_tuner.backends.c.CFunctions
    :special-members: __init__
    :members:
 
+kernel_tuner.backends.hip.HipFunctions
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+.. autoclass:: kernel_tuner.backends.hip.HipFunctions
+   :special-members: __init__
+   :members:
+
 
 Util Functions
 --------------

doc/source/index.rst

Lines changed: 7 additions & 2 deletions

@@ -27,9 +27,14 @@ To tune OpenCL kernels:
 
 - First, make sure you have an OpenCL compiler for your intended OpenCL platform
 - Then type: ``pip install kernel_tuner[opencl]``
 
-Or both:
+To tune HIP kernels:
 
-- ``pip install kernel_tuner[cuda,opencl]``
+- First, make sure you have a HIP runtime and compiler installed
+- Then type: ``pip install kernel_tuner[hip]``
+
+Or all:
+
+- ``pip install kernel_tuner[cuda,opencl,hip]``
 
 More information about how to install Kernel Tuner and its
 dependencies can be found under :ref:`install`.

examples/hip/test_vector_add.py

Lines changed: 40 additions & 0 deletions

@@ -0,0 +1,40 @@

#!/usr/bin/env python
"""Minimal example for a HIP kernel unit test with the Kernel Tuner"""

import numpy
import pytest

from kernel_tuner import run_kernel

# Check that PyHIP is installed; if not, skip the test
try:
    from pyhip import hip, hiprtc
except ImportError:
    pytest.skip("PyHIP not installed or PYTHONPATH does not include PyHIP",
                allow_module_level=True)


def test_vector_add():

    kernel_string = """
    __global__ void vector_add(float *c, float *a, float *b, int n) {
        int i = blockIdx.x * block_size_x + threadIdx.x;
        if (i<n) {
            c[i] = a[i] + b[i];
        }
    }
    """

    size = 10000000
    problem_size = (size, 1)

    a = numpy.random.randn(size).astype(numpy.float32)
    b = numpy.random.randn(size).astype(numpy.float32)
    c = numpy.zeros_like(b)
    n = numpy.int32(size)

    args = [c, a, b, n]
    params = {"block_size_x": 512}

    answer = run_kernel("vector_add", kernel_string, problem_size, args, params, lang="HIP")

    assert numpy.allclose(answer[0], a + b, atol=1e-8)
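The guard ``if (i < n)`` in the kernel matters because the grid is launched in whole blocks: every thread computes a global index ``i = blockIdx.x * block_size_x + threadIdx.x``, and surplus threads in the last block simply do nothing. A small pure-Python sketch of that indexing (the function name is illustrative, not part of Kernel Tuner):

```python
import math

def global_indices(n, block_size_x):
    """Indices produced by i = blockIdx.x * block_size_x + threadIdx.x,
    keeping only those that pass the if (i < n) guard."""
    num_blocks = math.ceil(n / block_size_x)  # enough whole blocks to cover n
    return [b * block_size_x + t
            for b in range(num_blocks)
            for t in range(block_size_x)
            if b * block_size_x + t < n]

# With n = 1000 and block_size_x = 512, two blocks are launched and the
# guard discards the 24 surplus threads of the second block.
assert global_indices(1000, 512) == list(range(1000))
```

This is why the kernel stays correct for every ``block_size_x`` the tuner tries, even when it does not divide the problem size evenly.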

examples/hip/vector_add.py

Lines changed: 43 additions & 0 deletions

@@ -0,0 +1,43 @@

#!/usr/bin/env python
"""This is the minimal example from the README"""

import logging
from collections import OrderedDict

import numpy
from kernel_tuner import tune_kernel
from kernel_tuner.file_utils import store_output_file, store_metadata_file


def tune():

    kernel_string = """
    __global__ void vector_add(float *c, float *a, float *b, int n) {
        int i = blockIdx.x * block_size_x + threadIdx.x;
        if (i<n) {
            c[i] = a[i] + b[i];
        }
    }
    """

    size = 10000000

    a = numpy.random.randn(size).astype(numpy.float32)
    b = numpy.random.randn(size).astype(numpy.float32)
    c = numpy.zeros_like(b)
    n = numpy.int32(size)

    args = [c, a, b, n]

    tune_params = OrderedDict()
    tune_params["block_size_x"] = [128 + 64 * i for i in range(15)]

    results, env = tune_kernel("vector_add", kernel_string, size, args, tune_params, lang="HIP",
                               cache="vector_add_cache.json", log=logging.DEBUG)

    # Store the metadata of this run
    store_metadata_file("vector_add-metadata.json")

    return results


if __name__ == "__main__":
    tune()
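The search space in this example is just the list that the comprehension produces: 15 candidate block sizes, which ``tune_kernel`` benchmarks one configuration at a time, recording each measurement in the cache file:

```python
# Same expression as tune_params["block_size_x"] in the example above
block_sizes = [128 + 64 * i for i in range(15)]

print(len(block_sizes), block_sizes[0], block_sizes[-1])  # 15 configurations: 128 .. 1024
```

Every candidate is a multiple of 64, matching the wavefront size of AMD GPUs, so no configuration wastes lanes within a wavefront.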
Lines changed: 45 additions & 0 deletions

@@ -0,0 +1,45 @@

#!/usr/bin/env python
"""This is the minimal example from the README"""

import os
from collections import OrderedDict

import numpy
from kernel_tuner import tune_kernel


def tune():

    kernel_string = """
    __global__ void vector_add(float *c, float *a, float *b, int n) {
        int i = blockIdx.x * block_size_x + threadIdx.x;
        if (i<n) {
            c[i] = a[i] + b[i];
        }
    }
    """

    size = 10000000

    a = numpy.random.randn(size).astype(numpy.float32)
    b = numpy.random.randn(size).astype(numpy.float32)
    c = numpy.zeros_like(b)
    n = numpy.int32(size)

    args = [c, a, b, n]

    tune_params = OrderedDict()
    tune_params["block_size_x"] = [128 + 64 * i for i in range(15)]

    filename = "vector_add_cache.json"
    if os.path.isfile(filename):
        results, env = tune_kernel("vector_add", kernel_string, size, args, tune_params,
                                   strategy="random_sample", strategy_options=dict(max_fevals=10),
                                   lang="HIP", simulation_mode=True, cache=filename)
    else:
        print(f"{filename} does not exist in the directory, run vector_add.py first.")


if __name__ == "__main__":
    tune()
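With ``simulation_mode=True``, ``tune_kernel`` never launches anything on the GPU: it replays the timings recorded in the cache file written by the earlier real run in vector_add.py. The sketch below illustrates only that replay idea; the keys and values are a made-up layout, not Kernel Tuner's actual cache schema:

```python
import json
import os
import tempfile

# Hypothetical cache contents: configuration -> measured time in ms.
# The real Kernel Tuner cache file is richer; this is an illustration only.
fake_cache = {"128": 1.92, "256": 1.41, "512": 1.07}

with tempfile.NamedTemporaryFile("w", suffix=".json", delete=False) as f:
    json.dump(fake_cache, f)
    path = f.name

# "Simulation": pick the best configuration purely from recorded timings,
# without compiling or running any kernel.
with open(path) as f:
    recorded = json.load(f)
best = min(recorded, key=recorded.get)
print(f"best block_size_x: {best} ({recorded[best]} ms)")  # best block_size_x: 512 (1.07 ms)
os.remove(path)
```

This is why simulation mode is useful for comparing search strategies (here ``random_sample`` with ``max_fevals=10``): each strategy run only reads the cache, so experiments are fast and reproducible.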
