
Commit a551e82

Set CUDA context flag CU_CTX_SCHED_BLOCKING_SYNC
- Expected by libomptarget, which otherwise emits a warning. Should not affect correctness. Makes GPU execution more CPU-efficient because the host thread blocks instead of spinning while it waits on GPU synchronization.
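
To make the "blocking instead of spinning" point concrete, here is a small sketch of how the scheduling policy is encoded in the context flags. The numeric values are the ones defined in `cuda.h`; the helper name `describe_sched_policy` and the description strings are illustrative assumptions, not part of Numba or of this commit.

```python
# Scheduling-policy bits of a CUDA context's flags, as defined in cuda.h.
CU_CTX_SCHED_AUTO = 0x00
CU_CTX_SCHED_SPIN = 0x01
CU_CTX_SCHED_YIELD = 0x02
CU_CTX_SCHED_BLOCKING_SYNC = 0x04
CU_CTX_SCHED_MASK = 0x07  # selects only the scheduling bits

# Hypothetical descriptions, paraphrasing the driver API documentation.
_POLICY = {
    CU_CTX_SCHED_AUTO: "auto: the driver picks a policy heuristically",
    CU_CTX_SCHED_SPIN: "spin: the host thread busy-waits for the GPU",
    CU_CTX_SCHED_YIELD: "yield: the host thread yields while waiting",
    CU_CTX_SCHED_BLOCKING_SYNC: "blocking: the host thread sleeps until the GPU is done",
}


def describe_sched_policy(flags):
    """Describe the scheduling policy encoded in a context flags word."""
    return _POLICY.get(flags & CU_CTX_SCHED_MASK, "unknown")
```

Spinning gives the lowest wake-up latency but keeps a CPU core busy, whereas blocking frees the core at the cost of slightly slower wake-up, which is the trade-off the commit message refers to.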
1 parent 481e63f commit a551e82

1 file changed

Lines changed: 4 additions & 0 deletions


src/numba/openmp/omp_ir.py

@@ -113,7 +113,11 @@ def __init__(self):
 import numba.cuda.api as cudaapi
 import numba.cuda.cudadrv.libs as cudalibs
 from numba.cuda.codegen import CUDA_TRIPLE
+from numba.cuda.cudadrv import driver, enums

+# The OpenMP target runtime prefers the blocking sync flag, so we set it
+# here before creating the CUDA context.
+driver.driver.cuDevicePrimaryCtxSetFlags(0, enums.CU_CTX_SCHED_BLOCKING_SYNC)
 self.cc = cudaapi.get_current_device().compute_capability
 self.sm = "sm_" + str(self.cc[0]) + str(self.cc[1])
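
For readers who want to try the same call outside Numba, the following is a minimal sketch that calls the CUDA driver API directly through `ctypes`. It is not how this commit does it (the commit goes through Numba's driver wrapper). The function name `set_blocking_sync` and the fallback library name `libcuda.so.1` are assumptions made for illustration; the driver entry points `cuInit`, `cuDeviceGet`, and `cuDevicePrimaryCtxSetFlags` are real. On a machine without a CUDA driver the function simply returns `None`.

```python
import ctypes
import ctypes.util

CU_CTX_SCHED_BLOCKING_SYNC = 0x04  # value from cuda.h


def set_blocking_sync(device_ordinal=0):
    """Request blocking sync on a device's primary context.

    Returns the CUresult code (0 is CUDA_SUCCESS), or None when no CUDA
    driver library can be loaded on this machine.
    """
    # Assumed library name for Linux; other platforms use a different name.
    name = ctypes.util.find_library("cuda") or "libcuda.so.1"
    try:
        cuda = ctypes.CDLL(name)
    except OSError:
        return None

    rc = cuda.cuInit(0)  # the driver must be initialized first
    if rc != 0:
        return rc

    dev = ctypes.c_int()
    rc = cuda.cuDeviceGet(ctypes.byref(dev), device_ordinal)
    if rc != 0:
        return rc

    # CUDA 11+ headers map this name to the _v2 symbol; fall back to the
    # original symbol on older drivers.
    set_flags = getattr(cuda, "cuDevicePrimaryCtxSetFlags_v2", None)
    if set_flags is None:
        set_flags = cuda.cuDevicePrimaryCtxSetFlags
    return set_flags(dev, ctypes.c_uint(CU_CTX_SCHED_BLOCKING_SYNC))
```

As in the diff, the call is meant to run before the primary context is created. Older driver versions may reject a flag change on an already active primary context, so the ordering is worth preserving even where newer drivers tolerate it.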
