I wrote a CUDA implementation of the convolution in order to do fault injection on a CNN, but the injector does not detect any instructions. I use PyTorch C++ as the framework.
When I open the file stdout.txt in the directory where I store the executable, I get the following output:
inspecting forward_cuda_kernel(at::GenericPackedTensorAccessor<float, 4ul, at::RestrictPtrTraits, int>, at::GenericPackedTensorAccessor<float, 4ul, at::RestrictPtrTraits, int>, at::GenericPackedTensorAccessor<float, 1ul, at::RestrictPtrTraits, int>, at::GenericPackedTensorAccessor<float, 4ul, at::RestrictPtrTraits, int>, int, int, int, int, int, int) - num instrs 456
followed by a very long list of instructions (more than 1300), ending with
NVBit-igprofile; ERROR FAIL in kernel execution!!
What does this error mean, and how can I fix it?