The example `examples/vadd.jl` should vectorize, but a look at the output of `@device_code_native ...` shows no vector instructions.
There are two reasons for this behavior:
- In our LLVM source tree, LoopVectorizer is disabled by means of its cost function so that it does not interfere with the RegionVectorizer (RV).
- Julia's GPUCompiler uses its own set of optimization passes, which does not call RV.
To fix this, either re-enable LoopVectorizer or call RV in the optimization step.
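
For comparison, here is a rough host-side sketch of the same kind of check. It does not reproduce `examples/vadd.jl` or the device pipeline: `vadd!` is a hypothetical stand-in kernel, and the code inspects the host LLVM IR via `code_llvm` instead of `@device_code_native`. Whether vector types show up depends on the Julia version and the host CPU.

```julia
using InteractiveUtils  # provides code_llvm outside the REPL

# Hypothetical stand-in for the kernel in examples/vadd.jl (not the actual file).
function vadd!(c, a, b)
    @inbounds for i in eachindex(c, a, b)
        c[i] = a[i] + b[i]
    end
    return c
end

# Capture the optimized host LLVM IR as a string.
ir = sprint(code_llvm, vadd!,
            Tuple{Vector{Float32}, Vector{Float32}, Vector{Float32}})

# LLVM vector types look like `<8 x float>` (or `<vscale x 4 x float>`).
has_vector = occursin(r"<(vscale x )?\d+ x float>", ir)

println(has_vector ? "vector types found in IR" : "no vector types found in IR")
```

If the host IR contains vector types while the device output does not, that would be consistent with the vectorizer simply not running in the device pipeline.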