Often, a particular stencil-based code contains dozens or even hundreds of variations of the most "complete" stencil. For example, depending on the parameters a simulation is run with, it may need only certain terms of the most complete case.
In C/C++, many codes use some form of code generation or metaprogramming to handle the combinatorial explosion that occurs in these cases.
Has anyone explored `@parallel` kernel generation in this context? I tried creating a `@generated` function from a `@parallel` construct, but ran into some issues.
Just curious if others have thought of this. Thanks!