30 commits
6cbe966
For #1823. Add Kernel symbols when Kernel is created.
arporter Jan 21, 2026
ba8f599
#1823 first sketch of mixin [skip ci]
arporter Jan 21, 2026
5b4b68b
#1823 WIP fixing module-inlining for multiple kernel calls [skip ci]
arporter Jan 21, 2026
95b5710
#1823 WIP removing .module_inline setter and fixing tests [skip ci]
arporter Jan 22, 2026
1bb59eb
#1823 fix module-inline tests
arporter Jan 23, 2026
3811893
#1823 WIP fixing tests [skip ci]
arporter Jan 23, 2026
3082aa8
#1823 WIP fixing more tests [skip ci]
arporter Jan 26, 2026
2ae13c8
Merge branch 'master' into 1823_transform_inlined_kerns_only
arporter Jan 26, 2026
564e46d
#1823 more test fixes [skip ci]
arporter Jan 26, 2026
134cd05
#1823 fix lint [skip ci]
arporter Jan 26, 2026
06341f7
#1823 WIP fixing tests [skip ci]
arporter Jan 26, 2026
0230c2f
#1823 more test fixing
arporter Jan 27, 2026
8fec073
Merge branch 'master' into 1823_transform_inlined_kerns_only
arporter Jan 27, 2026
240f86d
#1823 WIP fixing examples [skip ci]
arporter Jan 27, 2026
b269f7a
#1823 add docstrings to new mixin
arporter Jan 27, 2026
5a88b29
Merge branch 'master' into 1823_transform_inlined_kerns_only
arporter Feb 2, 2026
9d8ac4a
#1823 begin adding new tests [skip ci]
arporter Feb 2, 2026
1c07cb1
Merge branch 'master' into 1823_transform_inlined_kerns_only
arporter Feb 4, 2026
2e0dbc1
#1823 get coverage of new mixin
arporter Feb 4, 2026
f73cd68
#1823 remove unused/unreachable code and reinstate some tests
arporter Feb 4, 2026
2eaf62c
#1823 allow for ContainerSymbol from which kernel is imported to be i…
arporter Feb 4, 2026
7d8ada3
Merge branch 'master' into 1823_transform_inlined_kerns_only
arporter Feb 11, 2026
c5e1211
Merge branch 'master' into 1823_transform_inlined_kerns_only
arporter Feb 19, 2026
001e8e5
#3294 update documentation
arporter Feb 19, 2026
24072f4
#3294 rm kernel-naming option from source
arporter Feb 19, 2026
7c2e8e8
#1823 updates for review
arporter Feb 19, 2026
41f84da
Merge branch 'master' into 1823_transform_inlined_kerns_only
arporter Feb 19, 2026
b68c40d
#1823 rm _modified property of CodedKern
arporter Feb 19, 2026
192c0f7
#1823 update kernel-transformation documentation
arporter Feb 20, 2026
63bd9a9
#1823 fix tabs and update command-line help message
arporter Feb 20, 2026
31 changes: 0 additions & 31 deletions doc/developer_guide/APIs.rst
Original file line number Diff line number Diff line change
@@ -949,37 +949,6 @@ exchange before the loop) or add existing halo exchanges after a loop
(as an increase in depth will only make it more likely that a halo
exchange is no longer required after the loop).

Kernel Transformations
++++++++++++++++++++++

Since PSyclone is invoked separately for each Algorithm file in an
application, the naming of the new, transformed kernels is done with
reference to the kernel output directory. All transformed kernels (and
the modules that contain them) are re-named following the PSyclone
Fortran naming conventions (:ref:`lfric-conventions`). This enables the
reliable identification of transformed versions of any given kernel
within the output directory.

If the "multiple" kernel-renaming scheme is in use, PSyclone simply
appends an integer to the original kernel name, checks whether such a
kernel is present in the output directory and if not, creates it. If a
kernel with the generated name is present then the integer is
incremented and the process repeated. If the "single" kernel-renaming
scheme is in use, the same procedure is followed but if a matching
kernel is already present in the output directory then the new kernel
is not written (and we check that the contents of the existing kernel
are the same as the one we would create).

If an application is being built in parallel then it is possible that
different invocations of PSyclone will happen simultaneously and
therefore we must take care to avoid race conditions when querying the
filesystem. For this reason we use ``os.open``::

fd = os.open(<filename>, os.O_CREAT | os.O_WRONLY | os.O_EXCL)

The ``os.O_CREAT`` and ``os.O_EXCL`` flags in combination mean that
``os.open()`` raises an error if the file in question already exists.
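
The renaming procedure described above can be sketched as follows. This is
only an illustration of the technique and is not PSyclone's implementation:
the function name ``create_unique_file``, the ``<base>_<n><suffix>`` naming
pattern and the ``.f90`` suffix are all assumptions made for the example.

```python
import os
import tempfile


def create_unique_file(directory: str, base: str, suffix: str = ".f90") -> str:
    """Atomically create '<base>_<n><suffix>' in 'directory' using the
    smallest free integer n, and return the path of the new (empty) file.

    Hypothetical helper: the naming pattern is an assumption.
    """
    index = 0
    while True:
        path = os.path.join(directory, f"{base}_{index}{suffix}")
        try:
            # O_CREAT | O_EXCL: creation fails if the file already exists,
            # so the existence check and the creation are a single step
            # and two concurrent processes cannot both claim this name.
            fd = os.open(path, os.O_CREAT | os.O_WRONLY | os.O_EXCL)
        except FileExistsError:
            # Name already taken: increment the integer and try again.
            index += 1
            continue
        os.close(fd)
        return path


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmpdir:
        first = create_unique_file(tmpdir, "testkern")
        second = create_unique_file(tmpdir, "testkern")
        print(os.path.basename(first), os.path.basename(second))
```

Because the file is created in the same system call that checks for its
existence, there is no window between "check" and "create" in which another
process could claim the same name.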

Colouring
+++++++++

35 changes: 25 additions & 10 deletions doc/developer_guide/transformations.rst
@@ -48,21 +48,36 @@ Transformations
Kernel Transformations
======================

PSyclone is able to perform kernel transformations by obtaining the PSyIR
representation of the kernel with:
In order to transform a PSyKAl Kernel while applying transformations to a
generated PSy layer, the kernel routine must first be brought into the
same source module as the PSy-layer subroutine from which it is called.
This is achieved using ``KernelModuleInlineTrans``:

.. autoclass:: psyclone.domain.common.transformations.KernelModuleInlineTrans
:noindex:

Once the PSy-layer has its own, private copy of the Kernel, it may
subsequently be transformed.

.. note:: Currently ``KernelModuleInlineTrans`` does not support re-naming
the in-lined Kernel routine. This means that *all* calls to that
Kernel in that source file are updated so as to call the same,
local copy. #2846 will lift this limitation.

To transform a kernel, one must first obtain its PSyIR with:

.. automethod:: psyclone.psyGen.CodedKern.get_callees
:no-index:

The result of `psyclone.psyGen.Kern.get_callees` is a list of
`psyclone.psyir.nodes.KernelSchedule` objects. `KernelSchedule` is a
specialisation of the `Routine` class with the `is_program` and `return_type`
properties set to False` and `None`, respectively.
The result of ``psyclone.psyGen.Kern.get_callees`` is a list of
``psyclone.psyir.nodes.KernelSchedule`` objects. ``KernelSchedule`` is a
specialisation of the ``Routine`` class with the ``is_program`` and
``return_type`` properties set to ``False`` and ``None``, respectively.

In addition to modifying the kernel PSyIR with the desired transformations,
the `modified` flag of the `CodedKern` node has to be set. This will let
PSyclone know which kernel files it may have to rename and rewrite
during the code generation.
.. note:: A Kernel can of course be transformed independently of constructing
a PSy layer by running PSyclone on the source file and treating it
as generic Fortran rather than a DSL Kernel. This is a matter for an
application's build system.
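
The workflow described above (module-inline first, then obtain the kernel
PSyIR) might be sketched as below. The helper ``inline_and_get_schedules`` is
hypothetical and is not part of PSyclone. It relies only on the calls that
appear in this pull request (``kernels()``, ``apply()``, ``get_callees()``)
and assumes that every kernel it visits is a coded kernel.

```python
def inline_and_get_schedules(psyir, inline_trans):
    """Module-inline every kernel called from 'psyir' and return a dict
    mapping each kernel name to the list returned by its get_callees().

    Hypothetical helper: 'psyir' is expected to be the PSy-layer PSyIR and
    'inline_trans' an instance of KernelModuleInlineTrans.
    """
    schedules = {}
    for kern in psyir.kernels():
        # The kernel must be module-inlined before it can be transformed.
        inline_trans.apply(kern)
        # get_callees() returns a list of KernelSchedule objects.
        schedules[kern.name] = kern.get_callees()
    return schedules


# Intended use inside a PSyclone transformation script (not executed here):
#
#   from psyclone.domain.common.transformations import KernelModuleInlineTrans
#
#   def trans(psyir):
#       found = inline_and_get_schedules(psyir, KernelModuleInlineTrans())
#       for name, callees in found.items():
#           for sched in callees:
#               ...  # apply further transformations to each KernelSchedule
```

Passing the transformation in as an argument keeps the helper free of
PSyclone imports, so it can be exercised with simple stand-in objects.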

Raising Transformations
=======================
8 changes: 4 additions & 4 deletions doc/developer_guide/working_practises.rst
@@ -153,12 +153,12 @@ fortran_writer Provides a Fortran PSyIR back-end object to convert PSyIR
have_graphviz True if the Python bindings to the graphviz package (used when
generating DAG visualisations) are available. Does *not* check
that the underlying graphviz library is installed.
kernel_outputdir Sets the output directory used by PSyclone for transformed
kernel_outputdir Sets the output directory used by PSyclone for generated
kernels to be `tmpdir` (a built-in pytest fixture) and then
returns `tmpdir`. Any test that directly or indirectly causes
kernels to be transformed needs to use this fixture in order
to avoid having unwanted files created within the git working
tree.
OpenCL versions of kernels to be created must use this fixture
in order to avoid having unwanted files created within the git
working tree.
parser Creates an fparser2 parser for the Fortran2008 standard. This
is an expensive operation so this fixture is only run once
per test session.
46 changes: 7 additions & 39 deletions doc/user_guide/psyclone_command.rst
@@ -56,7 +56,6 @@ by the command:
[-l {off,all,output}] [-p {invokes,routines,kernels}]
[-o OUTPUT_FILE] [-api DSL] [-oalg OUTPUT_ALGORITHM_FILE] [-opsy OUTPUT_PSY_FILE]
[-okern OUTPUT_KERNEL_PATH] [-dm] [-nodm]
[--kernel-renaming {multiple,single}]
[--log-level {OFF,DEBUG,INFO,WARNING,ERROR,CRITICAL}] [--log-file LOG_FILE]
[--keep-comments] [--keep-directives] [--keep-conditional-openmp-statements]
[-I INCLUDE] [-d DIRECTORY]
@@ -98,12 +97,10 @@ by the command:
-opsy OUTPUT_PSY_FILE
(psykal mode) filename of generated PSy-layer code
-okern OUTPUT_KERNEL_PATH
(psykal mode) directory in which to put transformed kernels, default
(psykal mode) directory in which to put any generated kernels, default
is the current working directory
-dm, --dist_mem (psykal mode) generate distributed memory code
-nodm, --no_dist_mem (psykal mode) do not generate distributed memory code
--kernel-renaming {multiple,single}
(psykal mode) naming scheme to use when re-naming transformed kernels
--log-level {OFF,DEBUG,INFO,WARNING,ERROR,CRITICAL}
sets the level of the logging (defaults to OFF).
--log-file LOG_FILE sets the output file to use for logging (defaults to stderr).
@@ -389,6 +386,8 @@ For example the following command will generate GOcean PSyKAl code with DM:
See :ref:`psyclone usage for PSyKAl <psykal_usage>` section for more information
about how to use PSyKAl DSLs.

.. _psykal-file-output:

PSyKAl file output
^^^^^^^^^^^^^^^^^^

@@ -402,14 +401,11 @@ the algorithm code will be output to the terminal:

psyclone -opsy psy.f90 algorithm.f90

If PSyclone is being used to transform Kernels then the location to
write these to is specified using the ``-okern <directory>``
If PSyclone is being used to generate OpenCL Kernels (see :ref:`opencl`) then
the location to write these to is specified using the ``-okern <directory>``
option. If this is not supplied then they are written to the current
working directory. By default, PSyclone will overwrite any kernel of
the same name in that directory. To change this behaviour, the user
can use the ``--no_kernel_clobber`` option. This causes PSyclone to
re-name any transformed kernel that would clash with any of those
already present in the output directory.
working directory. Either way, PSyclone will ensure that unique filenames
are used.

Algorithm files with no invokes
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
@@ -489,34 +485,6 @@ specified directory:
only one instance of the specified file within (or below) the
specified directories.

Transforming PSyKAl Kernels
^^^^^^^^^^^^^^^^^^^^^^^^^^^

When transforming kernels there are two use-cases to consider:

1. a given kernel will be transformed only once and that version
then used from multiple, different Invokes and Algorithms;
2. a given kernel is used from multiple, different Invokes and
Algorithms and is transformed differently, depending on the
Invoke.

Whenever PSyclone is used to transform a kernel, the new kernel must
be re-named in order to avoid clashing with other possible calls to
the original. By default (``--kernel-renaming multiple``), PSyclone
generates a new, unique name for each kernel that is
transformed. Since PSyclone is run on one Algorithm file at a time, it
uses the chosen kernel output directory (``-okern``) to ensure that
names created by different invocations do not clash. Therefore, when
building a single application, the same kernel output directory must
be used for each separate invocation of PSyclone.

Alternatively, in order to support use case 1, a user may specify
``--kernel-renaming single``: now, before transforming a kernel,
PSyclone will check the kernel output directory and if a transformed
version of that kernel is already present then that will be
used. Note, if the kernel file on disk does not match with what would
be generated then PSyclone will raise an exception.

Enabling the Logging Infrastructure
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

25 changes: 14 additions & 11 deletions doc/user_guide/transformations.rst
@@ -175,9 +175,9 @@ alphabetical order below (a number of these have specialisations which
can be found in the API-specific sections).

.. note:: PSyclone currently only supports OpenCL and
KernelImportsToArguments transformations for the GOcean 1.0
KernelImportsToArguments transformations for the GOcean
API, the OpenACC Data transformation is limited to
the generic code transformation and the GOcean 1.0 API and the
the generic code transformation and the GOcean API and the
OpenACC Kernels transformation is limited to the generic code
transformation and the LFRic API.

@@ -774,6 +774,8 @@ caused by Taskloops, and adds OpenMP Taskwait statements to satisfy those
dependencies. An example of using OpenMP tasking is available in
`PSyclone/examples/nemo/eg1/openmp_taskloop_trans.py`.

.. _opencl:

OpenCL
------

@@ -791,9 +793,9 @@ OpenCL functionality. It also relies upon the device acceleration support
provided by the dl_esm_inf library (https://github.com/stfc/dl_esm_inf).


.. note:: The generated OpenCL kernels are written in a file called
opencl_kernels_<index>.cl where the index keeps increasing if the
file name already exist.
.. note:: The generated OpenCL kernels are written to the kernel output directory
(see :ref:`psykal-file-output`) in a file called ``opencl_kernels_<index>.cl``
where the index keeps increasing if the file name already exists.


The ``GOOpenCLTrans`` transformation accepts an `options` argument with a
@@ -905,12 +907,13 @@ porting and/or debugging of an OpenACC application as it provides
explicit control over what data is present on a device for a given
(part of an) Invoke routine.

The NVIDIA compiler compiler provides an alternative approach to controlling
data movement through its 'managed memory' option
(``-gpu=mem:managed``). When this is enabled the compiler itself takes
on the task of ensuring that data is copied to/from the GPU when
required. (Note that this approach can struggle with Fortran code
containing derived types however.)
GPU vendors often provide an alternative approach to controlling
data movement through a 'managed memory' option (``-gpu=mem:managed`` for
NVIDIA). When this is enabled, data is copied to/from the
GPU using the operating system's page-fault mechanism.
(Note that this approach can suffer from 'thrashing' if both CPU and GPU
frequently access data held in the same page. This can be a particular
issue for Fortran code containing derived types.)

As well as ensuring the correct data is copied to and from the remote
device, OpenACC directives must also be added to a code in order to
4 changes: 1 addition & 3 deletions examples/gocean/eg1/Makefile
@@ -109,8 +109,6 @@ dag:
openacc:
$(ENV) ${PYTHON} ./runme_openacc.py

# The "--kernel-renaming single" parameter avoids generating duplicate
# versions of OpenCL kernels called multiple times.
opencl:
${PSYCLONE} -nodm -s ./opencl_transformation.py --kernel-renaming single \
${PSYCLONE} -nodm -s ./opencl_transformation.py \
-api gocean -I${INF_INC} shallow_alg.f90
20 changes: 12 additions & 8 deletions examples/gocean/eg1/opencl_transformation.py
@@ -36,24 +36,26 @@
''' Module providing a PSyclone transformation script that converts the
Schedule of each Invoke to use OpenCL. '''

from psyclone.psyGen import TransInfo, InvokeSchedule
from psyclone.domain.gocean.transformations import GOOpenCLTrans, \
GOMoveIterationBoundariesInsideKernelTrans
from psyclone.psyGen import InvokeSchedule
from psyclone.domain.common.transformations import KernelModuleInlineTrans
from psyclone.domain.gocean.transformations import (
GOOpenCLTrans, GOMoveIterationBoundariesInsideKernelTrans)
from psyclone.psyir.nodes import FileContainer
from psyclone.transformations import KernelImportsToArguments


def trans(psyir):
def trans(psyir: FileContainer):
'''
Transformation routine for use with PSyclone. Converts any imported-
variable accesses into kernel arguments and then applies the OpenCL
transformation to the PSy layer.

:param psyir: the PSyIR of the PSy-layer.
:type psyir: :py:class:`psyclone.psyir.nodes.FileContainer`

'''
# Get the necessary transformations
tinfo = TransInfo()
import_trans = tinfo.get_trans_name('KernelImportsToArguments')
import_trans = KernelImportsToArguments()
mod_inline_trans = KernelModuleInlineTrans()
move_boundaries_trans = GOMoveIterationBoundariesInsideKernelTrans()
cltrans = GOOpenCLTrans()

@@ -67,9 +69,11 @@ def trans(psyir):
continue

# Remove the imports from inside each kernel and move PSy-layer
# loop boundaries inside the kernel as a mask.
# loop boundaries inside the kernel as a mask. To do this we must
# first module-inline the kernel into the PSy layer module.
for kern in schedule.kernels():
print("Update kernel: " + kern.name)
mod_inline_trans.apply(kern)
move_boundaries_trans.apply(kern)
import_trans.apply(kern)

8 changes: 6 additions & 2 deletions examples/gocean/eg3/ocl_trans.py
@@ -39,24 +39,28 @@
from psyclone.psyGen import InvokeSchedule
from psyclone.psyir.transformations import (
FoldConditionalReturnExpressionsTrans)
from psyclone.domain.common.transformations import KernelModuleInlineTrans
from psyclone.domain.gocean.transformations import (
GOOpenCLTrans, GOMoveIterationBoundariesInsideKernelTrans)
from psyclone.psyir.nodes import FileContainer


def trans(psyir):
def trans(psyir: FileContainer):
'''
Applies OpenCL to the given PSy-layer.

:param psyir: the PSyIR of the PSy-layer.
:type psyir: :py:class:`psyclone.psyir.nodes.FileContainer`

'''
mod_inline_trans = KernelModuleInlineTrans()
ocl_trans = GOOpenCLTrans()
fold_trans = FoldConditionalReturnExpressionsTrans()
move_boundaries_trans = GOMoveIterationBoundariesInsideKernelTrans()

# Provide kernel-specific OpenCL optimization options
for idx, kern in enumerate(psyir.kernels()):
# Kernel has to be module-inlined first.
mod_inline_trans.apply(kern)
# Move the PSy-layer loop boundaries inside the kernel as a kernel
# mask; this allows iteration over the whole domain
move_boundaries_trans.apply(kern)
4 changes: 2 additions & 2 deletions examples/gocean/eg4/acc_transform.py
@@ -57,7 +57,7 @@ def trans(psyir):
ltrans = ACCLoopTrans()
dtrans = ACCEnterDataTrans()
ktrans = ACCRoutineTrans()
itrans = KernelModuleInlineTrans()
mod_inline_trans = KernelModuleInlineTrans()
g2localtrans = KernelImportsToArguments()

schedule = psyir.children[0].children[0]
@@ -77,7 +77,7 @@ def trans(psyir):
# Convert any accesses to imported data into kernel arguments, put an
# 'acc routine' directive inside, and module-inline each kernel
for kern in schedule.coded_kernels():
mod_inline_trans.apply(kern)
if kern.name == "kern_use_var_code":
g2localtrans.apply(kern)
ktrans.apply(kern)
itrans.apply(kern)
13 changes: 8 additions & 5 deletions examples/gocean/eg4/ocl_transform.py
@@ -39,22 +39,25 @@
'''

from psyclone.transformations import KernelImportsToArguments
from psyclone.domain.gocean.transformations import GOOpenCLTrans, \
GOMoveIterationBoundariesInsideKernelTrans
from psyclone.domain.common.transformations import KernelModuleInlineTrans
from psyclone.domain.gocean.transformations import (
GOOpenCLTrans, GOMoveIterationBoundariesInsideKernelTrans)
from psyclone.psyir.nodes import FileContainer


def trans(psyir):
def trans(psyir: FileContainer):
'''
Transformation routine for use with PSyclone. Applies the OpenCL
transform to the first Invoke in the psy object.
transform to the first Invoke in the PSy-layer.

:param psyir: the PSyIR of the PSy-layer.
:type psyir: :py:class:`psyclone.psyir.nodes.FileContainer`

'''
# Convert any kernel accesses to imported data into arguments
mod_inline_trans = KernelModuleInlineTrans()
ktrans = KernelImportsToArguments()
for kern in psyir.kernels():
mod_inline_trans.apply(kern)
ktrans.apply(kern)

# Provide kernel-specific OpenCL optimization options