Conversation
…nfiguration to compile skeleton.
|
This PR only adds a command-line build system using Fab for the Skeleton apps. It is not at all integrated into cylc or any other test suite. Tests have been added to the infrastructure/build/fab scripts which cover the newly added infrastructure files there. Some unit tests are based on some AI input to setup the frame work, but have been manually tweaked to ensure code coverage. Documentation is in form of a README in the above directory, please let me know if you want me to add more elsewhere. |
MatthewHambley
left a comment
There was a problem hiding this comment.
I note that a lot of requests made in the original review on SRS have not been addressed so for convenience I have repeated them here.
There was a problem hiding this comment.
Generic build components should be in lfric_build at the top level, not infrastructure/build.
There was a problem hiding this comment.
All tools, and nearly all makefiles and PSyclone tools reside in core/infrastructure, while lfric_build is a basically empty directory (containing an unused scripts and tests for it). Wouldn't it be easier for all developers to maintain that mapping (which also keeps the root directory one directory cleaner)?
Is there a document outlining these 'should be' statements? I get caught up frequently by something that is envisioned to work differently, which causes me a lot of wasted time. And I would also like to understand the reasoning behind some of these rules.
There was a problem hiding this comment.
The build tooling is where it is due to it all being in Gung-ho in the early days. When the infrastructure was calved off, the build tooling went with it because it was not Gung-ho specific. It is also not infrastructure specific but has not been moved due to resourcing constraints.
Now we are moving to a new build system, we can put the generic build tooling where it belongs, at the top level. This process has been started by creating a directory for it and adding a module for functionality we knew we wanted.
There is no document because the restructuring of the core repository is still under discussion. We do know, however, that we want the generic build support at the top level. I include the most recent diagram of our thinking regarding the rest of the layout. This is liable to change in detail, but the broad brush strokes are correct.
Given that the only change required is to move the files, very little effort has been wasted. The mechanism through which to we find out about these plans is through review.
There was a problem hiding this comment.
Thanks, the context really helped, appreciated!
There was a problem hiding this comment.
Generic build components should be in lfric_build at the top level, not infrastructure/build.
There was a problem hiding this comment.
Same question: why. Templaterator, ... are all what I would call generic build components, and the are under infrastructure/build. Why make it more confusing for the user?
There was a problem hiding this comment.
Yes, but they are in the wrong place too.
There was a problem hiding this comment.
OK, I didn't know that this kind of refactoring was the goal. If all these should should be moved into the build directory, it makes sense. I will do this as the final commit before resubmitting this.
|
|
||
| :returns List[str]: list of all supported compiler profiles. | ||
| ''' | ||
| return ["full-debug", "fast-debug", "production", "unit-tests"] |
There was a problem hiding this comment.
There's a general problem with there being nothing magical or required about "full debug" or any of the other "profiles." However, specifically there are also integration tests which don't seem to have been covered.
There was a problem hiding this comment.
Not sure what you mean here. It is the main idea that there is nothing magical, allowing sites to add their own configs (e.g. memory-leack-check, performance-check).
There was a problem hiding this comment.
That is certainly a point, but also unit and integration tests don't have profiles. They always use the same set of arguments.
There was a problem hiding this comment.
OK, but still, tests should have some well defined flags. I see two options:
- They use the same flags used in the actual build. Then we don't need a separate set of test-options
- There are dedicated options for testing, which might be different from the main executable.
In general, I would suggest 1. (i.e. testing will be compiled in the same fab-repository), and this would make sure that the right code is tested. But if tests depend on compiler optimisation (I never looked into details), then we would need a dedicated mode for tests, to consistently test the right thing.
| def setup_cray(self, build_config: BuildConfig) -> None: | ||
| ''' | ||
| This method sets up the Cray compiler and linker flags. | ||
| For now call an external function, since it is expected that | ||
| this configuration can be very lengthy (once we support | ||
| compiler modes). | ||
|
|
||
| :param build_config: the Fab build configuration instance | ||
| :type build_config: :py:class:`fab.BuildConfig` | ||
| ''' | ||
| setup_cray(build_config, self.args) |
There was a problem hiding this comment.
Having a function called setup_cray which calls a function called setup_cray can only ever cause confusion. That goes for all the others. Why does this facade exist?
There was a problem hiding this comment.
A few reasons, one being pep8 support. If we copied all these files into one config file, it would grow very big (close to 1000 lines, which is afaik a pep8 limit). Even if it's under, having big files make it much harder to manage scripts. For example, if you search for a ifort command line option, you might end up in the ifx section of the script - which you might not be able to see on the screen.
It also makes it easier to setup a site-configuration (it makes it very obvious which files you need if you want to start from scratch). And it kind of simulate current makefile system as well, where you have one config/make file per compiler suite.
One option we could use would be to rewrite them as mixin. I didn't to it this way since that would potentially create havoc if a user adds other (existing) methods to these scripts as mixin, then python's resolution order would apply, and ... chaotic things would happen.
There was a problem hiding this comment.
The presence of these confusingly named duplicate functions is a smell. It suggests a problem with the design. Maybe the concept of site configurations would benefit from further consideration.
There was a problem hiding this comment.
I am not sure what you mean with duplicate functions. The method names (setup_cray(self)) and functions called (setup_cray())? I assume this is what you mean, and will rename the latter to setup_cray_script().
We could remove the setup_cray(self) method, but the advantage of having a method in between is that a site that has a few different compilers available, can overwrite these setup methods to do customisation (i.e. each compiler is updated in its own function). ATM, I mostly set up a site in update_toolbox (since it's only one compiler in what I have done so far), so if you setup three compilers that would make for a rather longish function.
Maybe I should change this and set up compiler by overwriting the setup_xx method.
There was a problem hiding this comment.
Part of the user experience we want to have is that building is always achieved through the same procedure. Specifically, changing into a project directory and issuing a command. The same command. So please rename all build scripts to build. Note that we do not add the .py extension to executables.
There was a problem hiding this comment.
@yaswant , @t00sa , @Pierre-siddall , @cameronbateman-mo
I did not see this mentioning in any coding style. IMHO, the fafct that typical coding filename extensions are not used in LFRic is pretty ... confusing. You can tell from the extension which kind of script it is, it makes it easier for tools to do the proper highlighting (and avoids the issue that you try to view a binary :) ). I always find it confusing if I see a script and can't even be certain if it is a script in the first place. If this is indeed LFRic coding style, it should be documented to become explicit.
Additionally, esp. if you work on a base class which will affect several other derived classes, having individual names for each applications make it much easier to know where you are in your editor (since an editor tab will usually not have enough space to display the whole path). E.g. you add or modify a method in LFRicBase, and then open one derived class after another (and then need to go back to fix something else up). I often end up with half a dozen fab* scripts. If they were all called build, I would not be able to see which tab is which build script (am happy to share a screenshot of my editor).
Also, using build might give the wrong impression to users, since there is no indication that this script requires Fab to be installed. An alternative would be to use .fav as suffix, but that then also confuses editors :)
Furthermore, we are still experimenting about the best design of scripts. For example, I often use PSyclone's kernel extraction, which requires a new step to be run before PSyclone (and files to be added to the build system etc). ATM, I am using a mixin class to add the new step, and than for any apps that I want to use kernel extraction with, I just create a derived class form the Fab build of this class, add the mixin, and it all works. But I have two different build scripts then (e.g. fab_lfric_atm.py, and fab_lfric_atm_extract.py. Obviously, other designs are possible (use extraction as a command line option and have if-tests to decide in the script what steps to do). For now, I prefer the design that allows me to create a derived script (since kernel extraction is pretty niche, so I don't want other people to get confused by code that in all likelihood they will never need). Additionally, we also use this in lfric inputs, to have separate scripts that build part of the tools (and one script that builds all). Using OO design here makes it easy to avoid code duplication. But we obviously can't do that with just one name.
IMHO, the condition of 'just one name' feels like a left-over from using Makefiles :) We have the opportunity with Fab to allow to design code that's easier to understand and manage, and given that I don't see the 'one name' as an explicit requirement, I would argue that this is the better approach.
Feedback welcome.
There was a problem hiding this comment.
It is common practice to drop the extension for executables, for instance executable shell scripts do not usually have the .sh extension. It is not in any code style document because it is common practice. The extension is still used when the file is a module in a package. It doesn't cause confusion because from the user's point of view it isn't a Python (or anything else) file, it is an executable which they will execute.
Most editors I have come across which provide syntax highlighting will also interrogate the "shebang" line to inform highlighting mode. Any executable implemented using Python should have a shebang.
Yes, having multiple identically named files open can be confusing, but this will be an infrequent occurrence. Particularly after initial development. However, the key thing to bare in mind here is that this is user facing and so should favour ease of use for the user, even when that is at the expense of development ease.
The requirement of the Fab framework is a project level one and will be documented at that level. A try/except block around the import could be used to provide a fuller error message but the user will still get a "missing module" error without it.
If there is a requirement to support exotic usage such as kernel extraction then projects which require this can have a .py file somewhere within them, possibly a directory fab which can then be imported into the build script and anywhere else it is needed.
Rather than being a "left-over" we looked at the way Make has a consistent user experience and said "Yes! That's what we want for our new system too." Furthermore, it is a common approach among build, and related, tooling. See CMake or Ninja which both use a standard command and a commonly named configuration file. An advantage of this approach is to clearly signal to the user when they can build (they see the configuration file) and how to do it, the use the build command.
I note that all my comments and questions to your original reviews (as originally agreed a few months ago on MetOffice/lfric-baf#83) have not been addressed so for convenience I will repeat them here. It will take me a while to get through all your comments, but I will start adding comments now, so ideally we can discuss some of the issue and reach an agreement while I still work on other comments. |
… build system and scripts.
|
@MatthewHambley , thanks a lot for you helpful review. |
MatthewHambley
left a comment
There was a problem hiding this comment.
Only a few things left to bottom out.
There was a problem hiding this comment.
It is common practice to drop the extension for executables, for instance executable shell scripts do not usually have the .sh extension. It is not in any code style document because it is common practice. The extension is still used when the file is a module in a package. It doesn't cause confusion because from the user's point of view it isn't a Python (or anything else) file, it is an executable which they will execute.
Most editors I have come across which provide syntax highlighting will also interrogate the "shebang" line to inform highlighting mode. Any executable implemented using Python should have a shebang.
Yes, having multiple identically named files open can be confusing, but this will be an infrequent occurrence. Particularly after initial development. However, the key thing to bare in mind here is that this is user facing and so should favour ease of use for the user, even when that is at the expense of development ease.
The requirement of the Fab framework is a project level one and will be documented at that level. A try/except block around the import could be used to provide a fuller error message but the user will still get a "missing module" error without it.
If there is a requirement to support exotic usage such as kernel extraction then projects which require this can have a .py file somewhere within them, possibly a directory fab which can then be imported into the build script and anywhere else it is needed.
Rather than being a "left-over" we looked at the way Make has a consistent user experience and said "Yes! That's what we want for our new system too." Furthermore, it is a common approach among build, and related, tooling. See CMake or Ninja which both use a standard command and a commonly named configuration file. An advantage of this approach is to clearly signal to the user when they can build (they see the configuration file) and how to do it, the use the build command.
|
|
||
| :returns List[str]: list of all supported compiler profiles. | ||
| ''' | ||
| return ["full-debug", "fast-debug", "production", "unit-tests"] |
There was a problem hiding this comment.
That is certainly a point, but also unit and integration tests don't have profiles. They always use the same set of arguments.
| def setup_cray(self, build_config: BuildConfig) -> None: | ||
| ''' | ||
| This method sets up the Cray compiler and linker flags. | ||
| For now call an external function, since it is expected that | ||
| this configuration can be very lengthy (once we support | ||
| compiler modes). | ||
|
|
||
| :param build_config: the Fab build configuration instance | ||
| :type build_config: :py:class:`fab.BuildConfig` | ||
| ''' | ||
| setup_cray(build_config, self.args) |
There was a problem hiding this comment.
The presence of these confusingly named duplicate functions is a smell. It suggests a problem with the design. Maybe the concept of site configurations would benefit from further consideration.
| from templaterator import Templaterator | ||
|
|
||
|
|
||
| class LFRicBase(FabBase): |
There was a problem hiding this comment.
Zero configuration is an important feature of Fab but the fact that this class exists means it is not being used. This is "some configuration" mode.
| "point precision.") | ||
|
|
||
| group.add_argument( | ||
| '--precision-default', type=str, default=None, |
There was a problem hiding this comment.
What does this default actually mean in a world where the defaults are defined elsewhere and are not uniform. i.e. different precision bubbles default to different values.
| except ValueError: | ||
| pass |
There was a problem hiding this comment.
That was not clear from the code. Some comments to explain each attempt would help.
There was a problem hiding this comment.
Yes, but they are in the wrong place too.
There was a problem hiding this comment.
The build tooling is where it is due to it all being in Gung-ho in the early days. When the infrastructure was calved off, the build tooling went with it because it was not Gung-ho specific. It is also not infrastructure specific but has not been moved due to resourcing constraints.
Now we are moving to a new build system, we can put the generic build tooling where it belongs, at the top level. This process has been started by creating a directory for it and adding a module for functionality we knew we wanted.
There is no document because the restructuring of the core repository is still under discussion. We do know, however, that we want the generic build support at the top level. I include the most recent diagram of our thinking regarding the rest of the layout. This is liable to change in detail, but the broad brush strokes are correct.
Given that the only change required is to move the files, very little effort has been wasted. The mechanism through which to we find out about these plans is through review.
| def get_psyclone_config(self) -> List[str]: | ||
| ''' | ||
| :returns: the command line options to pick the right | ||
| PSyclone config file. | ||
| ''' | ||
| return ["--config", str(self._psyclone_config)] |
There was a problem hiding this comment.
Obviously I am not asking you to explain how OO works for every method. But when an otherwise pointless method exists, for the purpose of being over-ridden, that is worth documenting.
PR Summary
Sci/Tech Reviewer: @MatthewHambley
Code Reviewer: @t00sa
Adds a first Fab build script for Skeleton. To keep this change minimal, it's command line only (i.e. no cylc integration, which can come later).
Code Quality Checklist
Testing
trac.log
Security Considerations
Performance Impact
AI Assistance and Attribution
Documentation
PSyclone Approval
Sci/Tech Review
(Please alert the code reviewer via a tag when you have approved the SR)
Code Review