
[frontier] Add concentrated list of useful cray-mpich environment variables #1002

Open

abbotts wants to merge 2 commits into olcf:master from abbotts:sabbott/mpi_env_vars

Conversation

abbotts (Contributor) commented Sep 24, 2025

The full list of cray-mpich environment variables can be quite intimidating for most users. This PR is an effort to pull out the ones most users should be aware of and describe them in plain language.

I'm opening this as a PR because we need to iterate a bit on placement, formatting, and descriptions. There are also a few variables that didn't make this first cut that we might want to add. In particular, the ones below were on the shortlist but I decided to leave them out; perhaps they should be added back in. I feel like if we want to include these we need a more dedicated MPI debugging page.


If indicated by profiling or counters:
- `FI_MR_CACHE_MAX_COUNT` - NOT max size
- `MPICH_GPU_IPC_CACHE_MAX_SIZE`
- `FI_MR_CACHE_MONITOR`

If running complex workflows:
- `MPICH_SINGLE_HOST_ENABLED`
- `MPICH_OFI_NIC_POLICY`
    - `MPICH_OFI_NIC_VERBOSE`
    - `MPICH_OFI_NIC_MAPPING`

- `FI_CXI_RX_MATCH_MODE` - can we test how much memory this uses to start in hybrid?
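
As a rough illustration of how these shortlisted variables would actually be used, here is a minimal sketch of a Frontier batch script that sets a few of them. The project account, node counts, and values are illustrative assumptions only, not tuned recommendations; the authoritative semantics are in the `intro_mpi` and libfabric man pages.

```bash
#!/bin/bash
#SBATCH -A ABC123            # hypothetical project account
#SBATCH -N 2
#SBATCH -t 00:30:00

# Registration-cache knobs: only touch these if profiling or counters point here.
# The values below are placeholders, not recommendations.
export FI_MR_CACHE_MAX_COUNT=524288   # a count of cached registrations, NOT a byte size
export FI_MR_CACHE_MONITOR=memhooks   # assumed monitor choice; see the libfabric docs

# NIC selection for multi-NIC nodes / complex workflows.
export MPICH_OFI_NIC_POLICY=NUMA      # prefer the NIC closest to each rank's NUMA domain
export MPICH_OFI_NIC_VERBOSE=1        # report which NIC each rank selected

# Receive-match mode: hybrid falls back to software matching under load.
export FI_CXI_RX_MATCH_MODE=hybrid

srun -N 2 -n 16 ./my_app              # hypothetical launch line
```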


Setting this environment variable to ``1`` will spawn a thread dedicated to making progress on outstanding MPI communication and automatically increase the MPI thread level to MPI_THREAD_MULTIPLE.
Applications that use one-sided MPI (e.g., ``MPI_Put``, ``MPI_Get``) or non-blocking collectives (e.g., ``MPI_Ialltoall``) will likely benefit from enabling this feature.
Contributor


Interesting. My experiments with MPI_Get and MPI_Ialltoall seemed to work pretty well without the async thread. Maybe because I wasn't trying to overlap with heavy CPU-based computation?

Contributor Author


So, @timattox and I had a discussion on this, and both my recommendation (one-sided) and his (non-blocking collectives) are based on guidance we got from Krishna, but neither of us has had a chance to test it thoroughly.

I'm not sure how much CPU computation has to do with it. I think this comes down to when progress happens, and Slingshot may change some of that. Without the offloaded rendezvous, progress would only happen in a libfabric call, and that's only going to happen from an MPI call unless you have the progress thread.

The guidance in the MPICH man page is actually more broad than what we have here. It basically says "this is good for anything except blocking pt2pt".

My inclination is to leave this in for now but make a point to specifically test it over the next six months and update with what we think the right guidance is for different codes.
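
To make the knob discussed in this thread concrete, here is a minimal sketch of enabling it. The variable name is not visible in the quoted hunk; `MPICH_ASYNC_PROGRESS` is assumed here based on the progress-thread behavior described, and the core reservation shown is an illustrative choice rather than tested guidance.

```bash
# Assumption: the variable under discussion is cray-mpich's MPICH_ASYNC_PROGRESS.
# Per the quoted text, setting it to 1 spawns a progress thread per rank and
# raises the thread level to MPI_THREAD_MULTIPLE automatically.
export MPICH_ASYNC_PROGRESS=1

# Leave a spare core per rank for the progress thread (illustrative layout:
# 8 ranks per node, 2 cores per rank).
srun -N 1 -n 8 -c 2 ./my_onesided_app
```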

abbotts force-pushed the sabbott/mpi_env_vars branch from 290e0f2 to 6133621 on February 6, 2026 15:19
abbotts marked this pull request as ready for review on February 6, 2026 15:24
abbotts (Contributor, Author) commented Feb 6, 2026

Rebased to latest master. Time to bring this out of draft and work on getting it merged.

@hagertnl, @GeorgiadouAntigoni - I've had this sitting on the back burner, and I'd like to make some progress towards getting it merged and moving on to the other MPI and GPU-aware MPI documentation we've been discussing. I'm open to any content or formatting changes here.

hagertnl (Contributor) commented Feb 6, 2026

I like the current version. I get the sense that the docs need a larger reshuffle to trim down content and better organize the "tips & tricks"-like sections, but that would best be handled in a separate PR.

Do we want to include any of the outdated workarounds that shouldn't be needed anymore, or stage those for a later update?

abbotts (Contributor, Author) commented Feb 6, 2026

I'd like to handle the outdated workarounds in a separate PR. The COE needs to scrub the known issues/workarounds/fixed issues list to make sure we capture everything.

Unfortunately the issues with the new signal handler in cray-mpich/9.0.1 mean we can't get rid of nearly as many workarounds as I hoped.

