Draft: Make fof compare work with radar fof files#103
Open
muellch wants to merge 4 commits into
Open
Conversation
muellch
commented
May 21, 2026
- Strip the over-allocated padding by the real n_hdr/n_body counts (radar files allocate d_* larger than the real data), replacing the previous zero-append hack with an isel slice - Sort radar observations by the per-observation position dlat/dlon instead of the radar-station lat/lon, so the row order is deterministic - Size the fof_compare consistency gate and tolerance vector by n_body (real count) rather than the padded d_body - Flag veri_data cells that are NaN in one file but real in the other as differences instead of silently passing them - Guard against non-front-packed padding (via the dlat discriminator) and against multiple verification runs (d_veri > 1), failing loudly - Skip empty data frames in fof_compare's detailed log via log_dataframe - Add a realistic radar fixture (2-D veri_data, float fields, NaN padding) and unit/integration tests: padding strip, sort determinism, scattered-padding guard, and fof_compare consistent / not-consistent / different-padding / NaN-vs-real cases
huppd
reviewed
Jun 19, 2026
huppd
left a comment
Collaborator
There was a problem hiding this comment.
Thanks a lot for this new feature. It is minimal invasive, what I like very much.
I only have some minor comments. Final verdict should come from an expert and should be checked if it is compatible with the letkf pipeline.
Comment on lines
+391
to
+396
| # A value present (non-NaN) in one file but missing (NaN) in the other is a | ||
| # real difference -- an observation appeared or disappeared. The relative | ||
| # diff is NaN there and check_variable would treat NaN as within tolerance, | ||
| # silently passing it; force those cells to fail. Both-NaN stays NaN (and | ||
| # passes), since both being missing means they are equal. | ||
| only_one_nan = df_ref.isna() ^ df_cur.isna() |
Collaborator
There was a problem hiding this comment.
Wouldn't this be useful for every compute_rel_diff_dataframe and coud be used there?
Author
There was a problem hiding this comment.
Yes, should we move this out of the fof file only code path and into the general dataframe code path?
Collaborator
There was a problem hiding this comment.
I can't see any reason why not. Do you see any drawbacks?
Comment on lines
+56
to
+57
| n_rows_file1 = xr.open_dataset(file1_path).attrs["n_body"] | ||
| n_rows_file2 = xr.open_dataset(file2_path).attrs["n_body"] |
Collaborator
There was a problem hiding this comment.
I guess that needs to be discussed with an expert.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Make fof comparison work for radar fof files
files allocate d_* larger than the real data), replacing the previous
zero-append hack with an isel slice
of the radar-station lat/lon, so the row order is deterministic
count) rather than the padded d_body
differences instead of silently passing them
against multiple verification runs (d_veri > 1), failing loudly
unit/integration tests: padding strip, sort determinism, scattered-padding
guard, and fof_compare consistent / not-consistent / different-padding /
NaN-vs-real cases
Co-authored-by: Alina Yapparova alina.yapparova@meteoswiss.ch