Skip to content

Bugfix nat in buddy check#614

Merged
vergauwenthomas merged 2 commits intomasterfrom
bugfix_Nat_in_buddy_check
Feb 17, 2026
Merged

Bugfix nat in buddy check#614
vergauwenthomas merged 2 commits intomasterfrom
bugfix_Nat_in_buddy_check

Conversation

@vergauwenthomas
Copy link
Owner

This pull request updates the version of the MetObs-toolkit package and makes a small but important change to the logic for synchronizing time series in the buddy check module. The update ensures that the origin and closing timestamps are determined directly from the earliest and latest values across all input series, without additional rounding.

Version bump:

  • Updated the package version from 1.0.0 to 1.0.1 in both pyproject.toml and src/metobs_toolkit/settings_collection/version.py to reflect the new release. [1] [2]

Bug fix / Logic improvement:

  • In src/metobs_toolkit/qc_collection/buddy_check.py, changed the calculation of origin and closing in the synchronize_series function to use the exact earliest and latest timestamps across all input series, removing the previous rounding logic. Also updated the corresponding docstring for clarity. [1] [2]

Copilot AI review requested due to automatic review settings February 17, 2026 08:31
@vergauwenthomas vergauwenthomas linked an issue Feb 17, 2026 that may be closed by this pull request
@vergauwenthomas vergauwenthomas merged commit 91e9afc into master Feb 17, 2026
13 checks passed
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR bumps MetObs-toolkit to 1.0.1 and updates the buddy-check time series synchronization logic to derive the synchronization window from the raw earliest/latest timestamps across input series (removing floor/ceil rounding).

Changes:

  • Bump package version from 1.0.0 to 1.0.1 in packaging metadata and internal version constant.
  • Update synchronize_series() to compute origin/closing directly from min/max timestamps (no rounding) and adjust the docstring accordingly.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.

File Description
src/metobs_toolkit/settings_collection/version.py Updates internal __version__ to 1.0.1.
pyproject.toml Updates project version to 1.0.1 for release metadata.
src/metobs_toolkit/qc_collection/buddy_check.py Changes synchronization window calculation and updates related documentation.

Comment on lines 61 to 62
# Create target datetime axes
target_dt = pd.date_range(start=origin, end=closing, freq=trg_freq)
Copy link

Copilot AI Feb 17, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using the raw earliest/latest timestamps as start/end for pd.date_range can create a target axis that is not aligned to trg_freq (it becomes anchored on the earliest timestamp). With the default instantaneous_tolerance of 4 minutes, this can cause systematic timestamp shifts or even drop otherwise valid observations when stations are on a different (but regular) grid. Consider keeping the target axis aligned to trg_freq (e.g., floor/ceil using a proper pandas offset) while still computing/reporting the unrounded min/max separately if needed.

Suggested change
# Create target datetime axes
target_dt = pd.date_range(start=origin, end=closing, freq=trg_freq)
# Align the target datetime axis to the inferred frequency by
# flooring/ceiling the global min/max timestamps to trg_freq.
origin_aligned = origin.floor(trg_freq)
closing_aligned = closing.ceil(trg_freq)
# Create target datetime axes
target_dt = pd.date_range(start=origin_aligned, end=closing_aligned, freq=trg_freq)

Copilot uses AI. Check for mistakes.
Comment on lines +34 to 35
* closing: the latest timestamp found across all input series.

Copy link

Copilot AI Feb 17, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The docstring now states that closing is “the latest timestamp found across all input series”, but pd.date_range(start=origin, end=closing, freq=trg_freq) will not necessarily include closing unless it lies exactly on the generated grid. This can make the documented definition of the target timestamps inaccurate and may hide edge cases where the last observations fall outside the generated index/tolerance window. Either clarify the wording (upper bound, not guaranteed included) or adjust the index generation to guarantee inclusion of the final boundary.

Suggested change
* closing: the latest timestamp found across all input series.
* closing: an upper time bound given by the latest timestamp found
across all input series (not necessarily itself included in the
generated target index).

Copilot uses AI. Check for mistakes.
Comment on lines +58 to +59
origin = min([s.index.min() for s in series_list])
closing = max([s.index.max() for s in series_list])
Copy link

Copilot AI Feb 17, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change alters the time-synchronization behavior in a way that can be sensitive to small instantaneous_tolerance values (default 4 minutes). There are buddy check tests, but none appear to cover a regression where one series starts/ends off the nominal trg_freq grid (or where closing is not on-grid), which is exactly where this bugfix/behavior change matters. Adding a focused unit/regression test for synchronize_series (or a buddy_check integration test) with misaligned start/end timestamps would help prevent future regressions.

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Buddy check bug --> outlier timestamps are Nan

1 participant

Comments