Skip to content

Detailed outlier flags#611

Open
vergauwenthomas wants to merge 58 commits intodevfrom
detailed_outlier_flags
Open

Detailed outlier flags#611
vergauwenthomas wants to merge 58 commits intodevfrom
detailed_outlier_flags

Conversation

@vergauwenthomas
Copy link
Owner

No description provided.

Copilot AI review requested due to automatic review settings January 20, 2026 10:09
@vergauwenthomas vergauwenthomas linked an issue Jan 20, 2026 that may be closed by this pull request
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This pull request introduces a comprehensive refactoring of the quality control (QC) outlier tracking system, replacing a dictionary-based approach with a new QCresult class. This change enables more detailed and structured tracking of QC flags, outlier timestamps, and detailed information about why records were flagged.

Changes:

  • Introduces a new QCresult class to encapsulate QC check results with flags, outliers, and detailed information
  • Refactors all QC check functions to return QCresult objects instead of DatetimeIndex objects
  • Reorganizes the buddy check implementation into a modular structure with separate method files
  • Updates SensorData class to work with QCresult objects instead of dictionaries

Reviewed changes

Copilot reviewed 22 out of 22 changed files in this pull request and generated 44 comments.

Show a summary per file
File Description
src/metobs_toolkit/qcresult.py New class defining the QCresult structure with flags, outliers, and details management
src/metobs_toolkit/sensordata.py Updated outlier handling to use QCresult objects; added _update_outliers_NEW method
src/metobs_toolkit/qc_collection/grossvalue_check.py Refactored to return QCresult instead of DatetimeIndex
src/metobs_toolkit/qc_collection/persistence_check.py Refactored to return QCresult instead of DatetimeIndex
src/metobs_toolkit/qc_collection/repetitions_check.py Refactored to return QCresult instead of DatetimeIndex
src/metobs_toolkit/qc_collection/step_check.py Refactored to return QCresult instead of DatetimeIndex
src/metobs_toolkit/qc_collection/window_variation_check.py Refactored to return QCresult instead of DatetimeIndex
src/metobs_toolkit/qc_collection/duplicated_timestamp.py New function for duplicated timestamp detection returning QCresult
src/metobs_toolkit/qc_collection/invalid_check.py New function for invalid value filtering
src/metobs_toolkit/qc_collection/common_functions.py Added create_qcresult_flags helper function
src/metobs_toolkit/qc_collection/spatial_checks/buddywrapstation.py New wrapper class for stations in buddy check with detailed tracking
src/metobs_toolkit/qc_collection/spatial_checks/buddy_check.py Refactored buddy check to use new structure and return QCresult dictionary
src/metobs_toolkit/qc_collection/spatial_checks/methods/*.py Modular organization of buddy check helper methods
src/metobs_toolkit/dataset.py Updated buddy_check methods to work with new QCresult-based system

targets = records.drop(skip_records)

# Option 1: Create a outlier label for these invalid inputs,
# and treath them as outliers
Copy link

Copilot AI Jan 20, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typo in the comment: "treath" should be "treat".

Suggested change
# and treath them as outliers
# and treat them as outliers

Copilot uses AI. Check for mistakes.
logger = logging.getLogger("<metobs_toolkit>")


@log_entry
Copy link

Copilot AI Jan 20, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The @log_entry decorator should not be used on the drop_invalid_values function as it is defined in invalid_check.py. Based on the coding guidelines, the @log_entry decorator should be applied to public methods in the main classes (like Dataset or SensorData), not to internal QC helper functions.

Suggested change
@log_entry

Copilot uses AI. Check for mistakes.
Comment on lines 102 to 104
qcresult = QCresult(
checkname="repetitions",
checksettings=locals().pop('records', None),
Copy link

Copilot AI Jan 20, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The expression locals().pop('records', None) is problematic. The locals() function returns a dictionary that represents the current local symbol table, but modifying it (like with .pop()) does not affect the actual local variables and can lead to undefined behavior. This approach will not successfully exclude 'records' from the checksettings dictionary.

Instead, you should create a copy of locals() first, then modify that copy:

checksettings = locals().copy()
checksettings.pop('records', None)

Or build the dictionary explicitly with the values you want to include.

Suggested change
qcresult = QCresult(
checkname="repetitions",
checksettings=locals().pop('records', None),
# Build checksettings from local variables, excluding the data series itself
checksettings = locals().copy()
checksettings.pop("records", None)
qcresult = QCresult(
checkname="repetitions",
checksettings=checksettings,

Copilot uses AI. Check for mistakes.
Comment on lines 61 to 63
qcresult = QCresult(
checkname="gross_value",
checksettings=locals().pop('records', None),
Copy link

Copilot AI Jan 20, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The expression locals().pop('records', None) is problematic. The locals() function returns a dictionary that represents the current local symbol table, but modifying it (like with .pop()) does not affect the actual local variables and can lead to undefined behavior. This approach will not successfully exclude 'records' from the checksettings dictionary.

Instead, you should create a copy of locals() first, then modify that copy:

checksettings = locals().copy()
checksettings.pop('records', None)

Or build the dictionary explicitly with the values you want to include.

Suggested change
qcresult = QCresult(
checkname="gross_value",
checksettings=locals().pop('records', None),
checksettings = locals().copy()
checksettings.pop('records', None)
qcresult = QCresult(
checkname="gross_value",
checksettings=checksettings,

Copilot uses AI. Check for mistakes.
Comment on lines 149 to 151
qcresult = QCresult(
checkname="persistence",
checksettings=locals().pop('records', None),
Copy link

Copilot AI Jan 20, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The expression locals().pop('records', None) is problematic. The locals() function returns a dictionary that represents the current local symbol table, but modifying it (like with .pop()) does not affect the actual local variables and can lead to undefined behavior. This approach will not successfully exclude 'records' from the checksettings dictionary.

Instead, you should create a copy of locals() first, then modify that copy:

checksettings = locals().copy()
checksettings.pop('records', None)

Or build the dictionary explicitly with the values you want to include.

Suggested change
qcresult = QCresult(
checkname="persistence",
checksettings=locals().pop('records', None),
checksettings = locals().copy()
checksettings.pop("records", None)
qcresult = QCresult(
checkname="persistence",
checksettings=checksettings,

Copilot uses AI. Check for mistakes.
logger = logging.getLogger("<metobs_toolkit>")

from .findbuddies import filter_buddygroup_by_altitude
from .samplechecks import buddy_test_a_station
Copy link

Copilot AI Jan 20, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Import of 'buddy_test_a_station' is not used.

Copilot uses AI. Check for mistakes.

qcresult = QCresult(
checkname="window_variation",
checksettings=locals().pop('records', None),
Copy link

Copilot AI Jan 20, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Modification of the locals() dictionary will have no effect on the local variables.

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

QC details per check

1 participant

Comments