test(anomaly_scores): Add unit tests for dataProcess and groupComparison for anomaly score feature #173

tonywu1999 · 2025-08-26T20:59:54Z

User description

Checklist Before Requesting a Review

I have read the MSstats contributing guidelines
My changes generate no new warnings
Any dependent changes have been merged and published in downstream modules
I have run the devtools::document() command after my changes and committed the added files

PR Type

Enhancement, Tests

Description

Enforce linear summarization with anomaly scores
Pass input columns into parameter checks
Add unit tests for anomaly-weighted summarization
Add tests for weighted groupComparison modeling

Diagram Walkthrough

flowchart LR
  DP["dataProcess()"] --> CHK[".checkDataProcessParams()"]
  RAW["raw input colnames"] -- passed --> CHK
  CHK -- detects 'AnomalyScores' --> RULE["Require summaryMethod='linear'"]
  RULE -- error if not linear --> DP
  TEST1["test_dataProcess"] -- anomaly scenarios --> DP
  TEST2["test_groupComparison"] -- Variance as weights --> GC["groupComparison()"]
  TEST3["test_utils_groupComparison_model"] -- verify weights --> FIT[".fitModelForGroupComparison"]

File Walkthrough

Relevant files

Enhancement

dataProcess.R `Forward input columns to param validation` R/dataProcess.R Pass `colnames(raw)` to parameter checker Enable anomaly-aware validation in checks	+2/-1
utils_checks.R `Add anomaly-aware summaryMethod validation` R/utils_checks.R Extend `.checkDataProcessParams` signature with `input_columns` Validate `summaryMethod` when `AnomalyScores` present Emit informative error guiding to `linear` summarization	+10/-2

Tests

test_dataProcess.R `Tests for anomaly-driven linear summarization` inst/tinytest/test_dataProcess.R Add dataset with `AnomalyScores` Expect error for non-linear summarization Validate variance behavior with/without anomalies Compare high vs low anomaly variance means	+88/-1
test_groupComparison.R `Tests for weighted groupComparison using Variance` inst/tinytest/test_groupComparison.R Add protein-level inputs with/without `Variance` Compare SE and p-values with anomaly-based weights Ensure unaffected proteins remain consistent	+44/-1
test_utils_groupComparison_model.R `Verify model weights equal inverse variance` inst/tinytest/test_utils_groupComparison_model.R New test for `.fitModelForGroupComparison` Confirm weights applied as `1/Variance` Ensure weights exist when `Variance` provided	+18/-0

…son for anomaly score feature

coderabbitai · 2025-08-26T21:00:00Z

Important

Review skipped

Auto reviews are disabled on base/target branches other than the default branch.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

✨ Finishing Touches

🧪 Generate unit tests

Create PR with unit tests
Post copyable unit tests in a comment
Commit unit tests in branch tony-feature-model_weights

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

github-actions · 2025-08-26T21:00:56Z

PR Reviewer Guide 🔍

Here are some key observations to aid the review process:

⏱️ Estimated effort to review: 3 🔵🔵🔵⚪⚪
🧪 PR contains tests
🔒 No security concerns identified
⚡ Recommended focus areas for review Error Message Clarity The new stop message for AnomalyScores enforces summaryMethod='linear'. Confirm wording and guidance fully align with package documentation and expected user workflow, and that it won’t trigger for columns named similarly or with different casing. if ("AnomalyScores" %in% input_columns) { if (summarization$method != "linear") { stop("AnomalyScores column detected in your input columns. ", "Please set summaryMethod to 'linear' to use anomaly scores for protein summarization, ", "or remove the AnomalyScores column from your input data.") } } Test Robustness Tests compare mean variances across synthetic datasets. Validate that randomness or environment differences can’t cause flaky behavior; consider tighter, deterministic assertions tied to specific proteins or exact expected values if stable. QuantDataLowAnomaly = dataProcess(msstats_input_low_anomaly, normalization=FALSE, MBimpute=TRUE, summaryMethod="linear") high_anomaly_variance = protein_data$Variance low_anomaly_variance = QuantDataLowAnomaly$ProteinLevelData$Variance expect_true(mean(high_anomaly_variance, na.rm = TRUE) > mean(low_anomaly_variance, na.rm = TRUE), info = "Mean variance should be higher for dataset with high anomaly scores") API Change Propagation .checkDataProcessParams gained a new parameter (input_columns). Ensure all call sites across the codebase are updated accordingly and that default behavior remains unchanged when AnomalyScores is absent. .checkDataProcessParams( logTrans, normalization, nameStandards, list(method = featureSubset, n_top = n_top_feature, remove_uninformative = remove_uninformative_feature_outlier), list(method = summaryMethod, equal_var = equalFeatureVar), list(symbol = censoredInt, MB = MBimpute), colnames(raw))

github-actions · 2025-08-26T21:01:18Z

PR Code Suggestions ✨

Explore these optional code suggestions:

Category	Suggestion	Impact
Possible issue	Fix multi-part error message The error message concatenates multiple character strings with commas, which creates a vector and prints only the first element in stop(). Collapse into a single string to ensure the full message is shown. Also remove the double space after the first sentence fragment. R/utils_checks.R [99-105] if ("AnomalyScores" %in% input_columns) { if (summarization$method != "linear") { - stop("AnomalyScores column detected in your input columns. ", - "Please set summaryMethod to 'linear' to use anomaly scores for protein summarization, ", - "or remove the AnomalyScores column from your input data.") + stop(paste0( + "AnomalyScores column detected in your input columns. ", + "Please set summaryMethod to 'linear' to use anomaly scores for protein summarization, ", + "or remove the AnomalyScores column from your input data." + )) } } Suggestion importance[1-10]: 7 __ Why: In R, supplying multiple comma-separated strings to `stop()` creates a character vector and only the first element is used; collapsing with `paste0` ensures the entire message is emitted. This is a correct, localized improvement with modest impact.	Medium
General	Sanitize input column names Passing `colnames(raw)` directly may include `NA` or duplicate names, causing false positives or errors in downstream checks. Sanitize to a unique, non-NA character vector before passing to the validator. R/dataProcess.R [139-140] .checkDataProcessParams( logTrans, normalization, nameStandards, list(method = featureSubset, n_top = n_top_feature, remove_uninformative = remove_uninformative_feature_outlier), list(method = summaryMethod, equal_var = equalFeatureVar), list(symbol = censoredInt, MB = MBimpute), - colnames(raw)) + unique(stats::na.omit(colnames(raw))) +) Suggestion importance[1-10]: 4 __ Why: Making column names unique and non-NA could prevent edge-case issues, but the PR does not indicate such problems and `colnames()` is typically character without NAs; the change is low-risk yet low-impact.	Low

tonywu1999 · 2025-09-03T14:46:42Z

TODO: Add granular unit tests for data summarization function

test(anomaly_scores): Add unit tests for dataProcess and groupCompari…

3a184ca

…son for anomaly score feature

tonywu1999 requested a review from devonjkohler August 26, 2025 20:59

github-actions bot added the Review effort 3/5 label Aug 26, 2025

tonywu1999 added 2 commits September 5, 2025 15:57

add defaults for aft_iterations parameter to prevent Shiny from breaking

c71c16c

Add more granular unit tests for data summarization subfunctions

c140362

devonjkohler approved these changes Sep 9, 2025

View reviewed changes

devonjkohler merged commit dcdfd78 into feature-model_weights Sep 9, 2025
1 check passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

test(anomaly_scores): Add unit tests for dataProcess and groupComparison for anomaly score feature #173

test(anomaly_scores): Add unit tests for dataProcess and groupComparison for anomaly score feature #173

Uh oh!

tonywu1999 commented Aug 26, 2025 •

edited by github-actions bot

Loading

Uh oh!

coderabbitai bot commented Aug 26, 2025 •

edited

Loading

Review skipped

Uh oh!

github-actions bot commented Aug 26, 2025

Uh oh!

github-actions bot commented Aug 26, 2025

Uh oh!

tonywu1999 commented Sep 3, 2025

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

test(anomaly_scores): Add unit tests for dataProcess and groupComparison for anomaly score feature #173

test(anomaly_scores): Add unit tests for dataProcess and groupComparison for anomaly score feature #173

Uh oh!

Conversation

tonywu1999 commented Aug 26, 2025 • edited by github-actions bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

User description

Checklist Before Requesting a Review

PR Type

Description

Diagram Walkthrough

File Walkthrough

Uh oh!

coderabbitai bot commented Aug 26, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Review skipped

Uh oh!

github-actions bot commented Aug 26, 2025

PR Reviewer Guide 🔍

Uh oh!

github-actions bot commented Aug 26, 2025

PR Code Suggestions ✨

Uh oh!

tonywu1999 commented Sep 3, 2025

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

tonywu1999 commented Aug 26, 2025 •

edited by github-actions bot

Loading

coderabbitai bot commented Aug 26, 2025 •

edited

Loading