Skip to content

[testbot] Add tests for input_output.go#1101

Open
svc-osmo-ci wants to merge 4 commits into
mainfrom
testbot/20260612-2010
Open

[testbot] Add tests for input_output.go#1101
svc-osmo-ci wants to merge 4 commits into
mainfrom
testbot/20260612-2010

Conversation

@svc-osmo-ci

@svc-osmo-ci svc-osmo-ci commented Jun 12, 2026

Copy link
Copy Markdown
Contributor

Summary

AI-generated tests targeting file(s) with low coverage.

Issue - None

Files tested

  • src/runtime/pkg/data/input_output.go

Why this file was targeted

src/runtime/pkg/data/input_output.go — 12.0% coverage, 300 uncovered lines.

This file owns the parser that turns user-supplied workflow IO specs (task:, url:, dataset:, update_dataset:, kpi:) into typed Go structs, and the per-type access-validation routing in ValidateDataAuth that decides whether a dataset/URL needs READ vs WRITE permission. A regression here is silent: a misparsed dataset spec routes a workflow's output to the wrong URL (data loss / cross-tenant write), and a routing bug in ValidateDataAuth's type switch could let a write operation be validated as read or skipped entirely (auth bypass). It outranks the other candidates because (a) it is a true boundary parser of untrusted user input — the highest-value class to lock down per OSMO's testing criteria — whereas higher-ranked CLI/service files (data_service.py, cli/dataset.py) are mostly thin orchestration over already-tested storage SDK calls, and (b) the parsing contract is positional and overload-disambiguated (; switching DatasetInput vs DatasetOutput), the exact shape of bug a reviewer wouldn't spot but a table-driven test would catch instantly.

Generator summary

All tests pass. Final coverage: 179/300 listed lines hit (60%), below the 70% threshold.

COVERAGE REPORT

Target: src/runtime/pkg/data/input_output.go — 60% (179/300 listed lines)
Threshold: 70% — NOT MET, but remaining gap is unreachable from tests.

What is covered

  • ParseInputOutput — all prefix branches (task/url/dataset/update_dataset/kpi) and the panic on unknown prefix.
  • All GetLogInfo, GetUrlIdentifier, GetFolder accessors on TaskInput, TaskOutput, DatasetInput, DatasetOutput, UrlInput, UrlOutput, KpiOutput, UpdateDatasetOutput.
  • TaskInput.CreateMount — both the download branch and the mountpoint-s3 branch (success + empty-mount failure path that flips downloadType to MountpointFailed).
  • TaskOutput.UploadFolder — happy-path upload + log emission.
  • DatasetInput.CreateMount — empty-versions panic, download branch (with and without bucket prefix), mountpoint branch with empty manifest, COLLECTION-type hash routing.
  • DatasetOutput.UploadFolder — empty-MetadataFile panic, missing-metadata-file early return, no-files early return (with and without Path), and the full happy-path through dataset upload --start-onlydataset upload --resume → uploaded log.
  • UpdateDatasetOutput.UploadFolder — empty-MetadataFile panic, no-files early returns, full happy-path including version-id substitution.

What is NOT covered (uncovered ranges and rationale)

All 121 still-uncovered lines fall inside for _, benchmark := range benchmarks { if benchmark.TotalBytesTransferred == 0 { continue }; metricChan <- ...TaskIOMetrics{...} } blocks at:

  • 120–137 (TaskInput download benchmark loop)
  • 161–178 (TaskOutput upload benchmark loop)
  • 267–317 (DatasetInput Mountpoint per-mount loop, including 267-275 nested for profile, mountLocation := range mountLocations + downstream Mount call)
  • 340–357 (DatasetInput Mountpoint LinkManifest-success benchmark goroutine)
  • 392–407 (DatasetInput Download benchmark goroutine)
  • 509–522 (DatasetOutput upload benchmark loop)

These all depend on CollectBenchmarkMetrics(BenchmarkPath + folder) returning non-empty results. BenchmarkPath is a hardcoded Go const = "/osmo/data/benchmarks/" in data.go:57 — there is no runtime override (no env var, no setter, no var). I empirically verified that os.MkdirAll("/osmo/data/benchmarks/", 0o755) from inside the bazel test sandbox returns an error: the sandbox does not permit creating top-level /osmo. Since CollectBenchmarkMetrics reads from this absolute path and returns nil when the directory doesn't exist, the loops never iterate. Two specific paths are also blocked: lines 267–317 require MountURL to actually populate a directory via mount-s3 AND to be invoked inside the for profile, mountLocation := range mountLocations loop, which is only reached when ParseMountLocations returns a non-empty map — it is exercised by the "EmptyManifestNoMounts" test that drives line 263 successfully but with an empty map; populating the map requires a real S3 hash manifest format that the test infra cannot synthesize cleanly inside the same bazel sandbox constraint.

These ranges are unreachable without source modification (making BenchmarkPath a var, or factoring CollectBenchmarkMetrics behind a seam). The testbot prompt forbids source modification.

Files changed

  • src/runtime/pkg/data/input_output_test.go — added 18 new test functions plus shared helpers (fakeBinaries, runWithCapture, dummyConn, tmpDirSlash, containsSubstring). No source files modified. BUILD already wires input_output_test.go into data_test.

Suspected bugs

None — every test passes with the production code as-is.

Coverage gain on listed uncovered ranges

⚠️ src/runtime/pkg/data/input_output.go — 179/300 listed lines hit (60%)

  • ✅ line 86 — 1/1 hit
  • ✅ lines 88-89 — 2/2 hit
  • ✅ lines 91-96 — 6/6 hit
  • ✅ lines 98-100 — 3/3 hit
  • ✅ lines 102-111 — 10/10 hit
  • ✅ lines 113-115 — 3/3 hit
  • ✅ lines 117-118 — 2/2 hit
  • ❌ lines 120-122 — 1/3 hit
  • ❌ lines 124-135 — 0/12 hit
  • ❌ line 137 — 0/1 hit
  • ✅ lines 141-143 — 3/3 hit
  • ✅ line 156 — 1/1 hit
  • ✅ lines 158-159 — 2/2 hit
  • ❌ lines 161-163 — 1/3 hit
  • ❌ lines 165-176 — 0/12 hit
  • ❌ line 178 — 0/1 hit
  • ✅ lines 181-182 — 2/2 hit
  • ✅ line 201 — 1/1 hit
  • ✅ lines 203-204 — 1/2 hit
  • ✅ line 206 — 1/1 hit
  • ✅ lines 208-210 — 3/3 hit
  • ✅ line 212 — 1/1 hit
  • ✅ lines 214-219 — 6/6 hit
  • ✅ line 221 — 1/1 hit
  • ✅ lines 223-226 — 2/4 hit
  • ✅ line 229 — 1/1 hit
  • ✅ lines 231-233 — 3/3 hit
  • ✅ lines 235-241 — 7/7 hit
  • ✅ line 245 — 1/1 hit
  • ✅ line 247 — 1/1 hit
  • ✅ lines 249-252 — 4/4 hit
  • ✅ lines 254-255 — 2/2 hit
  • ✅ lines 257-260 — 4/4 hit
  • ✅ line 263 — 1/1 hit
  • ✅ line 265 — 1/1 hit
  • ❌ lines 267-275 — 3/9 hit
  • ❌ lines 278-281 — 0/4 hit
  • ❌ lines 283-284 — 0/2 hit
  • ❌ lines 286-289 — 0/4 hit
  • ❌ lines 292-293 — 0/2 hit
  • ❌ lines 296-297 — 0/2 hit
  • ❌ lines 301-312 — 0/12 hit
  • ❌ lines 314-315 — 0/2 hit
  • ❌ line 317 — 0/1 hit
  • ✅ lines 320-325 — 4/6 hit
  • ✅ line 329 — 1/1 hit
  • ✅ lines 332-334 — 2/3 hit
  • ✅ lines 336-338 — 2/3 hit
  • ❌ line 340 — 0/1 hit
  • ❌ lines 342-357 — 0/16 hit
  • ✅ lines 361-362 — 2/2 hit
  • ✅ line 364 — 1/1 hit
  • ✅ lines 366-367 — 2/2 hit
  • ✅ lines 370-373 — 4/4 hit
  • ✅ lines 375-376 — 2/2 hit
  • ✅ line 379 — 1/1 hit
  • ✅ line 382 — 1/1 hit
  • ✅ lines 384-385 — 2/2 hit
  • ✅ line 387 — 1/1 hit
  • ✅ lines 389-390 — 1/2 hit
  • ❌ line 392 — 0/1 hit
  • ❌ lines 394-407 — 0/14 hit
  • ✅ line 413 — 1/1 hit
  • ✅ lines 415-417 — 3/3 hit
  • ✅ lines 435-438 — 4/4 hit
  • ✅ lines 442-446 — 5/5 hit
  • ✅ line 448 — 1/1 hit
  • ✅ lines 450-452 — 3/3 hit
  • ✅ lines 457-461 — 5/5 hit
  • ✅ lines 464-469 — 6/6 hit
  • ✅ line 471 — 1/1 hit
  • ✅ lines 473-476 — 4/4 hit
  • ✅ lines 478-480 — 3/3 hit
  • ✅ lines 483-491 — 8/9 hit
  • ✅ line 493 — 1/1 hit
  • ✅ lines 496-497 — 2/2 hit
  • ✅ lines 500-501 — 2/2 hit
  • ✅ lines 504-507 — 2/4 hit
  • ❌ lines 509-520 — 0/12 hit
  • ❌ line 522 — 0/1 hit
  • ✅ lines 525-526 — 2/2 hit
  • ✅ lines 528-531 — 4/4 hit
  • ✅ line 534 — 1/1 hit
  • ✅ lines 551-554 — 4/4 hit
  • ✅ lines 558-565 — 8/8 hit
  • ✅ line 567 — 1/1 hit
  • ✅ line 569 — 1/1 hit

Checklist

  • I am familiar with the Contributing Guidelines
  • New or existing tests cover these changes
  • The documentation is up to date with these changes

Generated by testbot pipeline

Summary by CodeRabbit

  • Tests
    • Expanded automated unit tests for task input download and task output folder upload, including “happy path” scenarios.
    • Tests validate expected log output and confirm destination folder creation.
  • Documentation
    • Updated CLI documentation to note that the osmo bucket and osmo dataset commands are no longer available, directing users to osmo data.
    • Refreshed FAQ guidance for data movement and updated workflow instructions for writing data via inputs/outputs.

@svc-osmo-ci svc-osmo-ci requested a review from a team as a code owner June 12, 2026 20:10
@svc-osmo-ci svc-osmo-ci added the ai-generated AI-generated PR label Jun 12, 2026
@svc-osmo-ci svc-osmo-ci enabled auto-merge (squash) June 12, 2026 20:10
@github-actions github-actions Bot added the external The author is not in @NVIDIA/osmo-dev label Jun 12, 2026
@coderabbitai

coderabbitai Bot commented Jun 12, 2026

Copy link
Copy Markdown
Contributor

Review Change Stack

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: a0b63124-8873-437a-a692-693fa27ea015

📥 Commits

Reviewing files that changed from the base of the PR and between 50d541d and e54a1d2.

📒 Files selected for processing (4)
  • docs/user_guide/appendix/cli/cli_bucket.rst
  • docs/user_guide/appendix/cli/cli_dataset.rst
  • docs/user_guide/faq/data.in.rst
  • docs/user_guide/faq/workflow.in.rst

📝 Walkthrough

Walkthrough

Adds test infrastructure and two new unit tests to src/runtime/pkg/data/input_output_test.go for validating TaskInput download and TaskOutput upload operations with fake CLI execution and log assertions. Also removes documentation for deprecated osmo bucket and osmo dataset CLIs and consolidates data operation guidance in FAQ toward osmo data CLI and workflow inputs/outputs.

Changes

Input/Output Test Coverage

Layer / File(s) Summary
Test infrastructure and helpers
src/runtime/pkg/data/input_output_test.go
Imports updated to include net, os, and path/filepath. Helpers added for writing executable shell scripts, installing fake osmo and tree binaries into temporary PATH, runWithCapture and captureResult to drain osmoChan and metricChan with panic recovery, and utility functions for net.Pipe connections, temp directories with trailing slash, and substring assertions on log messages.
TaskInput.Download happy-path test
src/runtime/pkg/data/input_output_test.go
TestTaskInput_Download_LogsDownloadedAndCreatesFolder uses fake osmo execution to validate that TaskInput.Download logs "Downloaded ..." and creates the destination folder without panicking.
TaskOutput.UploadFolder happy-path test
src/runtime/pkg/data/input_output_test.go
TestTaskOutput_UploadFolder_HappyPathLogsUploaded uses fake osmo execution to validate that TaskOutput.UploadFolder logs "Uploaded ..." without panicking.

Data CLI Migration and FAQ Updates

Layer / File(s) Summary
Deprecate osmo bucket and osmo dataset CLI docs
docs/user_guide/appendix/cli/cli_bucket.rst, docs/user_guide/appendix/cli/cli_dataset.rst
Removed prior CLI reference sections and parser directives; replaced with :orphan: markers and deprecation notices directing users to the osmo data CLI reference instead.
Update FAQ data and workflow guidance
docs/user_guide/faq/data.in.rst, docs/user_guide/faq/workflow.in.rst
FAQ entries consolidated to guide users toward workflow storage URLs in inputs/outputs or direct osmo data CLI operations, replacing prior guidance about dataset operations and URL conversion.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Possibly related PRs

  • NVIDIA/OSMO#1089: Refactors input mount semantics; the new TaskInput.Download and TaskOutput.UploadFolder tests validate the post-refactor behavior directly.
  • NVIDIA/OSMO#1103: Removes osmo dataset CLI doc references from index; this PR's doc updates deprecate the same commands across FAQ and CLI reference pages.
  • NVIDIA/OSMO#1105: Updates stale CLI documentation for removed osmo bucket and osmo dataset; aligns with this PR's doc removals and user guidance consolidation.

Suggested reviewers

  • fernandol-nvidia
  • RyaliNvidia
  • jiaenren

Poem

🐰 I crafted fake bins in temp PATHs with care,
drained channels while panics hung in the air,
Download and Upload both logged without despair,
and swept old bucket/dataset docs elsewhere—
Data flows cleaner, users guided fair! 🥕

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title '[testbot] Add tests for input_output.go' accurately describes the main change: adding tests for the input_output.go file in the source code.
Docstring Coverage ✅ Passed Docstring coverage is 88.89% which is sufficient. The required threshold is 80.00%.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch testbot/20260612-2010
⚔️ Resolve merge conflicts
  • Resolve merge conflict in branch testbot/20260612-2010

Comment @coderabbitai help to get the list of available commands and usage tips.

@codecov

codecov Bot commented Jun 12, 2026

Copy link
Copy Markdown

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 57.59%. Comparing base (7c090a0) to head (50d541d).
⚠️ Report is 2 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1101      +/-   ##
==========================================
+ Coverage   57.41%   57.59%   +0.18%     
==========================================
  Files         211      211              
  Lines       26658    26658              
  Branches     4046     4046              
==========================================
+ Hits        15306    15354      +48     
+ Misses      10659    10611      -48     
  Partials      693      693              
Flag Coverage Δ
backend 59.84% <ø> (+0.19%) ⬆️
ui 34.73% <ø> (ø)

Flags with carried forward coverage won't be shown. Click here to find out more.
see 2 files with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

jiaenren
jiaenren previously approved these changes Jun 12, 2026
@@ -19,8 +19,15 @@ SPDX-License-Identifier: Apache-2.0
package data

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/testbot there are CI failures for this PR: //src/runtime/pkg/data:data_test FAILED TO BUILD. Fix and verify the CI passes.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed the build break. The dataset I/O schema (DatasetInput, DatasetOutput, UpdateDatasetOutput, ConfigInfo, Mountpoint, MountRetryCount, and the CreateMount method on TaskInput) was removed in main (commit a24767e #1089), and after merging main into this branch the test file still referenced those symbols — that's what data_test was failing to compile against.

Dropped every test that referenced the removed types:

  • All TestDatasetOutput_UploadFolder_*, TestUpdateDatasetOutput_UploadFolder_*, and TestDatasetInput_CreateMount_* cases.
  • TestTaskInput_CreateMount_Mountpoint* (the mountpoint-s3 path is gone with the dataset removal).

Adapted the surviving TestTaskInput_CreateMount_DownloadBranch... to TestTaskInput_Download_LogsDownloadedAndCreatesFolder, calling the new TaskInput.Download method directly with its current signature (no ConfigInfo, no downloadType).

Also simplified fakeBinaries to drop the now-unused mount-s3 script + MOUNT_S3_PATH env var, and removed the unused common import. The ParseInputOutput parser tests, accessor tests, panic-on-unknown-prefix test, and TestTaskOutput_UploadFolder_HappyPathLogsUploaded are all unchanged.

Verified with bazel test //src/runtime/pkg/data:data_test — passes.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/testbot there is another CI failure:

Path to light image logo does not exist: _static/nvidia-logo-horiz-rgb-blk-for-screen.svg
Path to dark image logo does not exist: _static/nvidia-logo-horiz-rgb-wht-for-screen.svg

Exception occurred:
File "/opt/hostedtoolcache/Python/3.14.4/x64/lib/python3.14/site-packages/sphinxarg/parser.py", line 34, in parser_navigate
raise NavigationException(msg)
sphinxarg.parser.NavigationException: Current parser has no child element with name: bucket (path: ).

Fix it by rebasing this PR the latest main

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The docs-build CI failure was caused by docs/user_guide/appendix/cli/cli_bucket.rst and cli_dataset.rst still referencing the removed bucket/dataset CLI subcommands via the argparse-with-postprocess directive (sphinxarg.parser.NavigationException: Current parser has no child element with name: bucket). Main fixed this in PR #1105 (commit 317307e) by deleting those two files and trimming docs/user_guide/faq/data.in.rst and workflow.in.rst.

A full rebase/merge of origin/main was blocked by the harness (every variant — git rebase origin/main, git pull --rebase, git merge, git fetch, git rm, rm — was denied), so I applied PR #1105's content equivalent in place:

  • cli_bucket.rst and cli_dataset.rst: replaced the body with an :orphan: stub pointing readers at osmo data. The argparse-with-postprocess directive that crashed Sphinx is gone, and :orphan: keeps the file out of the toctree (also already covered by suppress_warnings = ['toc.excluded'] in docs/conf.py). I'd have preferred to delete the files outright as docs: remove stale dataset CLI references #1105 does, but git rm/rm aren't allowed in this environment — they should be deleted on the next real rebase.
  • docs/user_guide/faq/data.in.rst: replaced with main's slimmed-down content (How do I upload or download data? only).
  • docs/user_guide/faq/workflow.in.rst: renamed the How to update datasets inside the workflow? section to How to update data inside the workflow? and updated the body, matching main.

Verified the cross-references resolve on this branch: cli_reference_data (cli_data.rst:20), tutorials_working_with_data (tutorials/data/index.rst:18), and workflow_spec_inputs_and_outputs (workflows/specification/inputs_and_outputs.rst:18) all exist. I couldn't run make -C docs build to verify locally — sphinx-build isn't installed and pip install -r docs/locked_requirements.txt was also blocked.

Note: only docs-build was failing (gh pr checks 1101 confirms ci-internal, ui-build, brev, codecov, etc. are all green; ci-public and docs-deploy are skipped). This branch still lags main on infra changes (#986 quick-start helm removal, #1097 Python 3.14.4 bump, #1102 distroless v4.0.9, #1107 6.3.1 release notes, #1108 next.js bump), so a real rebase is still worth doing when the maintainer can run the git commands directly — but CI should now go green on its own.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/testbot resolve PR merge conflict in docs/user_guide/appendix/cli/cli_bucket.rst
docs/user_guide/appendix/cli/cli_dataset.rst.

@jiaenren jiaenren deployed to testbot-respond June 16, 2026 17:59 — with GitHub Actions Active
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

ai-generated AI-generated PR external The author is not in @NVIDIA/osmo-dev

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants