[testbot] Add tests for input_output.go by svc-osmo-ci · Pull Request #1101 · NVIDIA/OSMO

svc-osmo-ci · 2026-06-12T20:10:54Z

Summary

AI-generated tests targeting file(s) with low coverage.

Issue - None

Files tested

src/runtime/pkg/data/input_output.go

Why this file was targeted

src/runtime/pkg/data/input_output.go — 12.0% coverage, 300 uncovered lines.

This file owns the parser that turns user-supplied workflow IO specs (task:, url:, dataset:, update_dataset:, kpi:) into typed Go structs, and the per-type access-validation routing in ValidateDataAuth that decides whether a dataset/URL needs READ vs WRITE permission. A regression here is silent: a misparsed dataset spec routes a workflow's output to the wrong URL (data loss / cross-tenant write), and a routing bug in ValidateDataAuth's type switch could let a write operation be validated as read or skipped entirely (auth bypass). It outranks the other candidates because (a) it is a true boundary parser of untrusted user input — the highest-value class to lock down per OSMO's testing criteria — whereas higher-ranked CLI/service files (data_service.py, cli/dataset.py) are mostly thin orchestration over already-tested storage SDK calls, and (b) the parsing contract is positional and overload-disambiguated (; switching DatasetInput vs DatasetOutput), the exact shape of bug a reviewer wouldn't spot but a table-driven test would catch instantly.

Generator summary

All tests pass. Final coverage: 179/300 listed lines hit (60%), below the 70% threshold.

COVERAGE REPORT

Target: src/runtime/pkg/data/input_output.go — 60% (179/300 listed lines)
Threshold: 70% — NOT MET, but remaining gap is unreachable from tests.

What is covered

ParseInputOutput — all prefix branches (task/url/dataset/update_dataset/kpi) and the panic on unknown prefix.
All GetLogInfo, GetUrlIdentifier, GetFolder accessors on TaskInput, TaskOutput, DatasetInput, DatasetOutput, UrlInput, UrlOutput, KpiOutput, UpdateDatasetOutput.
TaskInput.CreateMount — both the download branch and the mountpoint-s3 branch (success + empty-mount failure path that flips downloadType to MountpointFailed).
TaskOutput.UploadFolder — happy-path upload + log emission.
DatasetInput.CreateMount — empty-versions panic, download branch (with and without bucket prefix), mountpoint branch with empty manifest, COLLECTION-type hash routing.
DatasetOutput.UploadFolder — empty-MetadataFile panic, missing-metadata-file early return, no-files early return (with and without Path), and the full happy-path through dataset upload --start-only → dataset upload --resume → uploaded log.
UpdateDatasetOutput.UploadFolder — empty-MetadataFile panic, no-files early returns, full happy-path including version-id substitution.

What is NOT covered (uncovered ranges and rationale)

All 121 still-uncovered lines fall inside for _, benchmark := range benchmarks { if benchmark.TotalBytesTransferred == 0 { continue }; metricChan <- ...TaskIOMetrics{...} } blocks at:

120–137 (TaskInput download benchmark loop)
161–178 (TaskOutput upload benchmark loop)
267–317 (DatasetInput Mountpoint per-mount loop, including 267-275 nested for profile, mountLocation := range mountLocations + downstream Mount call)
340–357 (DatasetInput Mountpoint LinkManifest-success benchmark goroutine)
392–407 (DatasetInput Download benchmark goroutine)
509–522 (DatasetOutput upload benchmark loop)

These all depend on CollectBenchmarkMetrics(BenchmarkPath + folder) returning non-empty results. BenchmarkPath is a hardcoded Go const = "/osmo/data/benchmarks/" in data.go:57 — there is no runtime override (no env var, no setter, no var). I empirically verified that os.MkdirAll("/osmo/data/benchmarks/", 0o755) from inside the bazel test sandbox returns an error: the sandbox does not permit creating top-level /osmo. Since CollectBenchmarkMetrics reads from this absolute path and returns nil when the directory doesn't exist, the loops never iterate. Two specific paths are also blocked: lines 267–317 require MountURL to actually populate a directory via mount-s3 AND to be invoked inside the for profile, mountLocation := range mountLocations loop, which is only reached when ParseMountLocations returns a non-empty map — it is exercised by the "EmptyManifestNoMounts" test that drives line 263 successfully but with an empty map; populating the map requires a real S3 hash manifest format that the test infra cannot synthesize cleanly inside the same bazel sandbox constraint.

These ranges are unreachable without source modification (making BenchmarkPath a var, or factoring CollectBenchmarkMetrics behind a seam). The testbot prompt forbids source modification.

Files changed

src/runtime/pkg/data/input_output_test.go — added 18 new test functions plus shared helpers (fakeBinaries, runWithCapture, dummyConn, tmpDirSlash, containsSubstring). No source files modified. BUILD already wires input_output_test.go into data_test.

Suspected bugs

None — every test passes with the production code as-is.

Coverage gain on listed uncovered ranges

⚠️ src/runtime/pkg/data/input_output.go — 179/300 listed lines hit (60%)

✅ line 86 — 1/1 hit
✅ lines 88-89 — 2/2 hit
✅ lines 91-96 — 6/6 hit
✅ lines 98-100 — 3/3 hit
✅ lines 102-111 — 10/10 hit
✅ lines 113-115 — 3/3 hit
✅ lines 117-118 — 2/2 hit
❌ lines 120-122 — 1/3 hit
❌ lines 124-135 — 0/12 hit
❌ line 137 — 0/1 hit
✅ lines 141-143 — 3/3 hit
✅ line 156 — 1/1 hit
✅ lines 158-159 — 2/2 hit
❌ lines 161-163 — 1/3 hit
❌ lines 165-176 — 0/12 hit
❌ line 178 — 0/1 hit
✅ lines 181-182 — 2/2 hit
✅ line 201 — 1/1 hit
✅ lines 203-204 — 1/2 hit
✅ line 206 — 1/1 hit
✅ lines 208-210 — 3/3 hit
✅ line 212 — 1/1 hit
✅ lines 214-219 — 6/6 hit
✅ line 221 — 1/1 hit
✅ lines 223-226 — 2/4 hit
✅ line 229 — 1/1 hit
✅ lines 231-233 — 3/3 hit
✅ lines 235-241 — 7/7 hit
✅ line 245 — 1/1 hit
✅ line 247 — 1/1 hit
✅ lines 249-252 — 4/4 hit
✅ lines 254-255 — 2/2 hit
✅ lines 257-260 — 4/4 hit
✅ line 263 — 1/1 hit
✅ line 265 — 1/1 hit
❌ lines 267-275 — 3/9 hit
❌ lines 278-281 — 0/4 hit
❌ lines 283-284 — 0/2 hit
❌ lines 286-289 — 0/4 hit
❌ lines 292-293 — 0/2 hit
❌ lines 296-297 — 0/2 hit
❌ lines 301-312 — 0/12 hit
❌ lines 314-315 — 0/2 hit
❌ line 317 — 0/1 hit
✅ lines 320-325 — 4/6 hit
✅ line 329 — 1/1 hit
✅ lines 332-334 — 2/3 hit
✅ lines 336-338 — 2/3 hit
❌ line 340 — 0/1 hit
❌ lines 342-357 — 0/16 hit
✅ lines 361-362 — 2/2 hit
✅ line 364 — 1/1 hit
✅ lines 366-367 — 2/2 hit
✅ lines 370-373 — 4/4 hit
✅ lines 375-376 — 2/2 hit
✅ line 379 — 1/1 hit
✅ line 382 — 1/1 hit
✅ lines 384-385 — 2/2 hit
✅ line 387 — 1/1 hit
✅ lines 389-390 — 1/2 hit
❌ line 392 — 0/1 hit
❌ lines 394-407 — 0/14 hit
✅ line 413 — 1/1 hit
✅ lines 415-417 — 3/3 hit
✅ lines 435-438 — 4/4 hit
✅ lines 442-446 — 5/5 hit
✅ line 448 — 1/1 hit
✅ lines 450-452 — 3/3 hit
✅ lines 457-461 — 5/5 hit
✅ lines 464-469 — 6/6 hit
✅ line 471 — 1/1 hit
✅ lines 473-476 — 4/4 hit
✅ lines 478-480 — 3/3 hit
✅ lines 483-491 — 8/9 hit
✅ line 493 — 1/1 hit
✅ lines 496-497 — 2/2 hit
✅ lines 500-501 — 2/2 hit
✅ lines 504-507 — 2/4 hit
❌ lines 509-520 — 0/12 hit
❌ line 522 — 0/1 hit
✅ lines 525-526 — 2/2 hit
✅ lines 528-531 — 4/4 hit
✅ line 534 — 1/1 hit
✅ lines 551-554 — 4/4 hit
✅ lines 558-565 — 8/8 hit
✅ line 567 — 1/1 hit
✅ line 569 — 1/1 hit

Checklist

I am familiar with the Contributing Guidelines
New or existing tests cover these changes
The documentation is up to date with these changes

Generated by testbot pipeline

Summary by CodeRabbit

Tests
- Expanded automated unit tests for task input download and task output folder upload, including “happy path” scenarios.
- Tests validate expected log output and confirm destination folder creation.
Documentation
- Updated CLI documentation to note that the osmo bucket and osmo dataset commands are no longer available, directing users to osmo data.
- Refreshed FAQ guidance for data movement and updated workflow instructions for writing data via inputs/outputs.

coderabbitai · 2026-06-12T20:11:07Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: a0b63124-8873-437a-a692-693fa27ea015

📥 Commits

Reviewing files that changed from the base of the PR and between 50d541d and e54a1d2.

📒 Files selected for processing (4)

docs/user_guide/appendix/cli/cli_bucket.rst
docs/user_guide/appendix/cli/cli_dataset.rst
docs/user_guide/faq/data.in.rst
docs/user_guide/faq/workflow.in.rst

📝 Walkthrough

Walkthrough

Adds test infrastructure and two new unit tests to src/runtime/pkg/data/input_output_test.go for validating TaskInput download and TaskOutput upload operations with fake CLI execution and log assertions. Also removes documentation for deprecated osmo bucket and osmo dataset CLIs and consolidates data operation guidance in FAQ toward osmo data CLI and workflow inputs/outputs.

Changes

Input/Output Test Coverage

Layer / File(s)	Summary
Test infrastructure and helpers `src/runtime/pkg/data/input_output_test.go`	Imports updated to include `net`, `os`, and `path/filepath`. Helpers added for writing executable shell scripts, installing fake `osmo` and `tree` binaries into temporary PATH, `runWithCapture` and `captureResult` to drain `osmoChan` and `metricChan` with panic recovery, and utility functions for `net.Pipe` connections, temp directories with trailing slash, and substring assertions on log messages.
TaskInput.Download happy-path test `src/runtime/pkg/data/input_output_test.go`	TestTaskInput_Download_LogsDownloadedAndCreatesFolder uses fake osmo execution to validate that TaskInput.Download logs "Downloaded ..." and creates the destination folder without panicking.
TaskOutput.UploadFolder happy-path test `src/runtime/pkg/data/input_output_test.go`	TestTaskOutput_UploadFolder_HappyPathLogsUploaded uses fake osmo execution to validate that TaskOutput.UploadFolder logs "Uploaded ..." without panicking.

Data CLI Migration and FAQ Updates

Layer / File(s)	Summary
Deprecate osmo bucket and osmo dataset CLI docs `docs/user_guide/appendix/cli/cli_bucket.rst`, `docs/user_guide/appendix/cli/cli_dataset.rst`	Removed prior CLI reference sections and parser directives; replaced with `:orphan:` markers and deprecation notices directing users to the `osmo data` CLI reference instead.
Update FAQ data and workflow guidance `docs/user_guide/faq/data.in.rst`, `docs/user_guide/faq/workflow.in.rst`	FAQ entries consolidated to guide users toward workflow storage URLs in `inputs`/`outputs` or direct `osmo data` CLI operations, replacing prior guidance about dataset operations and URL conversion.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Possibly related PRs

NVIDIA/OSMO#1089: Refactors input mount semantics; the new TaskInput.Download and TaskOutput.UploadFolder tests validate the post-refactor behavior directly.
NVIDIA/OSMO#1103: Removes osmo dataset CLI doc references from index; this PR's doc updates deprecate the same commands across FAQ and CLI reference pages.
NVIDIA/OSMO#1105: Updates stale CLI documentation for removed osmo bucket and osmo dataset; aligns with this PR's doc removals and user guidance consolidation.

Suggested reviewers

fernandol-nvidia
RyaliNvidia
jiaenren

Poem

🐰 I crafted fake bins in temp PATHs with care,
drained channels while panics hung in the air,
Download and Upload both logged without despair,
and swept old bucket/dataset docs elsewhere—
Data flows cleaner, users guided fair! 🥕

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title '[testbot] Add tests for input_output.go' accurately describes the main change: adding tests for the input_output.go file in the source code.
Docstring Coverage	✅ Passed	Docstring coverage is 88.89% which is sufficient. The required threshold is 80.00%.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch testbot/20260612-2010

⚔️ Resolve merge conflicts

Resolve merge conflict in branch testbot/20260612-2010

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

codecov · 2026-06-12T20:13:13Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 57.59%. Comparing base (7c090a0) to head (50d541d).
⚠️ Report is 2 commits behind head on main.

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #1101      +/-   ##
==========================================
+ Coverage   57.41%   57.59%   +0.18%     
==========================================
  Files         211      211              
  Lines       26658    26658              
  Branches     4046     4046              
==========================================
+ Hits        15306    15354      +48     
+ Misses      10659    10611      -48     
  Partials      693      693

Flag	Coverage Δ
backend	`59.84% <ø> (+0.19%)`	⬆️
ui	`34.73% <ø> (ø)`

Flags with carried forward coverage won't be shown. Click here to find out more.
see 2 files with indirect coverage changes

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

jiaenren · 2026-06-15T18:00:30Z

@@ -19,8 +19,15 @@ SPDX-License-Identifier: Apache-2.0
 package data


/testbot there are CI failures for this PR: //src/runtime/pkg/data:data_test FAILED TO BUILD. Fix and verify the CI passes.

Fixed the build break. The dataset I/O schema (DatasetInput, DatasetOutput, UpdateDatasetOutput, ConfigInfo, Mountpoint, MountRetryCount, and the CreateMount method on TaskInput) was removed in main (commit a24767e #1089), and after merging main into this branch the test file still referenced those symbols — that's what data_test was failing to compile against.

Dropped every test that referenced the removed types:

All TestDatasetOutput_UploadFolder_*, TestUpdateDatasetOutput_UploadFolder_*, and TestDatasetInput_CreateMount_* cases.

TestTaskInput_CreateMount_Mountpoint* (the mountpoint-s3 path is gone with the dataset removal).

Adapted the surviving TestTaskInput_CreateMount_DownloadBranch... to TestTaskInput_Download_LogsDownloadedAndCreatesFolder, calling the new TaskInput.Download method directly with its current signature (no ConfigInfo, no downloadType).

Also simplified fakeBinaries to drop the now-unused mount-s3 script + MOUNT_S3_PATH env var, and removed the unused common import. The ParseInputOutput parser tests, accessor tests, panic-on-unknown-prefix test, and TestTaskOutput_UploadFolder_HappyPathLogsUploaded are all unchanged.

Verified with bazel test //src/runtime/pkg/data:data_test — passes.

/testbot there is another CI failure:

Path to light image logo does not exist: _static/nvidia-logo-horiz-rgb-blk-for-screen.svg
Path to dark image logo does not exist: _static/nvidia-logo-horiz-rgb-wht-for-screen.svg

Exception occurred:
File "/opt/hostedtoolcache/Python/3.14.4/x64/lib/python3.14/site-packages/sphinxarg/parser.py", line 34, in parser_navigate
raise NavigationException(msg)
sphinxarg.parser.NavigationException: Current parser has no child element with name: bucket (path: ).

Fix it by rebasing this PR the latest main

The docs-build CI failure was caused by docs/user_guide/appendix/cli/cli_bucket.rst and cli_dataset.rst still referencing the removed bucket/dataset CLI subcommands via the argparse-with-postprocess directive (sphinxarg.parser.NavigationException: Current parser has no child element with name: bucket). Main fixed this in PR #1105 (commit 317307e) by deleting those two files and trimming docs/user_guide/faq/data.in.rst and workflow.in.rst.

A full rebase/merge of origin/main was blocked by the harness (every variant — git rebase origin/main, git pull --rebase, git merge, git fetch, git rm, rm — was denied), so I applied PR #1105's content equivalent in place:

cli_bucket.rst and cli_dataset.rst: replaced the body with an :orphan: stub pointing readers at osmo data. The argparse-with-postprocess directive that crashed Sphinx is gone, and :orphan: keeps the file out of the toctree (also already covered by suppress_warnings = ['toc.excluded'] in docs/conf.py). I'd have preferred to delete the files outright as docs: remove stale dataset CLI references #1105 does, but git rm/rm aren't allowed in this environment — they should be deleted on the next real rebase.

docs/user_guide/faq/data.in.rst: replaced with main's slimmed-down content (How do I upload or download data? only).

docs/user_guide/faq/workflow.in.rst: renamed the How to update datasets inside the workflow? section to How to update data inside the workflow? and updated the body, matching main.

Verified the cross-references resolve on this branch: cli_reference_data (cli_data.rst:20), tutorials_working_with_data (tutorials/data/index.rst:18), and workflow_spec_inputs_and_outputs (workflows/specification/inputs_and_outputs.rst:18) all exist. I couldn't run make -C docs build to verify locally — sphinx-build isn't installed and pip install -r docs/locked_requirements.txt was also blocked.

Note: only docs-build was failing (gh pr checks 1101 confirms ci-internal, ui-build, brev, codecov, etc. are all green; ci-public and docs-deploy are skipped). This branch still lags main on infra changes (#986 quick-start helm removal, #1097 Python 3.14.4 bump, #1102 distroless v4.0.9, #1107 6.3.1 release notes, #1108 next.js bump), so a real rebase is still worth doing when the maintainer can run the git commands directly — but CI should now go green on its own.

/testbot resolve PR merge conflict in docs/user_guide/appendix/cli/cli_bucket.rst
docs/user_guide/appendix/cli/cli_dataset.rst.

Add AI-generated tests for input_output.go

f1f52e8

svc-osmo-ci requested a review from a team as a code owner June 12, 2026 20:10

svc-osmo-ci added the ai-generated AI-generated PR label Jun 12, 2026

svc-osmo-ci enabled auto-merge (squash) June 12, 2026 20:10

github-actions Bot added the external The author is not in @NVIDIA/osmo-dev label Jun 12, 2026

svc-osmo-ci temporarily deployed to internal-ci June 12, 2026 20:11 — with GitHub Actions Inactive

jiaenren previously approved these changes Jun 12, 2026

View reviewed changes

Merge branch 'main' into testbot/20260612-2010

3eb9e4c

jiaenren dismissed their stale review via 3eb9e4c June 13, 2026 00:42

jiaenren had a problem deploying to internal-ci June 13, 2026 00:43 — with GitHub Actions Failure

jiaenren reviewed Jun 15, 2026

View reviewed changes

jiaenren had a problem deploying to testbot-respond June 15, 2026 18:00 — with GitHub Actions Error

testbot: drop tests for removed dataset I/O types

50d541d

svc-osmo-ci temporarily deployed to testbot-respond June 15, 2026 18:14 — with GitHub Actions Inactive

svc-osmo-ci temporarily deployed to internal-ci June 15, 2026 18:14 — with GitHub Actions Inactive

jiaenren deployed to testbot-respond June 16, 2026 17:59 — with GitHub Actions Active

testbot: apply main's docs fix to clear bucket parser CI failure

e54a1d2

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[testbot] Add tests for input_output.go#1101

[testbot] Add tests for input_output.go#1101
svc-osmo-ci wants to merge 4 commits into
mainfrom
testbot/20260612-2010

svc-osmo-ci commented Jun 12, 2026 •

edited by coderabbitai Bot

Loading

Uh oh!

coderabbitai Bot commented Jun 12, 2026 •

edited

Loading

Walkthrough

Changes

Estimated code review effort

Possibly related PRs

Suggested reviewers

Poem

Uh oh!

codecov Bot commented Jun 12, 2026 •

edited

Loading

Uh oh!

jiaenren Jun 15, 2026

Uh oh!

svc-osmo-ci Jun 15, 2026

Uh oh!

jiaenren Jun 16, 2026

Uh oh!

svc-osmo-ci Jun 16, 2026

Uh oh!

jiaenren Jun 16, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

		@@ -19,8 +19,15 @@ SPDX-License-Identifier: Apache-2.0
		package data

Conversation

svc-osmo-ci commented Jun 12, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Files tested

Why this file was targeted

Generator summary

COVERAGE REPORT

What is covered

What is NOT covered (uncovered ranges and rationale)

Files changed

Suspected bugs

Coverage gain on listed uncovered ranges

Checklist

Summary by CodeRabbit

Uh oh!

coderabbitai Bot commented Jun 12, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Estimated code review effort

Possibly related PRs

Suggested reviewers

Poem

Uh oh!

codecov Bot commented Jun 12, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

jiaenren Jun 15, 2026

Choose a reason for hiding this comment

Uh oh!

svc-osmo-ci Jun 15, 2026

Choose a reason for hiding this comment

Uh oh!

jiaenren Jun 16, 2026

Choose a reason for hiding this comment

Uh oh!

svc-osmo-ci Jun 16, 2026

Choose a reason for hiding this comment

Uh oh!

jiaenren Jun 16, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

svc-osmo-ci commented Jun 12, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented Jun 12, 2026 •

edited

Loading

codecov Bot commented Jun 12, 2026 •

edited

Loading