WIP: Validate resource stats match real usage #403

Draft
asmacdo wants to merge 5 commits into main from claude/validate-resource-stats-uK9NV

@asmacdo asmacdo commented Apr 4, 2026

🤖 Generated with Claude Code — may contain slop, wait for @asmacdo to review

Summary

  • Add integration tests that run programs with known, predictable resource usage and verify duct's measurements match reality (not just aggregation math with synthetic data)
  • Single-process tests: memory allocation detected in peak_rss, wall clock accuracy, idle vs busy CPU
  • Child/forked process tests: spawn N children each allocating M MB, verify total RSS reflects the sum, per-process stats track individual children, and totals equal sum of per-process values
  • Add test/data/memory_children.py helper using multiprocessing.Process
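
A minimal sketch of what a helper in the spirit of test/data/memory_children.py could look like. The function names `hold_memory` and `spawn_children` are hypothetical and not taken from the PR; the sketch also assumes a POSIX system where the `fork` start method is available.

```python
import multiprocessing
import time

PAGE_SIZE = 4096  # assumed page size, used only to touch each page once


def hold_memory(mb: int, seconds: float) -> None:
    """Allocate `mb` MiB, make it resident, and hold it for `seconds`."""
    buf = bytearray(mb * 1024 * 1024)
    # A zero-filled allocation may not be backed by physical pages yet,
    # so write one byte per page to make it count toward RSS.
    for offset in range(0, len(buf), PAGE_SIZE):
        buf[offset] = 1
    time.sleep(seconds)


def spawn_children(n: int, mb: int, seconds: float) -> bool:
    """Fork `n` children that each hold `mb` MiB; True if all exit cleanly."""
    ctx = multiprocessing.get_context("fork")  # POSIX only
    procs = [
        ctx.Process(target=hold_memory, args=(mb, seconds)) for _ in range(n)
    ]
    for proc in procs:
        proc.start()
    for proc in procs:
        proc.join()
    return all(proc.exitcode == 0 for proc in procs)


if __name__ == "__main__":
    # Two children, 8 MiB each, held for half a second.
    print(spawn_children(2, 8, 0.5))
```

While the children are alive, a monitor sampling the process tree should see roughly N × M MiB of resident memory on top of the interpreter baseline, which is what makes the expected totals predictable.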

Test plan

  • All 9 new tests pass locally (test/duct_main/test_resource_validation.py)
  • CI passes
  • @asmacdo reviews for suitability and flakiness tolerance

https://claude.ai/code/session_01FYRR4Y8PzNLBU344ovdDS5

claude and others added 3 commits April 4, 2026 15:31
Existing tests verify aggregation math with synthetic data and basic
sanity bounds, but don't confirm duct's measurements match actual
resource usage. These tests run programs with known, predictable
resource consumption and assert duct reports values within expected
bounds:

- Memory allocation: allocate N MB, verify peak_rss >= N MB
- Wall clock time: sleep N seconds, verify reported time ~= N
- Idle CPU: sleep reports near-zero CPU
- CPU intensive: busy-loop reports significant CPU
- Usage samples: verify JSONL structure and multiple reports
- Consistent memory: held allocation visible across samples
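
As an illustration of the "known, predictable consumption" idea, the sketch below measures a child process with the standard library only. It is a stand-in for duct's own measurement, not duct's API: `measure` is a hypothetical helper, and the `resource` module it relies on is Unix-only.

```python
import resource
import subprocess
import sys
import time


def measure(cmd: list[str]) -> tuple[float, float]:
    """Run `cmd` to completion and return (wall_seconds, cpu_seconds)."""
    before = resource.getrusage(resource.RUSAGE_CHILDREN)
    start = time.monotonic()
    subprocess.run(cmd, check=True)
    wall = time.monotonic() - start
    after = resource.getrusage(resource.RUSAGE_CHILDREN)
    # CPU time is user + system time accumulated by waited-for children.
    cpu = (after.ru_utime - before.ru_utime) + (after.ru_stime - before.ru_stime)
    return wall, cpu


if __name__ == "__main__":
    idle = [sys.executable, "-c", "import time; time.sleep(0.5)"]
    wall, cpu = measure(idle)
    # A sleeping process should take at least the sleep time in wall clock
    # while using far less than that in CPU time.
    print(f"wall={wall:.3f}s cpu={cpu:.3f}s")
```

Assertions built on numbers like these need a tolerance: wall clock includes interpreter startup, so a lower bound (`wall >= N`) is safe, while an upper bound has to leave room for a loaded CI runner.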

https://claude.ai/code/session_01FYRR4Y8PzNLBU344ovdDS5

Existing e2e tests verify duct counts child PIDs correctly, but never
check that resource stats are actually aggregated across children. These
new tests spawn multiple child processes with known memory allocations
and verify:

- Total RSS reflects sum across all children (N x M MB)
- Individual child processes appear in usage.jsonl with correct RSS
- total_rss in each sample equals the sum of per-process RSS values

Adds test/data/memory_children.py helper that uses multiprocessing to
fork N children each holding M MB for a specified duration.

https://claude.ai/code/session_01FYRR4Y8PzNLBU344ovdDS5
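
The totals-versus-per-process check described in this commit could be sketched as follows. The sample layout used here (a `totals` object with an `rss` field and a `processes` mapping keyed by PID) is an assumption made for illustration; the real usage.jsonl schema should be checked before reusing it. `load_samples` and `rss_mismatches` are hypothetical names.

```python
import json
from typing import Any


def load_samples(text: str) -> list[Any]:
    """Parse JSONL text: one JSON object per non-empty line."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


def rss_mismatches(samples: list[Any]) -> list[int]:
    """Return indices of samples whose total RSS differs from the per-process sum."""
    bad = []
    for index, sample in enumerate(samples):
        # Assumed schema: sample["processes"][pid]["rss"] and sample["totals"]["rss"].
        per_process = sum(proc["rss"] for proc in sample["processes"].values())
        if sample["totals"]["rss"] != per_process:
            bad.append(index)
    return bad


if __name__ == "__main__":
    text = (
        '{"totals": {"rss": 30}, "processes": {"101": {"rss": 10}, "102": {"rss": 20}}}\n'
        '{"totals": {"rss": 5}, "processes": {"101": {"rss": 10}}}\n'
    )
    print(rss_mismatches(load_samples(text)))
```

Returning the offending indices rather than a single boolean makes a failing test easier to debug, since the assertion message can point at the exact sample.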
codecov bot commented Apr 4, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 91.87%. Comparing base (2162da9) to head (c1c0432).

Additional details and impacted files
@@           Coverage Diff           @@
##             main     #403   +/-   ##
=======================================
  Coverage   91.87%   91.87%           
=======================================
  Files          15       15           
  Lines        1120     1120           
  Branches      139      139           
=======================================
  Hits         1029     1029           
  Misses         69       69           
  Partials       22       22           

claude added 2 commits April 4, 2026 15:53
- Use assert to reference bytearray instead of _ prefix (F841)
- Break long f-string debug line into separate variable (B950)
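
A small sketch of the two lint patterns named above, using hypothetical function names rather than the PR's actual test code. The first keeps an allocation referenced so the linter does not flag it as an unused local (F841); the second moves a long f-string into its own variable so the assert line stays within the length limit (B950).

```python
import time


def hold_allocation(size_bytes: int, seconds: float) -> None:
    """Keep `size_bytes` allocated for `seconds`."""
    data = bytearray(size_bytes)
    time.sleep(seconds)
    # Referencing `data` here avoids F841 and documents that the buffer
    # is intentionally kept alive until this point.
    assert len(data) == size_bytes


def check_peak(peak_rss: int, expected: int) -> None:
    """Assert that a measured peak is at least the expected value."""
    # The message is built on its own line so the assert stays short.
    detail = f"peak_rss={peak_rss} bytes, expected at least {expected} bytes"
    assert peak_rss >= expected, detail
```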

https://claude.ai/code/session_01FYRR4Y8PzNLBU344ovdDS5

json.loads() returns Any, so return type annotations of dict/list[dict]
trigger mypy's no-any-return check. Use Any instead.
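
The annotation change can be sketched like this. `read_info` mirrors the approach described in the commit message; `read_info_checked` is a hypothetical alternative that narrows the value at runtime instead, shown only for comparison and not something the PR does.

```python
import json
from typing import Any


def read_info(text: str) -> Any:
    """Return the parsed JSON as-is; annotated Any to match json.loads()."""
    return json.loads(text)


def read_info_checked(text: str) -> dict[str, Any]:
    """Alternative: narrow with isinstance so a dict return type is justified."""
    value = json.loads(text)
    if not isinstance(value, dict):
        raise TypeError(f"expected a JSON object, got {type(value).__name__}")
    return value
```

Returning `Any` is the lighter fix for test helpers; the narrowing version costs a few lines but fails loudly if the file does not contain what the test expects.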

https://claude.ai/code/session_01FYRR4Y8PzNLBU344ovdDS5
@asmacdo asmacdo added the semver-tests Add or improve existing tests label Apr 4, 2026