Next Release#2291
Open
github-actions[bot] wants to merge 25 commits into
Open
Conversation
…mplate output The helm filter had `max_lines = 40` which silently truncates output from commands like `helm template` that legitimately produce hundreds of lines of YAML manifests. This caused CI/CD pipelines to apply incomplete sets of manifests (only first ~40 lines). Other filter settings (strip_ansi, strip_lines_matching for blanks and glog warnings, truncate_lines_at for individual line length) provide sufficient output hygiene without capping total line count. Added regression test verifying multi-document YAML output is preserved in full without truncation. Fixes #1626
Replaces the stateless src/filters/mvn-build.toml (4 phases, 50-line cap) with
a Rust module under src/cmds/jvm/mvn_cmd.rs handling all common Maven lifecycle
phases with stateful Surefire/Failsafe block collapse.
Why TOML couldn't do this: the dominant noise on a healthy Maven project is
expected-exception stack traces inside passing tests (assertThrows). Collapsing
these requires deciding to drop a block based on its closing
`Tests run: N, Failures: 0, Errors: 0` line — a stateful decision the TOML DSL
cannot express.
Measured savings on synthetic full-shape fixtures (~1100 lines each, gzipped):
- mvn test: 3 422 tok → 36 tok (98.9% savings)
- mvn install: 3 460 tok → 85 tok (97.5% savings)
Coverage:
| Phase | Filter |
|------------------------------------|----------------------------------------|
| test, integration-test | filter_surefire (block collapse) |
| compile, test-compile | filter_compile (ERROR + indent + dedup)|
| package, install, verify, deploy | filter_package (compile + surefire) |
| clean, site, plugin goals, version | passthrough |
Key behaviours:
- ANSI strip first in every filter (real Maven output contains colour escapes).
- English-footer guard: if no `BUILD SUCCESS`/`BUILD FAILURE` line is present,
return ANSI-stripped raw input unchanged — protects non-English locales.
- Verbose bypass: `-X`, `--debug`, `-e`, `--errors` skip filtering.
- Stack-frame deny-list strips framework frames (`at org.junit.`, `at java.util.`,
`at sun.reflect.`, etc.); user-code frames (any other prefix) preserved.
- Duration normalisation (`Time elapsed: 2.341 s` → `Time elapsed: T s`) for
deterministic test output.
- Wrapper detection: `./mvnw` / `mvnw.cmd` via string-literal `Command::new`
(semgrep-safe); falls back to `resolved_command("mvn")`.
Hook regex changes (src/discover/rules.rs):
- Adds wrapper prefixes (./mvnw, mvnw.cmd, mvnw).
- Adds test, integration-test, verify, deploy to the alternation.
- Drops clean + site so bare invocations bypass RTK entirely (0 overhead).
- Lazy quantifier skips flags before the goal (`mvn -B clean install` now
rewrites correctly; previously failed because the regex required the goal
as the first token after `mvn`).
- Drops subcmd_savings (lazy match captures first phase, which would mis-tier
`clean install` to clean); uses flat 82.0.
Tests:
- 33 unit tests in mvn_cmd.rs (phase detection, footer guard, framework deny,
duration normalisation, ERROR continuation, WARNING dedup, install/jar lines).
- 2 inline savings assertions on gzipped synthetic full fixtures (~3 KB each)
using flate2 already in Cargo.toml. Run as part of standard `cargo test --all`.
- 12 classifier/rewrite tests in src/discover/registry.rs under new Maven
section.
Files changed:
- ADD src/cmds/jvm/mvn_cmd.rs (~810 LoC incl. tests)
- ADD src/cmds/jvm/README.md (documents whitelist omission decision)
- ADD tests/fixtures/mvn_*_raw.txt (6 inline) + 2 .gz synthetic full fixtures
- MOD src/main.rs (1 use, 1 Commands variant, 1 dispatch arm)
- MOD src/discover/rules.rs (replace mvn entry)
- MOD src/discover/registry.rs (12 classifier tests)
- MOD src/core/toml_filter.rs (drop mvn-build from expected-filters list,
adjust the 2 count assertions)
- DEL src/filters/mvn-build.toml
Fixtures are entirely synthetic — generic `com.example.app.*` package names and
`com.example:myapp` Maven coordinates. Same SHAPE as real Maven output (block
structure, ANSI codes, framework stack frames, BUILD footer, plugin banners)
without any project-identifying content.
Integrity-check whitelist: Commands::Mvn is intentionally omitted from
is_operational_command at src/main.rs:2455-2501, matching the gradle precedent
(Commands::Gradlew also omitted). The whitelist is opt-in by design per the
comment at L2452-2454 — filter modules invoked through an already-verified
hook do not need a second integrity check on their own dispatch path.
Documented in src/cmds/jvm/README.md.
Out-of-scope (follow-ups noted in module + README):
- Parallel mode `-T2C` (interleaved blocks across threads).
- Plugin goals (`mvn dependency:tree`, `mvn versions:*`).
- `mvnw.bat` legacy wrapper (Maven Wrapper 3.x emits `.cmd`).
BREAKING CHANGE: `rtk mvn <args>` output format changed. Previously: TOML
filter strips INFO/Downloading lines and caps at 50 lines. Now: state-machine
filter with Surefire block collapse + locale guard + verbose bypass. Different
output shape; significantly better savings tier; behaviour is otherwise a
superset (all previously-handled cases still handled, plus test/verify/etc.).
CLOSE regex previously required ` - in ` (single-dash, Surefire 2.x). Surefire 3.x emits ` -- in ` (double-dash), so close lines never matched on Maven 3.9 default toolchains: blocks stayed open, every new `[INFO] Running …` flushed the prior block as kept, and real-world savings dropped to ~27% on commons-cli. Changes: - CLOSE separator widened to `\s+--?\s+in ` (matches 2.x and 3.x). - CLOSE prefix widened to `INFO|ERROR|WARNING` (3.x emits WARNING for classes whose only tests are skipped). - `filter_surefire` and `filter_package` gain a `failure_trail` flag: Surefire 3.x emits the exception class and stack frames *after* the CLOSE line; keep them (stripping framework frames) until the next blank line. Without this, the failing-test signal was silently dropped. Fixtures replaced with output from a real `apache/commons-cli` build in `maven:3.9-eclipse-temurin-21`. Synthetic 2.x shape removed; 2.x compat locked by a dedicated unit test using the single-dash separator. Measured token savings on commons-cli fixtures: - `mvn test` (full): 1896 -> 38 tokens (98.0%) - `mvn install` (full): 2021 -> 78 tokens (96.1%) Refs: pszymkowiak review on PR #1956
`normalise()` rewrote real `Time elapsed: N s` and `Total time: N s` values to `T s` in the shipped output. That was only ever useful for keeping snapshot tests deterministic and never belonged in production: test perf signal is exactly the kind of data the LLM needs to spot a regression, and there are no insta snapshots in this module to protect. - Remove `normalise()` and its `TIME_DURATION` / `TOTAL_TIME` regexes. - All `filter_surefire`, `filter_compile`, `filter_package` emit sites pass the raw line through. - Renamed and inverted `surefire_normalises_durations` → `surefire_preserves_real_durations`: asserts `2.341 s` and `Total time: 4.567 s` survive filtering. Refs: pszymkowiak review on PR #1956 (E.1)
Under `mvn -q`, Maven 3.x suppresses all `[INFO]` lines: no `BUILD SUCCESS` footer, no `[INFO] Running` markers, no module banners. The existing surefire/compile/package filters all key off the English-footer guard and `[INFO] Running`/CLOSE state machine, so under `-q` they fell through and shipped the raw output unchanged (0% savings on PR #1956 reviewer's measurement). Reviewer (@pszymkowiak on PR #1956) listed `-q` handling in the path-to-merge as "worth a note or handling". This commit implements the handling. Behavior captured from `mvn -q test` on Maven 3.9.9 + Surefire 3.5.5 with JUnit 5: - Green run emits ZERO bytes — filter returns empty output. - Failure run emits ~27 lines: per-class CLOSE line, exception + stack trace, `[ERROR] Failures:` summary, aggregate `[ERROR] Tests run: N, Failures: F, ...`, and the `[ERROR] Failed to execute goal` terminator, followed by a ~10-line `[ERROR] See .../[Help 1].../Re-run Maven .../To see the full stack trace .../For more information ...` boilerplate block pointing at log files and help URLs. `filter_quiet` keeps the failure-signal lines and the user-code stack frames; drops the framework frames (existing deny-list) and the post-failure boilerplate block. Routing: `is_quiet` checks argv for `-q` or `--quiet`. When set, `run()` short-circuits the phase-based filter selection and routes non-passthrough phases to `filter_quiet`. Passthrough phases (`clean`, `site`, plugin goals) remain passthrough under `-q`. Measured savings on the captured fail fixture: 51.1% (174 -> 85 tokens). Green run: 0 -> 0 (no overhead). Safety net: unclassified `[ERROR]` lines are kept rather than dropped — better to leak a line than hide signal. Tests added (7): - `quiet_detects_short_flag` / `quiet_detects_long_flag` / `quiet_does_not_match_unrelated_flags` for `is_quiet`. - `quiet_green_run_is_empty` — empty input contract. - `quiet_fail_strips_framework_and_boilerplate` — kept-list and drop-list assertions on the real fixture. - `savings_mvn_quiet_fail` — >=50% savings target on the fixture. - `quiet_unknown_error_line_kept_as_safety_net` — unclassified `[ERROR]` not silently dropped. New fixture: `tests/fixtures/mvn_quiet_fail_raw.txt` (27 lines, 770 bytes) captured from a real `mvn -q test` run against a JUnit 5 / Surefire 3.5.5 project with one failing test.
When `mvn test` fails at the compile step before Surefire runs, the `[ERROR]` block's indented `symbol:` / `location:` / `^` continuation lines were being dropped. `filter_surefire` had no `keep_continuation` state, unlike `filter_compile:302` and `filter_package:408`. Add the missing state machine: reset in the RUNNING branch, set on each kept `[ERROR]` line (except `Tests run:` / `Failures:` / `Errors:` headers — same predicate as `filter_package:402-405`), and pass through indented continuation lines before the outside-block keep-list check. Fixture `mvn_test_compile_fail_slice_raw.txt` captures the `mvn test` slice for a project with a deliberate `cannot find symbol` in a `src/main/java` source. New tests: - `surefire_keeps_compile_continuation_on_test_phase` — the bug. - `package_still_keeps_compile_error_continuation_after_refactor` — drift guard on the `install`/`verify` path.
`filter_surefire` and `filter_package` had a ~30-line duplicated state
machine for Surefire/Failsafe per-class blocks (RUNNING / in-block
buffer / CLOSE branching / failure-trail). The duplication is what let
the P0 continuation bug land — the fix had to be applied in only one of
the two filters.
Introduce `SurefireBlock` owning that inner machine: PLUGIN_BANNER skip,
RUNNING (with prior-block flush on truncated output), in-block
buffering, CLOSE detection, and the post-close failure-trail with
framework-frame stripping.
Each `step()` returns `SurefireStep::{Consumed, FailingClose, Passthrough}`.
The `FailingClose` arm defers the actual write to the caller via
`commit_failing()` — this is the seam Commit D will use to enforce the
`mvn_max_failures` cap without duplicating the cap logic.
Outside-block keep logic stays inside each filter unchanged
(`MODULE_BANNER || keep_outside_block`, `[WARNING]` dedup, the
`keep_continuation` flag from Commit A). Byte-identical filter output —
all existing `surefire_*` / `package_*` / token-savings tests pass.
The previous free helpers `flush_block_as_keep` and `emit_block` are
removed; their behaviour now lives in `SurefireBlock::flush_open_block_as_keep`
and `SurefireBlock::commit_failing`.
In a reactor build (parent pom with `<modules>`), Maven appends a
`Reactor Summary for <root>` block at the end listing each module's
status:
[INFO] Reactor Summary for multi-module-skeleton 1.0.0-SNAPSHOT:
[INFO]
[INFO] multi-module-skeleton ...... SUCCESS [ 0.353 s]
[INFO] child-a .................... SUCCESS [ 1.676 s]
[INFO] child-b .................... FAILURE [ 0.940 s]
`filter_surefire` / `filter_package` were dropping these rows because
`keep_outside_block` only keeps `[INFO]` lines that match a specific
prefix (`Building`, `Installing`, `Total time:`, etc.). The per-module
status rows match none of them.
Add a `REACTOR_SUMMARY` regex and `reactor_summary_keep` helper that
toggles a flag on the header and clears it on `BUILD SUCCESS` /
`BUILD FAILURE`. While the flag is set the helper returns `true` so
the rows survive — including `[WARNING]` (activated-profile notices)
and the in-summary horizontal rule. Caller invokes the helper before
`keep_outside_block` so the clears-flag side effect runs regardless
of `||` short-circuit.
Fixtures captured against a hand-rolled 2-module skeleton checked in
under `tests/fixtures/multi-module-skeleton/` (parent pom + child-a
+ child-b with `Empty.java` stubs). The fail fixture is captured by
temporarily replacing child-b's `Empty.java` with one that references
a `BogusType` symbol. Reproducible from the committed skeleton.
`filter_surefire` / `filter_package` emitted unbounded output on builds with hundreds of failures — the per-class failing blocks (with their post-close exception trail) plus every `[ERROR] ClassA.test:N …` entry in the `[ERROR] Failures:` summary were all kept. Other RTK filters (git/cargo/lint) already cap with `... +N more` tails; bring mvn in line. Add `LimitsConfig::mvn_max_failures` (default 25, matching the `grep_max_per_file: 25` convention in the same struct; `0` = opt out) and wire it into two cap sites: 1. Failing-class blocks. The shared `SurefireBlock::step()` machine yields `FailingClose` once per failing class; the outer loop now commits the first N and calls a new `drop_failing()` for the rest. `drop_failing()` enables a `drop_trail` flag so the post-close per-test subline, exception, and user frames are consumed silently until the next blank line. After the loop, if any were dropped, `\n... +N more failing test classes\n` is appended. 2. `[ERROR] Failures:` summary block. New `FailuresSummaryCap` helper tracks the header, caps `[ERROR] ` entries at the same limit, and pre-emits `\n... +N more failures\n` immediately before the aggregate `[ERROR] Tests run: …` line. Tail wording follows the existing repo convention — no `[INFO]` prefix, no parenthetical hint (`cargo_cmd.rs:1035` uses the same `\n... +N more failures\n` verbatim). Tests: - `surefire_caps_failing_blocks_emits_tail` — synthetic 5-failure fixture with `cap = 3`; asserts blocks 1-3 emitted, 4-5 dropped, tail emitted. - `surefire_cap_zero_disables_capping` — same fixture with `cap = 0`; all 5 blocks kept, no tail. - `failures_summary_block_is_capped` — 5-entry summary with `cap = 3`; asserts entries 1-3 kept, 4-5 dropped, tail emitted *before* the aggregate. Behaviour-change note: the default cap of 25 means users without a custom config get a bounded output for the first time. Existing fixtures all have <25 failures so no current test diffs.
Three small reviewer-suggested cleanups:
- `[WARNING] ` prefix slicing in `filter_compile` / `filter_package`
was written defensively as `&line["[WARNING] ".len().min(line.len())..]`.
Both call sites are already gated by `line.starts_with("[WARNING]")`
so the `min` is dead defence. Use `strip_prefix(...).unwrap_or(line)`
which reads identically to the rest of the module.
- `SurefireBlock` and `SurefireStep::FailingClose` now borrow lines
from the post-ANSI-strip `String` (`Vec<&'a str>` / `Option<&'a str>`
/ `&'a str` close), tied to the `stripped` local in
`filter_surefire_with_cap` / `filter_package_with_cap`. No
`.to_string()` allocations on the hot per-line path; the slice
fixtures' savings tests are unaffected and the gzipped full-fixture
tests still hit ≥90%/≥85%.
- New `surefire_handles_crlf_line_endings` and
`package_handles_crlf_line_endings` tests assert byte-equality
between an LF-filtered fixture and a CRLF-converted-then-LF-normalised
one. Catches the class of bug where exact-equality line checks
silently miss CRLF input because `str::lines()` strips `\n` but
keeps trailing `\r`.
`develop`'s Cargo.lock has its rtk-package entry pinned at `0.36.0` while `Cargo.toml` says `0.34.3` — the lock is internally stale on develop. When `cargo build` ran on this branch it re-pinned the lock to `0.34.3` to match `Cargo.toml`, producing the diff the reviewer flagged on PR #1956. Revert the lock to `develop`'s baseline so the PR's committed diff no longer touches Cargo.lock. `cargo build` will rewrite the working-tree lock on every invocation; that's a local artifact, not part of the committed PR.
The generic AWS handler (used for subcommands outside the hardcoded
specialized list) called json_cmd::filter_json_string, which extracts
a type-only schema and discards every value.
$ rtk aws backup describe-global-settings --output json
{
GlobalSettings:
{
isCrossAccountBackupEnabled: string,
isDelegatedAdministratorEnabled: string,
isMpaEnabled: string
}
LastUpdateTime: string
}
Callers got the schema instead of the data they asked for.
Swap to json_cmd::filter_json_compact, which preserves values while
still applying depth, string-length, and array-size truncation:
$ rtk aws backup describe-global-settings --output json
{
GlobalSettings:
{
isCrossAccountBackupEnabled: "false",
isDelegatedAdministratorEnabled: "false",
isMpaEnabled: "false"
}
LastUpdateTime: "2026-05-28T09:52:17.525000+02:00"
}
Affects every AWS subcommand not on the hardcoded list (backup,
route53, kms, ssm, apigateway, ...) when output is valid JSON
(explicit `--output json` or auto-injected for describe-/list-/get-/scan).
`rtk curl <binary-url>` silently corrupted any binary download (tarballs, zip, png, pdf, ELF, ...) because the response body was captured via `String::from_utf8_lossy` in `core::stream::exec_capture`. Every non-UTF-8 byte got replaced with U+FFFD (3 bytes: 0xEF 0xBF 0xBD), so gzip magic 1f 8b 08 00 arrived at downstream consumers as 1f ef bf bd 08 00 and `tar -xzf` complained "not in gzip format". Now `rtk curl` runs `Command::output()` directly to keep stdout as `Vec<u8>`, then checks `std::str::from_utf8(&bytes).is_err()`. If the response is not valid UTF-8 (i.e. the lossy conversion would corrupt it), raw bytes are written through to stdout via `write_all` and tracking is recorded as passthrough (0% savings — token counts over binary content have no meaning). The text/JSON code path is unchanged. Verified live on 18 real binary formats (rtk's own release artifacts, ripgrep, bat, fd, hyperfine, gh, tokei, kubectl ELF 51MB, rustup-init 20MB, W3C PDF, ISO 9899 PDF 4MB, GitHub avatar PNG, Wikimedia GIF, WebP sample, MP3/MP4/WAV samples) — all byte-identical to raw curl. 10 text regression tests (JSON/HTML/Markdown/Cargo.toml/RFC) confirm the text path keeps its existing behavior. Closes #1087
Separator lines (═══, ---) and an emoji status marker cost tokens without adding signal for the LLM — RTK output must never add noise over raw. Semantic labels are kept; the emoji is swapped for plain monochrome unicode.
refacto(cmds): strip decorator noise from filter output
Surefire 3.x emits one blank-separated detail block per failing test under a single class close line. The trail previously ended at the first blank line and never re-armed, so failures after the first lost their exception message (unindented -> dropped) and leaked the full junit/jdk framework stack (indented -> kept by keep_continuation). SurefireBlock now remembers the trail's keep/drop decision when it ends at a blank line (trail_rearm) and re-enters the trail on the next per-test subline with the same decision — a capped class drops all its per-test blocks, not just the first. Any other non-blank line disarms re-entry; RUNNING/commit/drop clear it so an unrelated class can never inherit the decision. Extra blanks between per-test blocks stay armed. Per-test sublines use '<<< ERROR!' for thrown (non-assertion) exceptions — accept both markers via a shared is_per_test_subline() (also reused by filter_quiet, which already re-armed correctly), and widen the CLOSE regex to tolerate an ERROR! marker defensively (Surefire 3.5.5 emits FAILURE! even for errors-only classes, per the fixture capture; detection keys off the counts, not the marker). Fixture captured from real Maven 3.9.9 / Surefire 3.5.5 output against the new tests/fixtures/multifail-skeleton/ (CalcTest: assertion failure + thrown exception in one class; BoomTest: errors-only class). A byte-exact pin locks the single-failure path unchanged.
test (windows-latest) failed on the two CRLF tests: with no
.gitattributes, the Windows runner's core.autocrlf=true checks fixtures
out with CRLF, include_str! embeds \r\n, and the tests'
replace('\n', "\r\n") synthesis produces \r\r\n. str::lines() strips
only the final \r\n of each pair, leaving a stray \r that $-anchored
regexes (MODULE_BANNER, BUILD_FOOT) reject — dropping the module banner
and BUILD SUCCESS in the CRLF leg only. Real Maven CRLF output (single
\r\n) was always handled; only the synthesized double-CR leg broke.
Root-cause fix: tests/fixtures/** -text — fixtures are byte-exact
filter inputs; eol conversion must never touch them. Defense-in-depth:
the two CRLF tests normalize the embedded fixture back to LF before
synthesizing, so they stay correct even on checkouts without
attributes (zip downloads, pre-existing autocrlf clones).
Also fixes the test doc comment, which claimed str::lines() keeps the
trailing \r of a \r\n pair — it strips it; the hazard is the doubled
\r\r\n, not single CRLF.
filter_quiet already dropped the help block Maven emits after 'Failed to execute goal' (See ..., -> [Help 1], Re-run Maven, help URLs, bare [ERROR] dividers); the non-quiet paths kept it via the [ERROR] catch-alls — keep_outside_block for test/package and filter_compile's own [ERROR] branch. Promote QUIET_BOILER_PREFIXES to a shared BOILER_PREFIXES (+ is_boilerplate(), which also covers the bare [ERROR] divider lines, matching filter_quiet's existing drop) and reject boilerplate before each [ERROR] catch-all. The list deliberately keeps the resume hint ([ERROR] After correcting the problems... / [ERROR] mvn <args> -rf :module) — actionable signal for resuming a multi-module build — and the Failed to execute goal terminator. keep_continuation hygiene (defensive parity, not a demonstrated bug): filter_surefire only reset the flag inside its kept branch, so a dropped [ERROR] line would leave it stale and a following indented line could be wrongly kept. No committed fixture exhibits the leak; add the same end-of-loop fall-through reset filter_package already has. filter_compile resets on its new boilerplate drop path. Multifail-slice savings rise from 19.9% to 42.3%; threshold pinned at >=30% to match the reactor-fail precedent. README: document the stripping and drop the stale duration-normalisation bullet (that behaviour was removed in 77e28d0).
fix(curl): passthrough binary downloads to prevent UTF-8 corruption (#1087)
…or-unsupported-subcommands fix(aws): preserve values in JSON output for unsupported subcommands
fix(filters): remove max_lines cap from helm filter that truncates template output
…g knob Replace the per-filter LimitsConfig::mvn_max_failures field with a local const MAX_MVN_FAILING_CLASSES = CAP_WARNINGS, matching the canonical cap binding used by pytest/rspec/rake/runner. Align cap = 0 semantics with the core policy (summary-only: no blocks emitted, tail still counts) instead of the old opt-out meaning, so the future config surface lands without a special case.
Match the join_with_overflow convention (ellipsis char, '+N more' label) used by container, aws, gh and glab so the LLM sees one consistent overflow style. The tee '[full output: …]' recovery hint already comes from core tee_and_hint via RunOptions::with_tee, so capped failure output carries the canonical recovery path unchanged.
feat(mvn)!: Rust module replacing TOML filter, adds test phase support
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Feats
mvn/mvnwdespite handler existing #1537Fix
rtk curl http://....tar.gz | tar -xzffails due rtk truncation, eg "(153 more lines, 94292 bytes total)" #1087) #2181 — Closes #1087 (to verify)Other