Skip to content

test(prompt-injection): make the sanitization test able to fail#9666

Open
zied-jlassi wants to merge 1 commit into
kirodotdev:mainfrom
zied-jlassi:test/prompt-injection-oracle-can-fail
Open

test(prompt-injection): make the sanitization test able to fail#9666
zied-jlassi wants to merge 1 commit into
kirodotdev:mainfrom
zied-jlassi:test/prompt-injection-oracle-can-fail

Conversation

@zied-jlassi

Copy link
Copy Markdown

Problem

scripts/test/test-prompt-injection.ts always reported success regardless of the outcome:

if (titleChanged || bodyChanged) {
  console.log("✅ PASS - Input was sanitized\n");
  passed++;
} else {
  console.log("⚠️  INFO - Input was not modified (may be safe)\n");
  passed++;   // <- same outcome
}
  • failed is declared but never incremented.
  • The only thing checked is whether the input string changed at all, not whether the expected transformation actually happened.
  • A case whose content slipped through unchanged was still counted as a pass.

As a result failed === 0 is always true, the script always prints "All tests passed!" and always exits 0 — a regression in the sanitizer could never make this suite fail.

Fix

Give each test case explicit, checkable expectations:

  • mustNotSurvive: substrings that must be absent from the sanitized output.
  • mustAppear: substrings that must be present (e.g. [REDACTED], the truncation notice).

The runner now asserts these, increments failed on any unmet expectation, and exits non-zero — turning a no-op script into a real regression test.

Verification

  • Existing cases still pass: 8/8, exit 0.
  • With a deliberately broken (identity) sanitizer, the suite now reports 8/8 failed and exits 1 — the failure path that was previously unreachable.
  • npm run build (tsc): passes.

Scope

This PR only fixes the test oracle's correctness. It intentionally does not broaden the payload corpus; the existing cases are kept as-is.

The prompt-injection test always reported success: both branches of the
result check incremented `passed`, `failed` was never touched, and the only
condition tested was whether the input string changed at all. A case whose
injection slipped through unchanged was still counted as a pass, so a
regression in the sanitizer could never turn the suite red.

Give each case explicit expectations (substrings that must be removed and
markers that must appear) and assert them, incrementing `failed` and exiting
non-zero when an expectation is not met. The existing cases still pass; a
broken sanitizer now makes the suite fail as it should.
@zied-jlassi zied-jlassi requested review from a team as code owners June 22, 2026 14:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant