Skip to content

fix: script runner misidentifies 'codex' in model names as the codex runtime binary#397

Open
ryan-niemes-helix wants to merge 1 commit intomicrosoft:mainfrom
ryan-niemes-helix:fix/issue-396-script-runner-misidentifies-codex-in-model-names
Open

fix: script runner misidentifies 'codex' in model names as the codex runtime binary#397
ryan-niemes-helix wants to merge 1 commit intomicrosoft:mainfrom
ryan-niemes-helix:fix/issue-396-script-runner-misidentifies-codex-in-model-names

Conversation

@ryan-niemes-helix
Copy link

Closes #396

Summary

  • Anchors the codex regex in _transform_runtime_command with (^|\s) so it only matches codex as a standalone command token, not as a substring of flag values like --model gpt-5.3-codex
  • Adds a regression test covering the exact scenario from the issue: copilot --allow-all-tools --model gpt-5.3-codex -p fix-issue.prompt.md

Test plan

  • New unit test test_transform_runtime_command_copilot_with_codex_in_model_name passes
  • Existing unit tests continue to pass

🤖 Generated with Claude Code

…runtime binary (microsoft#396)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@ryan-niemes-helix
Copy link
Author

@microsoft-github-policy-service agree [company="Helix, Inc"]

@microsoft-github-policy-service

@ryan-niemes-helix the command you issued was incorrect. Please try again.

Examples are:

@microsoft-github-policy-service agree

and

@microsoft-github-policy-service agree company="your company"

@danielmeppiel danielmeppiel marked this pull request as ready for review March 21, 2026 11:58
@danielmeppiel danielmeppiel self-requested a review as a code owner March 21, 2026 11:58
Copilot AI review requested due to automatic review settings March 21, 2026 11:58
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Fixes a script-runner command transformation bug where codex substrings inside Copilot --model values could cause the command to be rewritten as a codex exec ... invocation, and adds a regression test.

Changes:

  • Tightens the codex detection regex in _transform_runtime_command to avoid matching codex as a substring (e.g., gpt-5.3-codex).
  • Adds a unit test ensuring Copilot commands with --model gpt-5.3-codex are not mis-transformed.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

File Description
src/apm_cli/core/script_runner.py Adjusts codex pattern matching to reduce false positives during runtime command transformation.
tests/unit/test_script_runner.py Adds a regression test covering codex appearing inside a Copilot model name.

Comment on lines +346 to 348
if re.search(r"(^|\s)codex\s+.*" + re.escape(prompt_file), command):
match = re.search(
r"codex\s+(.*?)(" + re.escape(prompt_file) + r")(.*?)$", command
Copy link

Copilot AI Mar 21, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The updated codex detection still matches codex anywhere in the command (as long as it’s preceded by whitespace). That means a copilot command like copilot --model codex -p hello-world.prompt.md would still hit this codex branch and be rewritten to codex exec ..., because codex -p hello-world.prompt.md satisfies the regex. Since this section is explicitly for commands without env-var prefixes, the codex transform should only trigger when codex is the leading command token (optionally after leading whitespace), and the extraction regex should be aligned to that same anchored shape.

Suggested change
if re.search(r"(^|\s)codex\s+.*" + re.escape(prompt_file), command):
match = re.search(
r"codex\s+(.*?)(" + re.escape(prompt_file) + r")(.*?)$", command
# Only trigger when 'codex' is the leading command token (optionally after whitespace)
if re.search(r"^\s*codex\s+.*" + re.escape(prompt_file), command):
match = re.search(
r"^\s*codex\s+(.*?)(" + re.escape(prompt_file) + r")(.*?)$", command

Copilot uses AI. Check for mistakes.
Comment on lines +118 to +124
def test_transform_runtime_command_copilot_with_codex_in_model_name(self):
"""Test copilot command with --model flag containing 'codex' substring is not misidentified."""
original = "copilot --allow-all-tools --model gpt-5.3-codex -p hello-world.prompt.md"
result = self.script_runner._transform_runtime_command(
original, "hello-world.prompt.md", self.compiled_content, self.compiled_path
)
assert result == "copilot --allow-all-tools --model gpt-5.3-codex"
Copy link

Copilot AI Mar 21, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This regression test covers the substring case (gpt-5.3-codex), but the current codex regex would still misidentify copilot if the model value is exactly codex (e.g., copilot --model codex -p hello-world.prompt.md). Adding a test for that case will prevent future regressions and will fail with the current implementation until the codex detection is tightened to only match when codex is the actual command token.

Copilot generated this review using guidance from repository custom instructions.
@danielmeppiel
Copy link
Collaborator

Review Feedback

Thanks @ryan-niemes-helix for catching this!

Regex scope concern

The proposed pattern (^|\s)codex\s+ is better than the original but may still false-positive. For example:

copilot --model codex -p file.prompt.md

This is a valid copilot command with 'codex' as a model name, not a codex runtime invocation. The \s before 'codex' matches the space after '--model'.

The correct anchor is start-of-string only: ^codex\s+. The env-var detection path (lines 296-341) already handles CODEX_ prefix cases separately, so we don't need mid-string matching.

Same bug class in adjacent regexes

The same pattern exists for:

  • Copilot regex (~line 362): should also be ^copilot\s+
  • LLM regex (~line 381): should also be ^llm\s+
  • _detect_runtime method (~line 412-421): uses bare in substring matching

Would you be interested in fixing these in the same PR? If not, I can file a follow-up issue.

Test coverage

The added tests are good. Consider adding a test case for the false-positive scenario above (copilot --model codex ... should NOT be detected as codex runtime).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Bug: script runner misidentifies 'codex' in model names as the codex runtime binary

3 participants