ci: add E2E integration test pipeline by Kavirubc · Pull Request #71 · similigh/simili-bot

Kavirubc · 2026-02-18T06:11:01Z

This PR adds a comprehensive E2E integration test pipeline using a temporary test repository and a bot account.

Summary by CodeRabbit

Tests
- Added an end-to-end CI workflow that runs on PRs: it creates a temporary test repository, seeds sample issues and configuration, indexes test data into a vector store, triggers a probe issue, and verifies the bot’s similarity/duplicate detection, response behavior, loop detection, and reporting. The workflow appends a summary to PR checks and performs cleanup after each run.

coderabbitai · 2026-02-18T06:12:17Z

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

@coderabbitai resume to resume automatic reviews.
@coderabbitai review to trigger a single review.

📝 Walkthrough

Adds a new GitHub Actions end-to-end test workflow that provisions a temporary test repository, seeds config and issues, builds the PR-version Simili CLI to index into Qdrant, creates a trigger issue, polls/verifies the bot response (similarity & loop checks), and always cleans up.

Changes

Cohort / File(s)	Summary
E2E Test Workflow `.github/workflows/e2e-test.yml`	Adds a ~476-line GitHub Actions workflow that: validates required secrets, creates a temporary test repo and triage workflow (substituting PR SHA), propagates secrets & enables Actions, creates seed issues, builds and runs the PR-built Simili CLI to index seeds into Qdrant, waits for indexing, creates a trigger issue, polls for and verifies bot response (similarity/duplicate checks and loop detection), deletes test repo and Qdrant collection, and appends an e2e summary to PR checks.

Sequence Diagram(s)

sequenceDiagram
  participant PR as Pull Request
  participant GH as GitHub Actions (e2e workflow)
  participant TestRepo as Temporary Test Repository
  participant CLI as PR-built Simili CLI
  participant Qdrant as Qdrant Service
  participant Bot as Simili Bot
  participant Checks as PR Checks

  PR->>GH: trigger e2e-test workflow
  GH->>TestRepo: create repo, push README + simili.yaml + triage workflow (PR-SHA)
  GH->>TestRepo: propagate secrets & enable Actions
  GH->>TestRepo: create seed issues
  GH->>CLI: build PR-version CLI
  CLI->>Qdrant: index seed issues into collection
  GH->>Qdrant: wait for indexing completion
  GH->>TestRepo: create trigger issue
  Bot->>Qdrant: query for similar/duplicates
  Qdrant-->>Bot: return nearest matches
  Bot->>TestRepo: post response referencing matches
  GH->>Checks: poll and verify bot response and loop detection
  GH->>TestRepo: delete test repo
  GH->>Qdrant: delete test collection
  GH->>Checks: append e2e summary to PR

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Poem

🐰 I seeded a repo and watched the lights blink,

Five tiny issues, a CLI and a sync,
Qdrant dug up echoes, the bot gave a say,
I checked for repeats, then cleaned it away,
A hop, a nibble — the tests passed today.

🚥 Pre-merge checks | ✅ 3

✅ Passed checks (3 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title clearly and concisely summarizes the main change: adding an E2E integration test pipeline for CI.
Docstring Coverage	✅ Passed	No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.

gh-simili-bot · 2026-02-18T06:12:18Z

Simili Triage Report

Note

Quality Score: 5.5/10 (Needs-improvement)
The issue could be improved. See suggestions below.

Classification

Category	Value
Labels

Quality Improvements

Missing detailed requirements for the E2E pipeline
Lacks design specifics (e.g., tools, temporary repo management, bot permissions)
No clear scope or success criteria for the pipeline
Missing rationale for the chosen implementation approach
Outline specific functional and non-functional requirements for the E2E pipeline
Provide a high-level design document or description, detailing the technologies, temporary repository lifecycle, and bot account responsibilities
Define the scope of the initial pipeline implementation and how success will be measured
Explain the advantages and disadvantages of using a temporary test repository and a bot account for this purpose

Similar Threads

Similarity	Thread	Status
73%	#12 v0.0.2v: Foundation Integration (Gemini + Qdran...	—
73%	#1 feat: Initial implementation of Simili-bot v0.0.1	—
72%	#24 feat: Implement LLM Triage & Smart Routing System	—

Generated by Simili Bot

coderabbitai

Actionable comments posted: 4

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In @.github/workflows/e2e-test.yml:
- Around line 138-141: The workflow currently hardcodes pushing to the main
branch via the git push origin main step, which breaks when a repo's default
branch is not main; change that push to publish the current branch instead (use
a push of HEAD with --set-upstream so the created branch is tracked) and keep
the preceding git add -A / git commit -m "chore: seed E2E test repo with Simili
config" steps unchanged.
- Around line 41-48: The workflow step that creates the test repo currently uses
gh repo create with the --public flag; change it to create a private repository
instead to avoid exposing secrets via external-triggered events. Update the
"Create test repository" step (the gh repo create invocation) to remove or
replace --public with --private (and ensure GH_TOKEN is still provided) so the
created test repo is private and cannot be used by external users to trigger
secret-exposed workflows.
- Around line 18-35: The e2e-test job will fail on forked PRs because it expects
the BOT_PAT secret; add a job-level condition to skip runs for forks by adding
an if conditional to the e2e-test job (e.g., use the job name e2e-test and set
if: github.event.pull_request.head.repo.fork == false or a defensive condition
if: github.event.pull_request == null ||
github.event.pull_request.head.repo.fork == false) so the job only runs for
non-fork PRs; optionally consider switching this workflow trigger to
workflow_run to run privileged steps with secrets in a trusted context instead
of pull_request.
- Around line 337-351: The step "Verify similarity detection" assigns
COMMENT_BODY unsafely using single quotes which breaks on apostrophes; instead
set COMMENT_BODY via the step's env: (e.g., env: COMMENT_BODY: '${{
steps.poll.outputs.comment_body }}') and use printf '%s' "$COMMENT_BODY" (not a
single-quoted heredoc) when piping into grep to preserve all characters, and
make the else branch exit non‑zero (e.g., exit 1) so the E2E check fails when
the similarity pattern isn't found; update the grep/conditional that references
COMMENT_BODY accordingly.

coderabbitai · 2026-02-18T06:21:18Z

.github/workflows/e2e-test.yml

+on:
+  pull_request:
+    branches: [main]
+    types: [opened, synchronize, reopened]
+
+permissions:
+  contents: read
+
+env:
+  # Unique names per workflow run to avoid collisions
+  TEST_REPO_NAME: "simili-e2e-${{ github.run_id }}"
+  TEST_COLLECTION: "simili-e2e-test-${{ github.run_id }}"
+
+jobs:
+  e2e-test:
+    name: E2E Integration Test
+    runs-on: ubuntu-latest
+    timeout-minutes: 15


⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🌐 Web query:

GitHub Actions repository secrets forked pull request workflows

💡 Result:

Default behavior (public repos, and most cases): Workflows triggered by pull_request from a fork do not receive repository/organization/environment secrets. The only exception is GITHUB_TOKEN, which is read-only for forked PRs. (docs.github.com)

If you need secrets for PR automation: Use pull_request_target. It runs the workflow from the base repo’s code (trusted) and can access secrets—but it’s higher risk because it can be influenced by untrusted PR input; do not check out / execute fork code in the same job that has secrets. (github.blog)

Safer common pattern: Run untrusted CI on pull_request (no secrets), upload artifacts, then run a separate privileged workflow on workflow_run (has secrets) to process artifacts and post results back. (github.blog)

Private/internal repos can be configured: Repo/org/enterprise settings can allow fork PR workflows to (a) run, (b) get write GITHUB_TOKEN, (c) receive secrets, and/or (d) require manual approval before running. (docs.github.com)

Dependabot PRs: Treated like forks for these restrictions (read-only token, no secrets) on common PR-triggered events. (github.blog)

Citations:

1: https://docs.github.com/actions/reference/workflows-and-actions/events-that-trigger-workflows?utm_source=openai

2: https://github.blog/news-insights/product-news/github-actions-improvements-for-fork-and-pull-request-workflows/?utm_source=openai

3: https://github.blog/news-insights/product-news/github-actions-improvements-for-fork-and-pull-request-workflows/?utm_source=openai

4: https://docs.github.com/en/enterprise-server%403.14/repositories/managing-your-repositorys-settings-and-features/enabling-features-for-your-repository/managing-github-actions-settings-for-a-repository?utm_source=openai

5: https://github.blog/changelog/2021-02-19-github-actions-workflows-triggered-by-dependabot-prs-will-run-with-read-only-permissions/?utm_source=openai

🌐 Web query:

GitHub Actions fork PR actions reference SHA upstream repository

💡 Result:

For PRs from forks, the workflow runs in the upstream (base) repo, but the commit you get by default depends on the trigger:

on: pull_request: GITHUB_REF is refs/pull/<n>/merge and GITHUB_SHA is the temporary merge commit GitHub creates for the PR. To get the fork’s actual tip commit, use github.event.pull_request.head.sha. [1]

on: pull_request_target: the workflow runs in the base-branch context (and can access secrets), so you must be very careful about checking out/running fork code. [1][3]

Checkout the fork PR head SHA (the contributor’s commit) in the upstream workflow

- uses: actions/checkout@v4
  with:
    repository: ${{ github.event.pull_request.head.repo.full_name }}
    ref: ${{ github.event.pull_request.head.sha }}
    fetch-depth: 0

actions/checkout supports both repository: (owner/name) and ref: (branch/tag/SHA). [2]

If you instead want the PR merge result (as-tested-for-merge)

On pull_request, the default checkout already corresponds to the PR merge ref / merge commit described above. [1][2]

Security note (important for forks)

Avoid pull_request_target unless you truly need its privileges, and don’t check out or execute untrusted fork code in privileged workflows (secrets/write tokens). [3]

Sources: GitHub event semantics for pull_request / pull_request_target and GITHUB_SHA/GITHUB_REF [1], actions/checkout inputs and PR head-SHA example [2], GitHub guidance on secure workflow triggers and avoiding untrusted checkout [3].

🌐 Web query:

GitHub Actions pull_request vs pull_request_target fork security

💡 Result:

Key difference (fork PR security)

Trigger	Runs workflow from…	Code it typically tests	GITHUB_TOKEN / secrets on fork PRs	Risk level
pull_request	the PR’s merge ref (untrusted contribution context)	PR code	Read-only token, no repo secrets (by design, for fork safety)	Lower
pull_request_target	the base repo (trusted repo context)	Workflow is from base; you may still choose to checkout PR code	Has access to secrets and typically write-capable token, even when the PR comes from a fork	Higher / easy to misuse

Why pull_request_target is dangerous with forks

pull_request_target is intended for privileged “maintainer-side” automation (labeling, commenting, triage). The common foot-gun is: trigger on pull_request_target + checkout PR head + execute anything from the PR → attacker-controlled code can run while secrets/write permissions are available (“pwn request”).

When to use which

Use pull_request for CI (build/test/lint) on PRs from forks, because it doesn’t expose secrets.

Use pull_request_target only when you must perform privileged actions on fork PRs (e.g., apply labels, post comments/status) and keep it from executing untrusted PR code.

Recent nuance (late 2025 change)

GitHub noted pull_request_target remains higher-risk with forks and also changed how environment branch protections evaluate for these runs (evaluating against the execution ref), which can affect how environment secrets are gated.

Gate the job to prevent failures on forked PRs.

The e2e-test job uses BOT_PAT secret but runs on pull_request. For forks, repository secrets are not exposed, so the job will fail when a fork PR is opened. Add a job-level condition to skip execution for fork PRs:

Suggested fix

 jobs:
   e2e-test:
+    if: github.event.pull_request.head.repo.full_name == github.repository
     name: E2E Integration Test

Alternatively, consider using workflow_run (triggered after the PR workflow completes) to run privileged operations with secrets in a separate, trusted context.
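As a defensive complement to the job-level condition, a step can repeat the same comparison in shell. This is a minimal sketch, not part of the PR: `HEAD_REPO` is a hypothetical variable the workflow would have to export from `github.event.pull_request.head.repo.full_name`, while `GITHUB_REPOSITORY` is set by GitHub Actions itself.

```bash
#!/usr/bin/env bash

# Return 0 when the PR head lives in the same repository as the base,
# i.e. the PR does not come from a fork and secrets should be available.
# HEAD_REPO is a hypothetical variable the workflow would need to export;
# GITHUB_REPOSITORY is provided by GitHub Actions.
is_same_repo_pr() {
  local head_repo="${1:-${HEAD_REPO:-}}"
  local base_repo="${2:-${GITHUB_REPOSITORY:-}}"
  [ -n "$head_repo" ] && [ "$head_repo" = "$base_repo" ]
}

# Possible use inside a step:
#   is_same_repo_pr || { echo "::notice::Fork PR, skipping E2E"; exit 0; }
```

An empty head repository name is treated as "not the same repo", so a missing variable skips the privileged path instead of running it.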

coderabbitai · 2026-02-18T06:21:18Z

.github/workflows/e2e-test.yml

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🌐 Web query:

GitHub Actions secrets access public repository issue triggered workflows security

💡 Result:

Public repositories are especially exposed to “untrusted-trigger” GitHub Actions risks: attackers can use events like issues / issue_comment / pull_request_target / workflow_run to get your privileged workflow (with repo secrets and often write-scoped GITHUB_TOKEN) to run attacker-influenced content, then exfiltrate secrets.

What’s the core security issue?

Fork PR workflows (pull_request) are intentionally constrained: no repository secrets (except GITHUB_TOKEN, and it’s typically read-only) to reduce blast radius from untrusted code. [1][2]

But some other triggers run in a trusted context (base repo context), which can access secrets:

pull_request_target runs as the base repository (privileged) and can access repository secrets; if you check out and run PR code, you can hand secrets to an attacker (“pwn request”). [3][4]

workflow_run can also be privileged; if it runs based on artifacts/inputs from an untrusted run, it can become an escalation path to secrets. [4]

issue_comment / issues-driven automation is dangerous if you use comment text or PR refs to decide what to check out/run; it can bypass normal PR trust boundaries if not carefully designed. (GitHub groups this under untrusted input/checkout risks.) [4]

GitHub has also fixed at least one historical weakness where external users could trigger a privileged run in certain edge cases, reinforcing that you should treat any untrusted input in workflows as a serious risk even if you think the event is “safe.” [1]

Practical guidance (secure patterns)

Prefer pull_request for CI that builds/tests PR code; assume no secrets will be available. [2]

Avoid pull_request_target unless you only do safe operations on base-repo code (e.g., label/comment) and do not check out / execute PR code. [3][4]

If you must interact with fork PRs and also need privileged actions, use split workflows (e.g., unprivileged build → reviewed/controlled privileged step) and follow GitHub’s “secure use” hardening guidance for untrusted checkout and inputs. [4]

Reduce token risk with minimal permissions: and avoid exposing secrets unnecessarily (GitHub explicitly recommends tightening permissions and handling untrusted input defensively). [1]

Sources: GitHub Security blog guidance and fixed issue context [1], GitHub Docs on secrets availability [2], GitHub Well-Architected warning on pull_request_target [3], GitHub “Secure use” guidance on privileged triggers/untrusted checkout [4].

Make the test repository private to prevent external users from triggering secret-exposed workflows.

Public repositories allow any external user to open issues or comments, which can trigger workflows with access to repository secrets (Qdrant/Gemini credentials). This is a known GitHub Actions security pattern where issues and issue_comment events run in the base repository context with full secret access.

🔒 Suggested fix

  gh repo create "${{ secrets.BOT_USERNAME }}/${{ env.TEST_REPO_NAME }}" \
-   --public \
+   --private \
    --description "Simili E2E test repo (auto-created, will be deleted)" \
    --clone

coderabbitai · 2026-02-18T06:21:18Z

.github/workflows/e2e-test.yml

⚠️ Potential issue | 🟠 Major

Don’t assume the default branch is main when pushing seed content.

Line 141 pushes to main, but newly created repos can default to master (user/org setting). If the default branch isn’t main, the workflow file lands on a non-default branch and won’t run. Push the current branch instead.

✅ Suggested fix

- git push origin main
+ git push origin HEAD

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

  # Commit and push
  git add -A
  git commit -m "chore: seed E2E test repo with Simili config"
- git push origin main
+ git push origin HEAD
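The review prompt for this finding also mentions `--set-upstream`. A minimal sketch of that variant follows; the function name `push_current_branch` is illustrative and does not appear in the workflow.

```bash
#!/usr/bin/env bash

# Push whichever branch is currently checked out to a branch of the same
# name on the remote, and record it as the upstream. Nothing here assumes
# the default branch is called "main".
push_current_branch() {
  local remote="${1:-origin}"
  git push --set-upstream "$remote" HEAD
}
```

Calling `push_current_branch origin` after the commit step should behave the same whether the freshly created repository defaults to main, master, or another name.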

coderabbitai · 2026-02-18T06:21:18Z

.github/workflows/e2e-test.yml

+      - name: Verify similarity detection
+        if: steps.poll.outputs.bot_commented == 'true'
+        run: |
+          COMMENT_BODY='${{ steps.poll.outputs.comment_body }}'
+
+          echo "::group::Checking similarity detection"
+          # The bot should have found the similar seed issue about login crashes
+          # Check if the comment references issue #1 or contains similarity language
+          if echo "$COMMENT_BODY" | grep -qiE "(similar|duplicate|related|#1|crash.*login|login.*crash)"; then
+            echo "✅ Bot detected similarity correctly!"
+          else
+            echo "⚠️ Bot responded but similarity reference not found in comment."
+            echo "This may indicate the similarity threshold needs tuning."
+            echo "Comment body was: $COMMENT_BODY"
+          fi


⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Test 1: Demonstrate why single-quoted assignment breaks on apostrophes
echo "=== Test 1: Single quotes with apostrophe ==="
set +e
bash -c "COMMENT_BODY='don't' && echo \$COMMENT_BODY" 2>&1
echo "Exit code: $?"
echo ""
echo "=== Test 2: Single quotes with newline ==="
# Try with newline
bash -c "COMMENT_BODY='line1 line2' && echo \"\$COMMENT_BODY\"" 2>&1
echo "Exit code: $?"
echo ""
echo "=== Test 3: Using env variable (GitHub Actions style) ==="
# Simulate what env variable would do
bash -c "export COMMENT_BODY='don'\''t'; echo \"\$COMMENT_BODY\"" 2>&1
echo "Exit code: $?"
echo ""
echo "=== Test 4: Proper unquoted expansion with env ==="
# This is how GitHub Actions env vars work
bash -c 'export COMMENT_BODY="don'"'"'t"; echo "$COMMENT_BODY"' 2>&1
echo "Exit code: $?"
echo ""
echo "=== Test 5: printf with proper expansion ==="
bash -c 'COMMENT_BODY="don'"'"'t"; printf "%s\n" "$COMMENT_BODY"' 2>&1
echo "Exit code: $?"

Repository: similigh/simili-bot

Length of output: 468

🌐 Web query:

GitHub Actions set environment variable step YAML best practices bash

💡 Result:

Best practices (GitHub Actions + Bash) for setting env vars in a step

Prefer YAML env: when the value is known at workflow/job/step definition time (static config). It scopes cleanly (workflow, job, or step). [1]

Use GITHUB_ENV to persist variables to later steps in the same job.

- name: Compute value for later steps
  shell: bash
  run: |
    echo "BUILD_SHA=${GITHUB_SHA}" >> "$GITHUB_ENV"
- name: Use it
  shell: bash
  run: echo "$BUILD_SHA"

This is the current supported mechanism (“environment files”). [2]

Use GITHUB_OUTPUT for step outputs (data you reference as steps.<id>.outputs.<name>), not GITHUB_ENV.

- name: Produce output
  id: meta
  shell: bash
  run: |
    echo "version=1.2.3" >> "$GITHUB_OUTPUT"
- name: Consume output
  shell: bash
  run: echo "${{ steps.meta.outputs.version }}"

This is the recommended replacement path for deprecated set-output. [3]

Avoid deprecated workflow commands (::set-env::, ::set-output::). GitHub recommends migrating to environment files (GITHUB_ENV, GITHUB_OUTPUT). [3]

Handle multiline values with the environment-file “heredoc” form (don’t try to cram newlines into NAME=value).
Example pattern (works for both GITHUB_ENV and GITHUB_OUTPUT): [2]

{
  echo "MY_MULTILINE<<EOF"
  printf '%s\n' "$VALUE"
  echo "EOF"
} >> "$GITHUB_ENV"

Quote the file variables and be strict about encoding/line endings. Use >> "$GITHUB_ENV" / >> "$GITHUB_OUTPUT" (note the quotes). GitHub documents UTF-8 requirements for environment files (notably for some shells on Windows). [2]

Treat anything that can be influenced by untrusted input as dangerous: sanitize to prevent env-var injection. Newlines or delimiter tricks can inject extra variables when writing to GITHUB_ENV; validate/sanitize or remove newlines for single-line vars. [4]

Sources: GitHub Docs on variables (env:) [1], workflow commands & environment files (GITHUB_ENV, heredoc, encoding) [2], GitHub changelog guidance to move off set-output to environment files [3], CodeQL guidance on GITHUB_ENV injection risks & sanitization [4].

Fix unsafe single-quoted assignment of multi-line bot comments.

Line 340 assigns COMMENT_BODY using single quotes around an inlined ${{ }} expression. The expression is substituted before bash runs, so an apostrophe in the bot comment (very likely in natural language like "don't" or "isn't") closes the string early; bash then hits a syntax error or treats the rest of the comment as shell input, and the step fails. Use the env: block pattern and printf to preserve content safely. Also, consider failing the step when similarity isn't found so the E2E check actually gates regressions.

🛠️ Suggested fix

       - name: Verify similarity detection
         if: steps.poll.outputs.bot_commented == 'true'
+        env:
+          COMMENT_BODY: ${{ steps.poll.outputs.comment_body }}
         run: |
-          COMMENT_BODY='${{ steps.poll.outputs.comment_body }}'
-
           echo "::group::Checking similarity detection"
           # The bot should have found the similar seed issue about login crashes
           # Check if the comment references issue #1 or contains similarity language
-          if echo "$COMMENT_BODY" | grep -qiE "(similar|duplicate|related|#1|crash.*login|login.*crash)"; then
+          if printf '%s' "$COMMENT_BODY" | grep -qiE "(similar|duplicate|related|#1|crash.*login|login.*crash)"; then
             echo "✅ Bot detected similarity correctly!"
           else
             echo "⚠️ Bot responded but similarity reference not found in comment."
             echo "This may indicate the similarity threshold needs tuning."
             echo "Comment body was: $COMMENT_BODY"
+            exit 1
           fi
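The check itself can be isolated into a small function that reads the body only from the environment. This is a sketch under the assumption that the step exports `COMMENT_BODY` through `env:` as suggested above; the function name is illustrative.

```bash
#!/usr/bin/env bash

# Return 0 if the comment body mentions a similar or duplicate issue.
# The body is read from the COMMENT_BODY environment variable, so quotes,
# apostrophes and newlines in it are never parsed as shell syntax.
comment_mentions_similarity() {
  printf '%s' "${COMMENT_BODY:-}" |
    grep -qiE '(similar|duplicate|related|#1|crash.*login|login.*crash)'
}

# Possible use inside the step:
#   comment_mentions_similarity || { echo "::error::No similarity reference"; exit 1; }
```

Note that the `#1` alternative also matches references such as `#12`, so the pattern is a loose heuristic and not an exact issue match.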

coderabbitai

Actionable comments posted: 2

🧹 Nitpick comments (2)

.github/workflows/e2e-test.yml (2)

274-276: Consider polling instead of fixed sleep for index settlement.

A 15-second fixed delay may be fragile if Qdrant indexing or the embedding API experiences latency. Consider polling the Qdrant collection to verify the expected document count (5 issues) before proceeding.

💡 Suggested improvement

       - name: Wait for index to settle
         run: |
-          sleep 15
+          echo "Waiting for Qdrant collection to have 5 documents..."
+          MAX_ATTEMPTS=10
+          for i in $(seq 1 $MAX_ATTEMPTS); do
+            COUNT=$(curl -s "${{ secrets.QDRANT_URL }}/collections/${{ env.TEST_COLLECTION }}" \
+              -H "api-key: ${{ secrets.QDRANT_API_KEY }}" | jq -r '.result.points_count // 0')
+            echo "  Attempt $i: $COUNT documents indexed"
+            if [ "$COUNT" -ge 5 ]; then
+              echo "✅ Indexing complete"
+              break
+            fi
+            sleep 5
+          done

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In @.github/workflows/e2e-test.yml around lines 274 - 276, Replace the fixed
"Wait for index to settle" sleep step with a polling loop that queries the
Qdrant collection until the expected document count (5 issues) is observed or a
timeout is reached; specifically, implement a step named "Wait for index to
settle" that repeatedly calls the Qdrant collection stats/points API (using your
QDRANT_URL or QDRANT_HOST + QDRANT_COLLECTION env vars) and checks the returned
point/document count, sleeping briefly between attempts and failing the job if
the count doesn't reach 5 within a sensible timeout (e.g., 120s).
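A polling loop that also fails when the target is never reached, as the prompt above asks, might look like the following sketch. The function name `wait_for_count` and its argument order are assumptions for illustration; the command passed in is expected to print the current point count, for example a curl call against the Qdrant collection endpoint filtered with jq as in the suggestion above.

```bash
#!/usr/bin/env bash

# Poll a command until it prints a number >= the expected count.
# Arguments: expected count, max attempts, delay in seconds, then the command.
# Returns 0 on success and 1 if the count is never reached.
wait_for_count() {
  local expected="$1" attempts="$2" delay="$3"
  shift 3
  local i count=0
  for ((i = 1; i <= attempts; i++)); do
    # Treat a failed command or non-numeric output as "nothing indexed yet".
    count="$("$@" 2>/dev/null || true)"
    [[ "$count" =~ ^[0-9]+$ ]] || count=0
    echo "  Attempt $i/$attempts: $count documents indexed"
    if [ "$count" -ge "$expected" ]; then
      return 0
    fi
    if [ "$i" -lt "$attempts" ]; then
      sleep "$delay"
    fi
  done
  echo "::error::Expected $expected documents, last count was $count"
  return 1
}
```

With 24 attempts and a 5-second delay this gives roughly the 120-second budget mentioned in the prompt, and the non-zero return makes the step fail instead of continuing against a half-built index.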

329-331: Heredoc delimiter collision risk is low but present.

The COMMENT_EOF delimiter could theoretically appear in a bot comment body, causing output corruption. For E2E tests this risk is acceptable, but using a randomized delimiter would be more robust.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In @.github/workflows/e2e-test.yml around lines 329 - 331, The heredoc delimiter
COMMENT_EOF used when writing to $GITHUB_OUTPUT can collide with content in
COMMENT_BODY; update the workflow to generate and use a randomized/unique
delimiter (e.g., derive a token from a random value or $GITHUB_RUN_ID) instead
of the fixed COMMENT_EOF, and use that token consistently in the three lines
that reference COMMENT_EOF and COMMENT_BODY so the here-doc cannot be
prematurely terminated; ensure the generated delimiter variable is interpolated
when echoing the start and end markers and when appending COMMENT_BODY to
$GITHUB_OUTPUT.
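One way to generate the delimiter at run time is sketched below. The helper name `write_multiline_output` is hypothetical; it relies only on the `name<<DELIMITER` multiline syntax that the workflow already uses for `$GITHUB_OUTPUT`.

```bash
#!/usr/bin/env bash

# Append a multiline value to an Actions-style output file using the
# name<<DELIMITER ... DELIMITER syntax, with a delimiter generated at run time.
# Arguments: output name, value, output file (defaults to $GITHUB_OUTPUT).
write_multiline_output() {
  local name="$1" value="$2" file="${3:-${GITHUB_OUTPUT:-}}"
  local delim
  [ -n "$file" ] || return 1
  # Regenerate until the delimiter does not occur anywhere in the value.
  while :; do
    delim="EOF_${RANDOM}_${RANDOM}_${RANDOM}"
    case "$value" in
      *"$delim"*) continue ;;
      *) break ;;
    esac
  done
  {
    printf '%s<<%s\n' "$name" "$delim"
    printf '%s\n' "$value"
    printf '%s\n' "$delim"
  } >> "$file"
}
```

Checking the value against the generated delimiter before writing means a comment body that happens to contain COMMENT_EOF, or any other fixed marker, cannot end the block early.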

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In @.github/workflows/e2e-test.yml:
- Around line 463-476: The summary step ("Test summary") currently hardcodes ✅
for "Test Repo Created" and "Cleanup"; change it to check the actual step
outcomes instead of fixed icons by referencing the producing step IDs (e.g., use
steps.create_repo.outcome for the repository creation result and
steps.cleanup.outcome for cleanup) and conditionally echo ✅ or ❌ just like the
existing check that uses steps.poll.outputs.bot_commented; update the summary
logic to use if [ "${{ steps.<id>.outcome }}" = "success" ] ... else ... fi for
each status row so the table reflects real outcomes.
- Around line 41-52: The "Validate Secrets" step currently only checks BOT_PAT
and BOT_USERNAME; add similar presence checks for QDRANT_URL, QDRANT_API_KEY,
and GEMINI_API_KEY so the workflow fails fast with clear messages. Update the
step (Validate Secrets) to test each of the five secrets (BOT_PAT, BOT_USERNAME,
QDRANT_URL, QDRANT_API_KEY, GEMINI_API_KEY) with the same pattern used for the
existing checks: if a secret is empty echo a clear "::error::" message naming
the missing secret and exit 1, otherwise keep the final success echo. Ensure the
error messages match the style used for BOT_PAT/BOT_USERNAME so failures are
informative.

coderabbitai · 2026-02-18T06:25:59Z

.github/workflows/e2e-test.yml

⚠️ Potential issue | 🟡 Minor

Incomplete secret validation — missing checks for Qdrant and Gemini secrets.

The header (lines 11-13) documents five required secrets, but only BOT_PAT and BOT_USERNAME are validated. The workflow will fail later with less helpful errors if QDRANT_URL, QDRANT_API_KEY, or GEMINI_API_KEY are missing.

✅ Suggested fix

 if [ -z "${{ secrets.BOT_USERNAME }}" ]; then
   echo "::error::BOT_USERNAME secret is missing! Please add it to Settings -> Secrets and variables -> Actions."
   exit 1
 fi
+if [ -z "${{ secrets.QDRANT_URL }}" ]; then
+  echo "::error::QDRANT_URL secret is missing! Please add it to Settings -> Secrets and variables -> Actions."
+  exit 1
+fi
+if [ -z "${{ secrets.QDRANT_API_KEY }}" ]; then
+  echo "::error::QDRANT_API_KEY secret is missing! Please add it to Settings -> Secrets and variables -> Actions."
+  exit 1
+fi
+if [ -z "${{ secrets.GEMINI_API_KEY }}" ]; then
+  echo "::error::GEMINI_API_KEY secret is missing! Please add it to Settings -> Secrets and variables -> Actions."
+  exit 1
+fi
 echo "✅ Required secrets are present."

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In @.github/workflows/e2e-test.yml around lines 41 - 52, The "Validate Secrets" step currently only checks BOT_PAT and BOT_USERNAME; add similar presence checks for QDRANT_URL, QDRANT_API_KEY, and GEMINI_API_KEY so the workflow fails fast with clear messages. Update the step (Validate Secrets) to test each of the five secrets (BOT_PAT, BOT_USERNAME, QDRANT_URL, QDRANT_API_KEY, GEMINI_API_KEY) with the same pattern used for the existing checks: if a secret is empty echo a clear "::error::" message naming the missing secret and exit 1, otherwise keep the final success echo. Ensure the error messages match the style used for BOT_PAT/BOT_USERNAME so failures are informative.
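The five checks follow one pattern, so they could also be written as a loop. The sketch below is one possible shape, not the PR's code. It assumes the secrets are exposed to the step as environment variables of the same names through an `env:` block (the suggested fix interpolates `${{ secrets.* }}` directly instead), and the function name `validate_required_secrets` is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: fail if any named environment variable is empty or unset.
# Unlike the one-check-per-secret version, it reports every missing name first.
validate_required_secrets() {
  local missing=0 name value
  for name in "$@"; do
    # printenv prints nothing and fails when the variable is not exported.
    value="$(printenv "$name" || true)"
    if [ -z "$value" ]; then
      echo "::error::${name} secret is missing! Please add it to Settings -> Secrets and variables -> Actions."
      missing=1
    fi
  done
  return "$missing"
}
```

A step would call it as `validate_required_secrets BOT_PAT BOT_USERNAME QDRANT_URL QDRANT_API_KEY GEMINI_API_KEY || exit 1` and then print the success message.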

coderabbitai · 2026-02-18T06:25:59Z

.github/workflows/e2e-test.yml

⚠️ Potential issue | 🟡 Minor

Summary table shows hardcoded success statuses regardless of actual outcomes.

Lines 470 and 476 always display ✅ for "Test Repo Created" and "Cleanup" even if those steps failed. This makes the summary misleading on failure runs. Consider using step outcome checks or capturing status in outputs.

💡 Suggested improvement pattern

- name: Create test repository
  id: create_repo
  # ... existing code ...

# In summary step:
- name: Test summary
  if: always()
  run: |
    echo "## E2E Test Summary" >> $GITHUB_STEP_SUMMARY
    echo "" >> $GITHUB_STEP_SUMMARY
    echo "| Check | Result |" >> $GITHUB_STEP_SUMMARY
    echo "|-------|--------|" >> $GITHUB_STEP_SUMMARY
    if [ "${{ steps.create_repo.outcome }}" = "success" ]; then
      echo "| Test Repo Created | ✅ |" >> $GITHUB_STEP_SUMMARY
    else
      echo "| Test Repo Created | ❌ |" >> $GITHUB_STEP_SUMMARY
    fi
    # ... etc for other steps

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In @.github/workflows/e2e-test.yml around lines 463 - 476, The summary step ("Test summary") currently hardcodes ✅ for "Test Repo Created" and "Cleanup"; change it to check the actual step outcomes instead of fixed icons by referencing the producing step IDs (e.g., use steps.create_repo.outcome for the repository creation result and steps.cleanup.outcome for cleanup) and conditionally echo ✅ or ❌ just like the existing check that uses steps.poll.outputs.bot_commented; update the summary logic to use if [ "${{ steps.<id>.outcome }}" = "success" ] ... else ... fi for each status row so the table reflects real outcomes.
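To avoid repeating the if/else for every row, the outcome-to-icon mapping could be factored into small shell functions. This is a sketch under assumptions: `outcome_icon` and `summary_row` are hypothetical names, and the step ids `create_repo` and `cleanup` are the ones proposed in the review, which may not match the workflow.

```shell
#!/usr/bin/env bash
# Hypothetical helper: map a step outcome string to a table icon.
# Anything other than "success" (failure, cancelled, skipped) shows a cross.
outcome_icon() {
  if [ "$1" = "success" ]; then
    printf '✅'
  else
    printf '❌'
  fi
}

# Hypothetical helper: append one Markdown table row to the given summary file.
summary_row() {
  local label="$1" outcome="$2" file="$3"
  printf '| %s | %s |\n' "$label" "$(outcome_icon "$outcome")" >> "$file"
}
```

In the summary step this would be used as `summary_row "Test Repo Created" "${{ steps.create_repo.outcome }}" "$GITHUB_STEP_SUMMARY"`, with a similar call for the cleanup row.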

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In @.github/workflows/e2e-test.yml:
- Around line 26-30: Update the workflow env values for TEST_REPO_NAME and
TEST_COLLECTION to incorporate github.run_attempt so resource names are unique
across reruns; specifically, change the current names that only use
github.run_id to append the run attempt identifier for both TEST_REPO_NAME and
TEST_COLLECTION to prevent collisions when cleanup from a previous run failed.

---

Duplicate comments:
In @.github/workflows/e2e-test.yml:
- Around line 63-66: The workflow currently creates the test repository as
public; update the gh repo create invocation (the command using gh repo create
"${{ secrets.BOT_USERNAME }}/${{ env.TEST_REPO_NAME }}") to create a private
repo by replacing the --public flag with --private (leave the description and
other args intact) so the test repo does not expose secret-backed workflows.
- Around line 467-480: The "Test summary" step currently hardcodes success for
repo creation and cleanup; change it to read and display the actual step
outcomes (e.g., use steps.create_repo.outcome and steps.cleanup.outcome) when
writing to $GITHUB_STEP_SUMMARY, and keep the existing conditional for the bot
response (steps.poll.outputs.bot_commented). Update the lines that echo "| Test
Repo Created | ✅ |" and "| Cleanup | ✅ |" to evaluate the corresponding step
outcome variables and print a check or cross accordingly so the summary reflects
real results.
- Around line 356-371: The step "Verify similarity detection" assigns
COMMENT_BODY using single quotes which breaks on apostrophes and doesn't fail
the job when similarity isn't found; change the assignment to use double quotes
(COMMENT_BODY="${{ steps.poll.outputs.comment_body }}") or a safe heredoc so
apostrophes are preserved, then update the else branch of the similarity check
to exit non‑zero (e.g., call exit 1 after echoing the warning) so the workflow
fails when no similarity is detected.
- Around line 41-51: The Validate Secrets step currently only checks BOT_PAT and
BOT_USERNAME; extend it to also validate QDRANT_URL, QDRANT_API_KEY, and
GEMINI_API_KEY so failures happen early. Update the "Validate Secrets" run block
to test each of those env secrets (QDRANT_URL, QDRANT_API_KEY, GEMINI_API_KEY)
with the same pattern used for BOT_PAT/BOT_USERNAME and emit an ::error::
message and exit 1 if any are missing, ensuring the workflow fails fast during
the Validate Secrets step.
- Around line 157-160: Replace the hardcoded branch in the push command so new
repos aren't pinned to "main"; change the git push invocation "git push origin
HEAD:main" to push the current HEAD (e.g. "git push origin HEAD") or detect the
repo's default branch first (e.g. determine branch via "git rev-parse
--abbrev-ref HEAD" or GitHub API) and use that variable when calling git push;
update the workflow step that runs "git add -A" / "git commit -m" / "git push
origin HEAD:main" to use the HEAD-only push or a resolved default-branch
variable instead.
- Around line 18-35: The e2e-test job runs on forked PRs without secrets
(BOT_PAT) and fails; add a fork guard to the job by adding a conditional like
if: github.event.pull_request == null ||
github.event.pull_request.head.repo.fork == false so the e2e-test job only runs
when not from a fork (or when not a pull_request event); update the job block
for e2e-test (job name "e2e-test") to include that if condition so privileged
steps using BOT_PAT are skipped on forked PRs.

coderabbitai · 2026-02-18T06:38:33Z

.github/workflows/e2e-test.yml

+env:
+  # Unique names per workflow run to avoid collisions
+  TEST_REPO_NAME: "simili-e2e-${{ github.run_id }}"
+  TEST_COLLECTION: "simili-e2e-test-${{ github.run_id }}"
+


⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🌐 Web query:

GitHub Actions context github.run_id github.run_attempt reruns

💡 Result:

github.run_id: Unique identifier for a workflow run within a repository, and it does not change when you re-run that run. Use it to refer to the same run across all attempts (e.g., REST API endpoints like /actions/runs/{run_id} and rerun endpoints use this ID). [1][2]

github.run_attempt: The attempt number for that workflow run. It starts at 1 for the first execution and increments by 1 on each re-run of the same run_id. [2]

Reruns and “attempts”: A rerun creates a new attempt of the same run, so you’ll see the same run_id with a higher run_attempt (and in the UI you can switch between attempts). [2]

API note: Some REST endpoints take both run_id and an attempt number (e.g., download logs for a specific attempt uses /actions/runs/{run_id}/attempts/{attempt_number}/logs). [1]

Example pattern to avoid infinite reruns:

- name: Auto-rerun (cap at 3 attempts)
  if: failure() && fromJSON(github.run_attempt) < 3
  run: gh run rerun ${{ github.run_id }} --failed

(Logic relies on run_attempt increasing per rerun while run_id stays constant.) [2][1]

Sources: GitHub context properties (run_id, run_attempt) [2]; rerun behavior/UI attempts [2]; REST API attempt endpoints [1].

Avoid rerun collisions by including github.run_attempt in resource names.

github.run_id remains constant across reruns, so if cleanup fails the repo and collection names will collide on rerun, causing create/index steps to fail. Add github.run_attempt to ensure each attempt uses unique resource names.

Suggested fix

 env:
-  TEST_REPO_NAME: "simili-e2e-${{ github.run_id }}"
-  TEST_COLLECTION: "simili-e2e-test-${{ github.run_id }}"
+  TEST_REPO_NAME: "simili-e2e-${{ github.run_id }}-${{ github.run_attempt }}"
+  TEST_COLLECTION: "simili-e2e-test-${{ github.run_id }}-${{ github.run_attempt }}"

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In @.github/workflows/e2e-test.yml around lines 26 - 30, Update the workflow env values for TEST_REPO_NAME and TEST_COLLECTION to incorporate github.run_attempt so resource names are unique across reruns; specifically, change the current names that only use github.run_id to append the run attempt identifier for both TEST_REPO_NAME and TEST_COLLECTION to prevent collisions when cleanup from a previous run failed.
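The same naming scheme can be expressed in shell with the runner's default environment variables `GITHUB_RUN_ID` and `GITHUB_RUN_ATTEMPT`. The helper name `unique_suffix` is hypothetical, and the fallback to attempt `1` is an assumption added for running the snippet outside Actions.

```shell
#!/usr/bin/env bash
# Hypothetical helper: build a suffix that differs between reruns of one run.
unique_suffix() {
  # GITHUB_RUN_ID stays constant across reruns; GITHUB_RUN_ATTEMPT increments.
  printf '%s-%s' "${GITHUB_RUN_ID:?GITHUB_RUN_ID is not set}" "${GITHUB_RUN_ATTEMPT:-1}"
}
```

With this, `simili-e2e-$(unique_suffix)` names a fresh repository on every attempt, so a failed cleanup from attempt 1 cannot collide with attempt 2.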

coderabbitai

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In @.github/workflows/e2e-test.yml:
- Around line 73-88: The workflow step "Seed test repository with config files"
currently enables shell xtrace with set -x before running secret-bearing
commands (gh auth login --with-token and gh auth setup-git); remove or move set
-x so that xtrace is disabled while those commands run (for example remove the
set -x or place it after gh auth setup-git), then re-enable xtrace afterward
only if needed; update the step that runs gh auth login --with-token and gh auth
setup-git to execute without xtrace enabled to avoid printing secrets to logs.
- Around line 235-239: The workflow step named "Set up Go" that uses
actions/setup-go@v5 currently sets go-version: '1.23' which mismatches go.mod;
update the go-version key value to '1.24.0' for this actions/setup-go@v5 step
and make the same change in the other workflow files that pin Go (so all
workflows match go 1.24.0).

---

Duplicate comments:
In @.github/workflows/e2e-test.yml:
- Around line 26-30: The TEST_REPO_NAME and TEST_COLLECTION env values currently
only use github.run_id which can collide on reruns; update their definitions to
append github.run_attempt so names become unique across reruns (modify the
TEST_REPO_NAME and TEST_COLLECTION environment variables to include both
github.run_id and github.run_attempt).
- Around line 63-66: The GH CLI repo creation command currently passes --public;
change it to --private so the temporary test repo is created as private (modify
the gh repo create invocation that includes --public to use --private instead),
ensuring workflows triggered cannot access secrets from external forks; update
any related description or flags if needed to reflect the private setting.
- Around line 41-51: The "Validate Secrets" job step currently only checks
BOT_PAT and BOT_USERNAME; extend this step to also validate all other required
CI secrets (e.g., QDRANT_API_KEY, QDRANT_URL, GEMINI_API_KEY) by adding similar
if [ -z "${{ secrets.<NAME> }}" ] checks that echo a clear "::error::" message
and exit 1 if missing, so the workflow fails early with explicit messages for
QDRANT_API_KEY, QDRANT_URL and GEMINI_API_KEY in the same "Validate Secrets"
step.
- Around line 18-35: The E2E job "e2e-test" runs on pull_request but uses
secrets and should be gated for forked PRs; update the job to include a
job-level condition such as if: github.event.pull_request == null ||
github.event.pull_request.head.repo.fork == false (or equivalently check
github.event_name and fork flag) so the job is skipped for forked repos, or
alternatively move any steps that require secrets out into a separate trusted
workflow; make this change on the "e2e-test" job definition to prevent runs from
untrusted forked PRs.
- Around line 361-375: The workflow step "Verify similarity detection" uses a
single-quoted shell assignment for COMMENT_BODY which breaks on apostrophes and
can hide failures; change it to export COMMENT_BODY from the job's env using
steps.poll.outputs.comment_body and populate it in the shell with printf (or
printf '%s' "$COMMENT_BODY") to safely handle quotes, then make the similarity
check (grep -qiE "(similar|duplicate|related|#1|crash.*login|login.*crash)")
cause the step to fail on no-match by exiting non-zero (e.g., echo the warning
and run exit 1) so missing similarity is treated as a failing test; update the
step name "Verify similarity detection" and references to COMMENT_BODY /
steps.poll.outputs.comment_body accordingly.
- Around line 162-166: Replace the hardcoded push target "git push origin
HEAD:main" with a push of the current branch so it works regardless of default
branch; specifically update the commit/push step that currently runs "git add
-A" / "git commit -m ..." to use "git push --set-upstream origin HEAD" (or "git
push origin HEAD") instead of "git push origin HEAD:main" so the current HEAD is
pushed to its remote branch.
- Around line 472-485: The summary currently always prints success for repo
creation and cleanup; change the two hardcoded echo lines to conditionally emit
✅/❌ by checking the corresponding step outcomes (e.g., use if [ "${{
steps.create_repo.outcome }}" = "success" ]; then echo "| Test Repo Created | ✅
|" ... else echo "| Test Repo Created | ❌ |" fi and similarly for the cleanup
step using its step id like steps.cleanup.outcome). Update the script in this
step to reference the actual step ids (e.g., create_repo and cleanup) so the
summary reflects real outcomes instead of always showing ✅.
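For the "Verify similarity detection" item above, a quoting-safe check might look like the sketch below. It assumes the step receives the comment through an `env:` entry such as `COMMENT_BODY: ${{ steps.poll.outputs.comment_body }}`, so the text never passes through shell quoting. The function name `comment_reports_similarity` is hypothetical; the pattern is the one quoted in the review.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed only if the comment text contains a similarity
# signal. The text is fed through a here-document, so apostrophes, ">" and
# newlines in the comment are treated as data and never re-parsed by the shell.
comment_reports_similarity() {
  grep -qiE '(similar|duplicate|related|#1|crash.*login|login.*crash)' <<EOF
$1
EOF
}
```

The step body would then be `comment_reports_similarity "$COMMENT_BODY" || { echo "::error::No similarity detected in bot comment"; exit 1; }`, which turns a missing match into a failed job instead of a warning. The error wording here is illustrative.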

coderabbitai · 2026-02-18T08:30:37Z

.github/workflows/e2e-test.yml

⚠️ Potential issue | 🟠 Major

Avoid set -x while handling secrets.

Line 77 enables xtrace just before secret-bearing commands (Line 82). Even with masking, this increases the risk of accidental disclosure in logs. Prefer disabling xtrace or re-enabling it only after auth.

🔐 Suggested fix

-set -x
+set -euo pipefail

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In @.github/workflows/e2e-test.yml around lines 73 - 88, The workflow step "Seed test repository with config files" currently enables shell xtrace with set -x before running secret-bearing commands (gh auth login --with-token and gh auth setup-git); remove or move set -x so that xtrace is disabled while those commands run (for example remove the set -x or place it after gh auth setup-git), then re-enable xtrace afterward only if needed; update the step that runs gh auth login --with-token and gh auth setup-git to execute without xtrace enabled to avoid printing secrets to logs.
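If xtrace is still wanted for the rest of the step, one option is to switch it off only around the secret-bearing command. The sketch below is an assumption-laden illustration: `pipe_secret_quietly` is a hypothetical helper, and it takes the name of the environment variable rather than its value so that even the traced call line contains no secret.

```shell
#!/usr/bin/env bash
# Hypothetical helper: feed the value of the environment variable named by $1
# to a command on stdin, with xtrace disabled while the value is handled.
pipe_secret_quietly() {
  local var_name="$1" had_xtrace=0 rc=0
  shift
  # Remember whether xtrace was on so it can be restored afterwards.
  case "$-" in *x*) had_xtrace=1 ;; esac
  set +x
  printenv "$var_name" | "$@" || rc=$?
  if [ "$had_xtrace" -eq 1 ]; then set -x; fi
  return "$rc"
}
```

With `BOT_PAT` provided through the step's `env:` block, the login becomes `pipe_secret_quietly BOT_PAT gh auth login --with-token`, followed by `gh auth setup-git`.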

coderabbitai · 2026-02-18T08:30:37Z

.github/workflows/e2e-test.yml

+      - name: Set up Go
+        uses: actions/setup-go@v5
+        with:
+          go-version: '1.23'
+


⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Show go.mod toolchain/go version directives (if present)
fd -a 'go.mod' -x sed -n '1,25p' {}
# Show other workflow Go versions for consistency
rg -n "go-version" .github/workflows

Repository: similigh/simili-bot

Length of output: 1128

Update Go version in workflows to match go.mod requirement.

The go.mod specifies go 1.24.0, but all workflows including this one pin go-version: '1.23'. This mismatch across .github/workflows/ci.yml, .github/workflows/release.yml, and .github/workflows/e2e-test.yml can cause build inconsistencies or failures. Update all workflow versions to 1.24.0 to align with the module dependency.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In @.github/workflows/e2e-test.yml around lines 235 - 239, The workflow step named "Set up Go" that uses actions/setup-go@v5 currently sets go-version: '1.23' which mismatches go.mod; update the go-version key value to '1.24.0' for this actions/setup-go@v5 step and make the same change in the other workflow files that pin Go (so all workflows match go 1.24.0).
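A small guard can catch this kind of drift by reading the version straight from `go.mod`. This is a sketch; `go_mod_version` is a hypothetical helper name, and it only looks at the `go` directive, not at any `toolchain` line.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the version from the `go` directive of a go.mod file.
go_mod_version() {
  # Match a line of exactly two fields whose first field is "go", e.g. "go 1.24.0".
  awk '$1 == "go" && NF == 2 { print $2; exit }' "$1"
}
```

A CI step could compare `$(go_mod_version go.mod)` against the pinned `go-version` and fail on mismatch. Alternatively, `actions/setup-go` accepts a `go-version-file` input, which would remove the duplicated version string; whether that fits this repository's workflows has not been verified here.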

coderabbitai

🧹 Nitpick comments (1)

.github/workflows/e2e-test.yml (1)

284-285: Replace fixed sleep with readiness polling to reduce flakiness.

Line 284–285 uses a hardcoded 15s wait; indexing can be slower (or faster). Consider polling Qdrant/CLI for readiness with a timeout.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In @.github/workflows/e2e-test.yml around lines 284 - 285, The "Wait for index
to settle" step currently uses a fixed "sleep 15" which causes flakiness;
replace that step's "run: sleep 15" with a readiness-poll loop that queries the
service (e.g., Qdrant health/collections endpoint or the CLI) until the expected
collection/index reports loaded or a ready status, using retries with a short
interval and an overall timeout (fail the step if the timeout is reached);
ensure the step prints diagnostic output each attempt so CI logs show progress
and use an explicit non-zero exit on timeout instead of a fixed sleep.
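A readiness poll along these lines could replace the fixed sleep. The sketch makes several assumptions: `wait_until` and `index_has_points` are hypothetical helpers, `curl` and `jq` are available on the runner, the step has `QDRANT_URL`, `QDRANT_API_KEY`, and `TEST_COLLECTION` in its environment, and the check uses Qdrant's points count endpoint with an `api-key` header.

```shell
#!/usr/bin/env bash
# Hypothetical helper: retry a command until it succeeds or the timeout passes.
# Usage: wait_until TIMEOUT_SECONDS INTERVAL_SECONDS command [args...]
wait_until() {
  local timeout="$1" interval="$2" elapsed=0 attempt=1
  shift 2
  while :; do
    if "$@"; then
      echo "ready after ${elapsed}s (attempt ${attempt})"
      return 0
    fi
    if [ "$elapsed" -ge "$timeout" ]; then
      echo "::error::timed out after ${timeout}s waiting for: $*" >&2
      return 1
    fi
    # Diagnostic line per attempt so the CI log shows progress.
    echo "attempt ${attempt}: not ready, retrying in ${interval}s"
    sleep "$interval"
    elapsed=$((elapsed + interval))
    attempt=$((attempt + 1))
  done
}

# Assumed readiness check: the collection reports at least $1 points.
index_has_points() {
  local expected="$1" count
  count="$(curl -fsS -X POST \
    -H "api-key: ${QDRANT_API_KEY}" \
    -H 'Content-Type: application/json' \
    -d '{"exact": true}' \
    "${QDRANT_URL}/collections/${TEST_COLLECTION}/points/count" \
    | jq -r '.result.count')" || return 1
  [ "${count:-0}" -ge "$expected" ]
}
```

The step would then run `wait_until 120 5 index_has_points 5 || exit 1`, matching the five seed issues and the 120s timeout suggested in the earlier nitpick.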

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Duplicate comments:
In @.github/workflows/e2e-test.yml:
- Around line 26-29: Update the environment variable values for TEST_REPO_NAME
and TEST_COLLECTION to include github.run_attempt so resource names are unique
across reruns; specifically, append ${{ github.run_attempt }} to the existing
interpolations for TEST_REPO_NAME and TEST_COLLECTION (the env entries) to
prevent collisions when cleanup fails.
- Around line 362-376: The workflow currently assigns COMMENT_BODY using single
quotes which breaks on apostrophes and multiline content and does not fail the
job when similarity isn’t found; change the job to read the comment into an
environment variable from steps.poll.outputs.comment_body (use env: to set
COMMENT_BODY) and use printf to safely expand multiline/quoted content, then
keep the grep -qiE check on COMMENT_BODY and make the else branch call exit 1 so
the step/job fails when no similarity is detected; update references to
COMMENT_BODY and the existing if/grep block to ensure safe handling and proper
failure signaling.
- Around line 63-66: The workflow currently creates the test repo with gh repo
create using the --public flag; change that flag to --private so the repository
is created as private and prevents public issue/issue_comment triggers from
exposing secrets — update the gh repo create invocation (the command that
includes --public, the env.TEST_REPO_NAME and secrets.BOT_USERNAME usage) to use
--private instead.
- Around line 236-239: The workflow pins actions/setup-go@v5 with go-version:
'1.23' which may not match the module's go directive in go.mod; open the
workflow step using actions/setup-go and replace the hardcoded go-version value
with the exact version found in go.mod's "go" directive (or parameterize it via
a workflow input or matrix) so they match, then run the provided verification
script to confirm the go.mod version and the workflow go-version are identical.
- Around line 41-51: The "Validate Secrets" step currently only checks BOT_PAT
and BOT_USERNAME; extend it to also validate QDRANT_URL, QDRANT_API_KEY, and
GEMINI_API_KEY by adding the same pattern of checks used for BOT_* (if [ -z "${{
secrets.VAR }}" ]; then echo "::error::... missing" && exit 1; fi) so the job
fails early with clear error messages for each of QDRANT_URL, QDRANT_API_KEY,
and GEMINI_API_KEY.
- Around line 18-35: The e2e-test job runs for forked PRs which lack secrets;
add a job-level condition to only run for same-repo PRs by adding an if: to the
e2e-test job (reference job name e2e-test) such as if:
github.event.pull_request.head.repo.full_name == github.repository (or use if:
github.event.pull_request.head.repo.owner == github.repository_owner) so the job
is skipped for fork PRs that can't access secrets.
- Around line 73-84: The workflow currently enables shell xtrace with set -x in
the "Seed test repository with config files" step which can leak secrets during
gh auth login; change the script to not run secret-bearing commands under
xtrace—either remove the global set -x or disable it (set +x) immediately before
running echo "${{ secrets.BOT_PAT }}" | gh auth login --with-token and gh auth
setup-git, then re-enable xtrace afterward if needed; update the commands around
gh auth login --with-token and gh auth setup-git so no secret is printed to the
logs.
- Around line 473-486: The summary currently hardcodes success for "Test Repo
Created" and "Cleanup"; change those lines to inspect the actual step outcomes
(use GitHub Actions context like steps.<create_test_repo_step_id>.outcome or
.conclusion and steps.<cleanup_step_id>.outcome or .conclusion) and print ✅ when
the outcome is "success" and ❌ otherwise; keep the existing conditional for
bot_commented as-is. Locate the "Test summary" step and replace the hardcoded
echo lines for "Test Repo Created" and "Cleanup" with conditional bash blocks
that check steps.<create_test_repo_step_id>.outcome (or .conclusion) and
steps.<cleanup_step_id>.outcome (or .conclusion) and echo the corresponding line
to $GITHUB_STEP_SUMMARY.
- Around line 163-166: The workflow currently pushes specifically to main using
the git push origin HEAD:main command which assumes the default branch is main;
change that push to set the upstream to the current HEAD instead (e.g., git push
--set-upstream origin HEAD) so the seed commit is pushed to whatever default
branch the repo uses and avoids hardcoding "main"; update the line that
currently contains git push origin HEAD:main to use the upstream-aware push and
ensure upstream is set.

---

Nitpick comments:
In @.github/workflows/e2e-test.yml:
- Around line 284-285: The "Wait for index to settle" step currently uses a
fixed "sleep 15" which causes flakiness; replace that step's "run: sleep 15"
with a readiness-poll loop that queries the service (e.g., Qdrant
health/collections endpoint or the CLI) until the expected collection/index
reports loaded or a ready status, using retries with a short interval and an
overall timeout (fail the step if the timeout is reached); ensure the step
prints diagnostic output each attempt so CI logs show progress and use an
explicit non-zero exit on timeout instead of a fixed sleep.

github-actions · 2026-02-18T13:48:09Z

🧪 E2E Test

❌ Bot responded: no

Test repo → gh-simili-bot/simili-e2e-22142115468
Run → logs

Auto-generated by E2E pipeline

github-actions · 2026-02-18T13:55:45Z

🧪 E2E Test

❌ Bot responded: no

Test repo → gh-simili-bot/simili-e2e-22142370319
Run → logs

Auto-generated by E2E pipeline

github-actions · 2026-02-18T13:58:05Z

🧪 E2E Test

❌ Bot responded: no

Test repo → gh-simili-bot/simili-e2e-22142448599
Run → logs

Auto-generated by E2E pipeline

github-actions · 2026-02-18T14:00:21Z

🧪 E2E Test

✅ Bot responded: yes

Test repo → gh-simili-bot/simili-e2e-22142645143
Run → logs

Auto-generated by E2E pipeline

github-actions · 2026-02-18T14:07:37Z

🧪 E2E Test

✅ Bot responded: yes

Test repo → gh-simili-bot/simili-e2e-22142868184
Run → logs

Auto-generated by E2E pipeline

- Clean YAML with proper heredoc indentation
- Revert gatekeeper.go (no Go code changes needed)
- Test repo is public and kept for inspection
- Posts PR comment with test repo link
- Updates test repo README with results

Signed-off-by: Kavirubc <hapuarachchikaviru@gmail.com>

The gatekeeper blocks events from gh-simili-* users (including issues created by the bot). The SIMILI_E2E_TEST_USER env var bypass allows the E2E test to process the trigger issue created by the bot user. Signed-off-by: Kavirubc <hapuarachchikaviru@gmail.com>

secrets.BOT_USERNAME is masked as *** in shell, so username comparisons in the poll loop always failed. Switch all run: blocks to use env.BOT_NAME (fetched unmasked via gh api user). Signed-off-by: Kavirubc <hapuarachchikaviru@gmail.com>

- Switch bot username from GITHUB_ENV to GITHUB_OUTPUT (step id: create_repo); step outputs are not masked even when the value matches a registered secret
- Add --repo flag to gh pr comment to fix 'not a git repository' error
- Update all downstream steps to use steps.create_repo.outputs.bot_name

Signed-off-by: Kavirubc <hapuarachchikaviru@gmail.com>

GitHub masks any value matching a registered secret everywhere (env vars, step outputs, expressions). Since gh-simili-bot is not sensitive (visible in all URLs/logs), declare it as a plain workflow env var so it is never masked. Removes the dynamic gh api user call and all steps.create_repo output references. Signed-off-by: Kavirubc <hapuarachchikaviru@gmail.com>

Assigning multiline step output via BODY='...' fails when the body contains > characters (GitHub alert syntax). Pass via env var instead to avoid shell interpretation of special chars. Signed-off-by: Kavirubc <hapuarachchikaviru@gmail.com>

github-actions · 2026-02-18T14:16:31Z

🧪 E2E Test

✅ Bot responded: yes

Test repo → gh-simili-bot/simili-e2e-22143188396
Run → logs

Auto-generated by E2E pipeline

github-actions · 2026-02-18T14:17:46Z

🧪 E2E Test

✅ Bot responded: yes

Test repo → gh-simili-bot/simili-e2e-22143229951
Run → logs

Auto-generated by E2E pipeline

gh-simili-bot added enhancement New feature or request testing ci labels Feb 18, 2026

coderabbitai bot reviewed Feb 18, 2026

Kavirubc force-pushed the feat/e2e-test-pipeline branch from 0560285 to 5f41d3a Compare February 18, 2026 09:27

Kavirubc force-pushed the feat/e2e-test-pipeline branch from 1967eb4 to 770ca71 Compare February 18, 2026 14:12

Kavirubc added 13 commits February 18, 2026 19:43

ci: add E2E integration test pipeline

34f61c9

Signed-off-by: Kavirubc <hapuarachchikaviru@gmail.com>

ci: add secret validation to E2E test

5aa0e41

Signed-off-by: Kavirubc <hapuarachchikaviru@gmail.com>

ci: explicit gh auth login in E2E test

8336fb1

Signed-off-by: Kavirubc <hapuarachchikaviru@gmail.com>

ci: remove GH_TOKEN env to allow explicit login

0e81095

Signed-off-by: Kavirubc <hapuarachchikaviru@gmail.com>

ci: fix push refspec HEAD:main

dc2a816

Signed-off-by: Kavirubc <hapuarachchikaviru@gmail.com>

ci: add set -x to debug seed step failure

3cc14d1

Signed-off-by: Kavirubc <hapuarachchikaviru@gmail.com>

ci: use explicit git remote auth in E2E test

f302dad

Signed-off-by: Kavirubc <hapuarachchikaviru@gmail.com>

ci: use gh auth setup-git for cleaner auth

22b21ed

Signed-off-by: Kavirubc <hapuarachchikaviru@gmail.com>

ci: update bot email for E2E user

b3cc2f2

Signed-off-by: Kavirubc <hapuarachchikaviru@gmail.com>

ci: unset GH_TOKEN in seed step

545ceba

Signed-off-by: Kavirubc <hapuarachchikaviru@gmail.com>

ci: use private repo for E2E tests to avoid secret scanning

eb93118

Signed-off-by: Kavirubc <hapuarachchikaviru@gmail.com>

ci: fix gh api boolean input using jq

e1fdfef

Signed-off-by: Kavirubc <hapuarachchikaviru@gmail.com>

ci: remove unnecessary actions permission api call

fd0ff3a

Signed-off-by: Kavirubc <hapuarachchikaviru@gmail.com>

Kavirubc added 19 commits February 18, 2026 19:43

feat: allow E2E user bypass in gatekeeper for testing

ae02d4f

Signed-off-by: Kavirubc <hapuarachchikaviru@gmail.com>

ci: delay triage workflow installation until after seeding

8edb229

Signed-off-by: Kavirubc <hapuarachchikaviru@gmail.com>

ci: fix step split for delayed workflow installation

4167ef7

Signed-off-by: Kavirubc <hapuarachchikaviru@gmail.com>

ci: use BOT_PAT in triage workflow to ensure write access

468d482

Signed-off-by: Kavirubc <hapuarachchikaviru@gmail.com>

ci: rewrite E2E workflow — fix heredoc indentation, keep repo, add PR…

22c10d7

… comment Signed-off-by: Kavirubc <hapuarachchikaviru@gmail.com>

ci: secure E2E pipeline and use private test repositories

b6e7dc8

- Created private repositories for E2E tests
- Hardened secret handling in the workflow
- Restored event-aware gatekeeper bypass for E2E tests

Signed-off-by: Kavirubc <hapuarachchikaviru@gmail.com>

Test

394013b

Signed-off-by: Kavirubc <hapuarachchikaviru@gmail.com>

ci: unmask E2E bypass username and enable verbose logging

09039be

- Fetch bot username unmasked to ensure E2E bypass works
- Enable --verbose on bot run for better E2E debugging

Signed-off-by: Kavirubc <hapuarachchikaviru@gmail.com>

ci: fix invalid args key in generated triage.yml

088a408

args is not valid under a uses: step in GitHub Actions. Removed it — action.yml already handles CLI args internally. Signed-off-by: Kavirubc <hapuarachchikaviru@gmail.com>

ci: use github.token for comment posting in test repo

f9e9ee7

GH_PAT may not have issues:write on the dynamically created test repo. github.token inherits the workflow's declared permissions (issues: write) and always works. Signed-off-by: Kavirubc <hapuarachchikaviru@gmail.com>

ci: use default github.token for comment posting (no PAT needed)

7f32057

action.yml defaults github_token to github.token when not explicitly provided. The test repo workflow declares issues:write so the built-in token has the right scope. Signed-off-by: Kavirubc <hapuarachchikaviru@gmail.com>

ci: fix last BOT_NAME reference in Index seed issues step

624bca1

env.BOT_NAME was masked as *** causing the index command to run against an invalid repo path, resulting in 0 seed issues indexed and 0 similarity results in the E2E test. Signed-off-by: Kavirubc <hapuarachchikaviru@gmail.com>

ci: poll for github-actions[bot] comments, not gh-simili-bot

d4a0c36

When using github.token (the built-in Actions token), comments are posted under the github-actions[bot] identity, not the bot user's own account. Signed-off-by: Kavirubc <hapuarachchikaviru@gmail.com>

Kavirubc force-pushed the feat/e2e-test-pipeline branch from 770ca71 to 96ebc47 Compare February 18, 2026 14:13

gh-simili-bot mentioned this pull request Feb 18, 2026

[0.2.0][CI] Add E2E integration test pipeline #72

Closed

5 tasks

Kavirubc linked an issue Feb 18, 2026 that may be closed by this pull request

[0.2.0][CI] Add E2E integration test pipeline #72

Closed

5 tasks

Kavirubc merged commit efdc338 into main Feb 18, 2026
6 checks passed

Kavirubc deleted the feat/e2e-test-pipeline branch February 18, 2026 14:18

gh-simili-bot mentioned this pull request Feb 18, 2026

Bug: E2E fails on forked PRs due to missing secrets #77

Closed

coderabbitai bot mentioned this pull request Feb 18, 2026

fix: run E2E only when review is approved (secure for forks) #78

Merged

Trigger	Runs workflow from…	Code it typically tests	`GITHUB_TOKEN` / secrets on fork PRs	Risk level
`pull_request`	the PR’s merge ref (untrusted contribution context)	PR code	Read-only token, no repo secrets (by design, for fork safety)	Lower
`pull_request_target`	the base repo (trusted repo context)	Workflow is from base; you may still choose to checkout PR code	Has access to secrets and typically write-capable token, even when the PR comes from a fork	Higher / easy to misuse
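One common way to act on the comparison above is to keep the `pull_request` trigger and skip secret-dependent jobs when the PR comes from a fork. The sketch below is an illustration of that pattern only. It is not the change made in #78, and the job name, step contents, and use of the `GH_PAT` secret are assumptions.

```yaml
# Sketch only: job name and step contents are illustrative assumptions.
on:
  pull_request:

jobs:
  e2e:
    # Run only when the PR branch lives in this repository. Fork PRs get a
    # read-only token and no repository secrets, so the job would fail there.
    if: github.event.pull_request.head.repo.full_name == github.repository
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Run E2E tests
        env:
          GH_PAT: ${{ secrets.GH_PAT }}
        run: echo "E2E steps go here"
```

With this guard, fork PRs skip the job instead of failing on missing secrets, at the cost of not running E2E coverage for outside contributions.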

Conversation

Kavirubc commented Feb 18, 2026 • edited by coderabbitai bot

Summary by CodeRabbit

coderabbitai bot commented Feb 18, 2026 • edited

Reviews paused

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Poem

gh-simili-bot commented Feb 18, 2026

Simili Triage Report

Classification

coderabbitai bot left a comment

coderabbitai bot Feb 18, 2026

Checkout the fork PR head SHA (the contributor’s commit) in the upstream workflow

If you instead want the PR merge result (as-tested-for-merge)

Security note (important for forks)

Key difference (fork PR security)

Why pull_request_target is dangerous with forks

When to use which

Recent nuance (late 2025 change)

coderabbitai bot Feb 18, 2026

What’s the core security issue?

Practical guidance (secure patterns)

coderabbitai bot Feb 18, 2026

Best practices (GitHub Actions + Bash) for setting env vars in a step

github-actions bot commented Feb 18, 2026

🧪 E2E Test

github-actions bot commented Feb 18, 2026

🧪 E2E Test

github-actions bot commented Feb 18, 2026

🧪 E2E Test

github-actions bot commented Feb 18, 2026

Why `pull_request_target` is dangerous with forks