113 changes: 113 additions & 0 deletions mongodb-query-index-check/README.md
@@ -0,0 +1,113 @@
# `mongodb-query-index-check` GitHub Action

Reviews a pull request for **MongoDB queries that don't use an appropriate index**. For every changed or new MongoDB call (`.find`, `.findOne`, `.aggregate`, `.update*`, `.delete*`, `.findOneAnd*`, `.countDocuments`, `.distinct`, …) the action:

1. Cross-references the query's filter and sort fields against the canonical index definitions in [`@apify-packages/mongo-indexes`](https://github.com/apify/apify-core/tree/develop/src/packages/mongo-indexes/src) (sparse-fetched from `apify/apify-core@develop`, or read straight from the caller's workspace when the action runs on `apify-core` itself).
2. Invokes [`anthropics/claude-code-action`](https://github.com/anthropics/claude-code-action) (pinned to `claude-opus-4-7` in `action.yaml`) to apply an ESR-aware rubric (Equality → Sort → Range) and post inline review comments with severity tags (`🔴 critical`, `🟠 high`, `🟡 medium`, `🟢 low`).
3. Fails the check whenever a finding is reported (unless `request-changes: false`) — useful as a required check in branch protection.
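
As a rough illustration of the ESR ordering named in step 2, the sketch below orders the fields of a query so that equality fields come first, sort fields second, and range fields last. The `FieldUse` type, the `suggestIndexKeyOrder` helper, and the `runs` collection in the comment are hypothetical names for this example; the real rubric lives in `prompts/review.md` and is applied by Claude, not by code like this.

```typescript
// Hypothetical model of how a query uses each field.
type FieldUse = "equality" | "sort" | "range";

interface QueryShape {
  [field: string]: FieldUse;
}

// ESR guideline: equality keys first, then sort keys, then range keys.
const ESR_RANK: Record<FieldUse, number> = { equality: 0, sort: 1, range: 2 };

function suggestIndexKeyOrder(shape: QueryShape): string[] {
  return Object.entries(shape)
    // Array.prototype.sort is stable, so fields in the same group keep their order.
    .sort(([, a], [, b]) => ESR_RANK[a] - ESR_RANK[b])
    .map(([field]) => field);
}

// Example query (hypothetical collection and fields):
//   db.runs.find({ userId, startedAt: { $gte: since } }).sort({ status: 1 })
const order = suggestIndexKeyOrder({
  startedAt: "range",
  userId: "equality",
  status: "sort",
});
console.log(order); // ["userId", "status", "startedAt"]
```

An index on `{ userId: 1, status: 1, startedAt: 1 }` would follow that order, whereas one that leads with `startedAt` would not.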

The action runs a cheap pre-filter first: it lists the PR's files, glob-matches them against `paths`, and greps the changed hunks for MongoDB call patterns. Claude is invoked only when something relevant changed, so PRs that never touch MongoDB cost only the `pulls.listFiles` GitHub API calls.
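
A minimal sketch of the grep part of that pre-filter, assuming it only needs to look at added lines of each file's `patch` string. The `MONGO_CALL` pattern and the `touchesMongo` helper are assumptions for illustration; the actual pattern is defined in `index.mts` and may differ.

```typescript
// Assumed pattern; the real list of method names lives in index.mts.
const MONGO_CALL =
  /\.(find|findOne|aggregate|update\w*|delete\w*|findOneAnd\w+|countDocuments|distinct)\s*\(/;

// Return the added lines of a unified-diff patch, without the leading "+".
function addedLines(patch: string): string[] {
  return patch
    .split("\n")
    .filter((line) => line.startsWith("+") && !line.startsWith("+++"))
    .map((line) => line.slice(1));
}

// True when at least one added line looks like a MongoDB collection call.
function touchesMongo(patch: string | undefined): boolean {
  if (!patch) return false; // the API response may carry no patch (e.g. binary files)
  return addedLines(patch).some((line) => MONGO_CALL.test(line));
}

const patch = [
  "@@ -1,1 +1,2 @@",
  " const id = req.params.id;",
  "+const user = await users.findOne({ _id: id });",
].join("\n");
console.log(touchesMongo(patch)); // true
```

Because the pattern matches on the method name alone, a line such as `items.find((x) => x.ok)` also passes the filter; the Limitations section describes how that case is handled.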

## Usage

### `apify-core` (the action reads its own workspace)

```yaml
# .github/workflows/mongodb_query_index_check.yaml
name: MongoDB query index check

on:
  pull_request:
    types: [opened, reopened, synchronize, ready_for_review]

jobs:
  check:
    if: github.event.pull_request.draft == false
    runs-on: ubuntu-22.04-arm64
    permissions:
      contents: read
      pull-requests: write
      id-token: write
    steps:
      - uses: actions/checkout@v6
      - uses: apify/actions/mongodb-query-index-check@v1
        with:
          anthropic-api-key: ${{ secrets.ANTHROPIC_API_KEY }}
```

### `apify-proxy`, `apify-web`, … (the action fetches indexes from `apify-core`)

```yaml
# .github/workflows/mongodb_query_index_check.yaml
name: MongoDB query index check

on:
  pull_request:
    types: [opened, reopened, synchronize, ready_for_review]

jobs:
  check:
    if: github.event.pull_request.draft == false
    runs-on: ubuntu-22.04-arm64
    permissions:
      contents: read
      pull-requests: write
      id-token: write
    steps:
      - uses: actions/checkout@v6
      - uses: apify/actions/mongodb-query-index-check@v1
        with:
          anthropic-api-key: ${{ secrets.ANTHROPIC_API_KEY }}
          # PAT with `contents: read` on apify/apify-core. The default GITHUB_TOKEN only sees the
          # current repo, so without this the action would fail to fetch the indexes.
          apify-core-token: ${{ secrets.APIFY_CORE_RO_TOKEN }}
```

## Inputs

| Name | Required | Default | Description |
| --- | --- | --- | --- |
| `anthropic-api-key` | yes | — | Anthropic API key passed through to `anthropics/claude-code-action`. |
| `github-token` | no | `${{ github.token }}` | Token used to post review comments. |
| `apify-core-token` | no | _(empty)_ | When set, fetches `mongo-indexes` from `apify/apify-core@develop`. When empty, the action assumes it is running on `apify-core` and reads `src/packages/mongo-indexes/src` from the workspace. |
| `max-turns` | no | `30` | Maximum turns Claude may take. |
| `paths` | no | TS/JS source files | Comma-separated globs to include. |
| `request-changes` | no | `true` | When `true`, fail the check on any finding. When `false`, comment only. |

## Outputs

| Name | Description |
| --- | --- |
| `should-run` | `true` when the pre-filter detected MongoDB changes and Claude was invoked, `false` otherwise. |
| `changed-files` | JSON array of files Claude reviewed. |
| `max-severity` | Highest severity found: `none`, `low`, `medium`, `high`, or `critical`. |

## How it works

1. **Validate inputs**: checks that the event is `pull_request` or `pull_request_target`, rejects fork PRs, validates `request-changes`, and seeds `$RESULT_PATH` with `none` for the Finalize step.
2. **Pre-filter** (`index.mts` → `preCheck()`): pages through `pulls.listFiles`, applies the `paths` glob and a fixed exclude list (`node_modules`, `dist`, `build`, tests, `mongo-indexes` package itself), and greps for MongoDB collection-method patterns in changed hunks. If nothing matches, the action sets `should-run=false` and exits before spending Anthropic credits.
3. **Source resolution**: either sparse-checkouts `apify/apify-core@develop` (when `apify-core-token` is set) into a workspace subdir, or points at the caller's `src/packages/mongo-indexes/src` directly.
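4. **Prompt render**: substitutes the changed-files path, mongo-indexes directory, result-file path, PR metadata (number, repository, base and head SHAs), and request-changes mode into `prompts/review.md` via `envsubst`.
5. **Claude Code run**: invokes `anthropics/claude-code-action@v1` (pinned to `claude-opus-4-7`) with a tight allowlist: GitHub MCP tools for reading the pull request and writing a pending review, `Read`, `Write` (for the result file), and a handful of `Bash(...)` commands, mostly read-only (`ls`, `cat`, `grep`, `rg`, `find`, `head`, `tail`) plus `gh pr diff`, `gh pr view`, `gh pr review`, and `gh api`.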
6. **Finalize**: reads the single-word severity Claude wrote to `${RUNNER_TEMP}/mongo-index-result.txt`. Exits non-zero when `request-changes: true` and Claude reported any finding; otherwise succeeds.
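
The Finalize decision in step 6 is implemented in bash inside `action.yaml`; the TypeScript sketch below restates the same logic for readability only. The function names are hypothetical. It strips whitespace, lowercases the value, falls back to `none` for anything unexpected, and fails only when `request-changes` is `true` and the severity is not `none`.

```typescript
const SEVERITIES = ["none", "low", "medium", "high", "critical"] as const;
type Severity = (typeof SEVERITIES)[number];

// Normalize the single word Claude wrote to the result file.
function normalizeSeverity(raw: string): Severity {
  const cleaned = raw.replace(/\s+/g, "").toLowerCase();
  // Unknown values are treated as "none", mirroring the warning branch in the bash step.
  return (SEVERITIES as readonly string[]).includes(cleaned)
    ? (cleaned as Severity)
    : "none";
}

// The check fails only in request-changes mode and only when something was found.
function shouldFailCheck(raw: string, requestChanges: boolean): boolean {
  return requestChanges && normalizeSeverity(raw) !== "none";
}

console.log(normalizeSeverity(" High\n")); // "high"
console.log(shouldFailCheck("low", true)); // true
console.log(shouldFailCheck("critical", false)); // false
```

Treating an unreadable result as `none` means a malformed result file never turns the check red by itself.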

## Severity rubric

| Severity | Symptom |
| --- | --- |
| 🔴 critical | No index covers the query — collection scan. |
| 🟠 high | Index exists but doesn't match: prefix missed, partial-filter incompatible, sort can't use the index, unanchored `$regex` on indexed field. |
| 🟡 medium | Index used but inefficient: low selectivity, likely poor read/return ratio, wrong sort direction, `$or` branch without an index. |
| 🟢 low | Stylistic: tighter partial filter, covered-query opportunity, missing index name. |

Any finding turns the check red unless `request-changes` is set to `false`.
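
To make the "prefix missed" symptom in the 🟠 high row concrete, here is a simplified sketch that counts how many leading keys of a compound index are constrained by the query's equality fields. `coveredPrefixLength` and the field names are hypothetical, and the check deliberately ignores sort keys, range predicates, and partial filters, which the real review also considers.

```typescript
// indexKeys: keys of a compound index in definition order,
// e.g. { userId: 1, createdAt: -1 } becomes ["userId", "createdAt"].
function coveredPrefixLength(indexKeys: string[], equalityFields: string[]): number {
  const wanted = new Set(equalityFields);
  let covered = 0;
  for (const key of indexKeys) {
    if (!wanted.has(key)) break; // the usable prefix ends at the first unconstrained key
    covered++;
  }
  return covered;
}

const index = ["userId", "createdAt"];
console.log(coveredPrefixLength(index, ["userId"])); // 1: the query can use the index prefix
console.log(coveredPrefixLength(index, ["createdAt"])); // 0: prefix missed
```

A result of `0` corresponds to a query that filters only on a non-leading key, so the index cannot narrow the scan through its prefix.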

## Limitations

- **Fork PRs are rejected**: the action's Validate step fails fast when `head.repo` differs from `base.repo`. On `pull_request_target` this would otherwise hand a write-capable token to Claude while it analyses attacker-controlled diff content (prompt-injection risk); on `pull_request`, secrets are not exposed to runs triggered from forks, so the action could not authenticate anyway. Internal PRs only.
- **JS array methods**: the pre-filter regex matches `.find(`, `.findOne(`, etc. on any object, so `array.find(x => …)` still triggers Claude to look — Claude then disambiguates by inspecting the receiver. This errs on the side of running more often, never less.
- **Dynamic collection access** (e.g. `db[name].findOne(...)`): Claude is instructed to skip findings where it can't determine the collection reliably.
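
The fork guard from the first bullet can be restated as a small predicate. This is a sketch of the comparison the bash Validate step performs, with a hypothetical function name and example repository names; like the bash version, it only compares when both names are non-empty.

```typescript
// headRepo / baseRepo correspond to pull_request.head.repo.full_name and
// pull_request.base.repo.full_name in the event payload.
function isForkPullRequest(
  headRepo: string | undefined,
  baseRepo: string | undefined,
): boolean {
  // An empty or missing name is not treated as a fork, matching the `-n` checks in bash.
  return Boolean(headRepo) && Boolean(baseRepo) && headRepo !== baseRepo;
}

console.log(isForkPullRequest("someone/apify-proxy", "apify/apify-proxy")); // true
console.log(isForkPullRequest("apify/apify-proxy", "apify/apify-proxy")); // false
```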

## Releasing a new version

This action is published as part of the `apify/actions` repo. See the [repo README](../README.md) for the release-please flow.
186 changes: 186 additions & 0 deletions mongodb-query-index-check/action.yaml
@@ -0,0 +1,186 @@
name: 'MongoDB Query Index Check'
description: >-
  Reviews a pull request for MongoDB queries that don't use an appropriate index. Cross-references
  changed queries against the canonical mongo-indexes definitions, then asks Claude Code to post
  inline review comments and (optionally) fail the check when any finding is reported.

inputs:
  anthropic-api-key:
    description: 'Anthropic API key passed through to anthropics/claude-code-action.'
    required: true
  github-token:
    description: 'GitHub token used to post review comments. Defaults to GITHUB_TOKEN.'
    required: false
    default: ${{ github.token }}
  apify-core-token:
    description: >-
      GitHub token with `contents: read` on apify/apify-core. When set, the action sparse-checkouts
      `src/packages/mongo-indexes/src` from apify/apify-core@develop. When empty (the default), the
      action assumes it is running on apify/apify-core itself and reads the indexes from the
      caller's already-checked-out workspace at `src/packages/mongo-indexes/src`.
    required: false
    default: ''
  max-turns:
    description: 'Maximum turns Claude may take. Default 30.'
    required: false
    default: '30'
  paths:
    description: >-
      Comma-separated glob patterns of files to inspect (matched against PR file paths).
      Default covers TypeScript and JavaScript source files.
    required: false
    default: '**/*.ts,**/*.mts,**/*.cts,**/*.tsx,**/*.js,**/*.mjs,**/*.cjs,**/*.jsx'
  request-changes:
    description: 'If `true`, fail the check when any finding is reported. If `false`, leave comments only.'
    required: false
    default: 'true'

outputs:
  should-run:
    description: '`true` when MongoDB-related changes were detected (and Claude was invoked).'
    value: ${{ steps.pre-check.outputs.should-run }}
  changed-files:
    description: 'JSON array of files Claude reviewed.'
    value: ${{ steps.pre-check.outputs.changed-files }}
  max-severity:
    description: 'Highest severity in the review: one of `none`, `low`, `medium`, `high`, `critical`.'
    value: ${{ steps.finalize.outputs.max-severity }}

runs:
  using: composite
  steps:
    - name: Validate inputs
      shell: bash
      env:
        EVENT_NAME: ${{ github.event_name }}
        HEAD_REPO: ${{ github.event.pull_request.head.repo.full_name }}
        BASE_REPO: ${{ github.event.pull_request.base.repo.full_name }}
        REQUEST_CHANGES_INPUT: ${{ inputs.request-changes }}
      run: |
        set -euo pipefail
        if [ "$EVENT_NAME" != "pull_request" ] && [ "$EVENT_NAME" != "pull_request_target" ]; then
          echo "::error::This action only runs on 'pull_request' or 'pull_request_target' events (got '$EVENT_NAME')."
          exit 1
        fi
        # Reject fork PRs: on `pull_request_target`, Claude would run with the workflow's secrets
        # (anthropic-api-key, apify-core-token) and a write-capable GITHUB_TOKEN while analyzing
        # attacker-controlled diff content, which is a prompt-injection vector. On `pull_request`
        # it's mostly harmless (secrets aren't exposed to forks) but we still bail out so the
        # action behaves predictably.
        if [ -n "$HEAD_REPO" ] && [ -n "$BASE_REPO" ] && [ "$HEAD_REPO" != "$BASE_REPO" ]; then
          echo "::error::This action does not support pull requests from forks ('$HEAD_REPO' → '$BASE_REPO'). Re-run from a branch in the base repository."
          exit 1
        fi
        case "$REQUEST_CHANGES_INPUT" in
          true|false) ;;
          *) echo "::error::Invalid request-changes '$REQUEST_CHANGES_INPUT' (must be the literal string 'true' or 'false')."; exit 1 ;;
        esac
        # Seed the result file so Finalize (runs with `if: always()`) always sees a defined $RESULT_PATH.
        echo "RESULT_PATH=${RUNNER_TEMP}/mongo-index-result.txt" >> "$GITHUB_ENV"
        printf 'none' > "${RUNNER_TEMP}/mongo-index-result.txt"

    - name: Pre-check PR diff
      id: pre-check
      uses: actions/github-script@v8
      env:
        INPUT_PATHS: ${{ inputs.paths }}
        INPUT_PATHS_IGNORE: '**/node_modules/**,**/dist/**,**/build/**,**/test/**,**/__tests__/**,**/*.test.*,**/*.spec.*,**/mongo-indexes/**'
        OUTPUT_CHANGED_FILES_PATH: ${{ runner.temp }}/mongo-index-changed-files.json
      with:
        github-token: ${{ inputs.github-token }}
        script: |
          const { preCheck } = require('${{ github.action_path }}/index.mts');
          await preCheck({ github, context, core, env: process.env });

    - name: Checkout apify-core mongo-indexes
      if: steps.pre-check.outputs.should-run == 'true' && inputs.apify-core-token != ''
      uses: actions/checkout@v6
      with:
        repository: apify/apify-core
        ref: develop
        token: ${{ inputs.apify-core-token }}
        path: __mongo_index_check_apify_core
        sparse-checkout: src/packages/mongo-indexes/src
        sparse-checkout-cone-mode: false
        fetch-depth: 1

    - name: Resolve mongo-indexes directory
      if: steps.pre-check.outputs.should-run == 'true'
      shell: bash
      env:
        APIFY_CORE_TOKEN: ${{ inputs.apify-core-token }}
      run: |
        set -euo pipefail
        if [ -n "$APIFY_CORE_TOKEN" ]; then
          indexes_dir="${GITHUB_WORKSPACE}/__mongo_index_check_apify_core/src/packages/mongo-indexes/src"
          origin_label="apify-core@develop"
        else
          indexes_dir="${GITHUB_WORKSPACE}/src/packages/mongo-indexes/src"
          origin_label="local workspace (assuming caller is apify-core)"
        fi
        if [ ! -d "$indexes_dir" ]; then
          echo "::error::Could not find mongo-indexes source directory at: $indexes_dir"
          exit 1
        fi
        file_count=$(find "$indexes_dir" -maxdepth 2 -type f -name '*.ts' | wc -l)
        echo "Resolved mongo-indexes from ${origin_label}: ${indexes_dir} (${file_count} .ts file(s))."
        echo "MONGO_INDEXES_DIR=${indexes_dir}" >> "$GITHUB_ENV"

    - name: Render Claude prompt
      id: render
      if: steps.pre-check.outputs.should-run == 'true'
      shell: bash
      env:
        PR_NUMBER: ${{ github.event.pull_request.number }}
        REPO: ${{ github.repository }}
        BASE_SHA: ${{ github.event.pull_request.base.sha }}
        HEAD_SHA: ${{ github.event.pull_request.head.sha }}
        CHANGED_FILES_PATH: ${{ runner.temp }}/mongo-index-changed-files.json
        REQUEST_CHANGES_MODE: ${{ inputs.request-changes }}
      run: |
        set -euo pipefail
        prompt_file="${RUNNER_TEMP}/mongo-index-prompt.md"
        envsubst '$PR_NUMBER $REPO $BASE_SHA $HEAD_SHA $MONGO_INDEXES_DIR $CHANGED_FILES_PATH $RESULT_PATH $REQUEST_CHANGES_MODE' \
          < "${GITHUB_ACTION_PATH}/prompts/review.md" \
          > "$prompt_file"
        delimiter="EOF_${RANDOM}${RANDOM}${RANDOM}"
        {
          echo "prompt<<${delimiter}"
          cat "$prompt_file"
          echo
          echo "${delimiter}"
        } >> "$GITHUB_OUTPUT"

    - name: Run Claude Code review
      if: steps.pre-check.outputs.should-run == 'true'
      uses: anthropics/claude-code-action@v1
      with:
        anthropic_api_key: ${{ inputs.anthropic-api-key }}
        github_token: ${{ inputs.github-token }}
        prompt: ${{ steps.render.outputs.prompt }}
        claude_args: >-
          --max-turns ${{ inputs.max-turns }}
          --model claude-opus-4-7
          --allowedTools "mcp__github__pull_request_read,mcp__github__pull_request_review_write,mcp__github__create_pending_pull_request_review,mcp__github__submit_pending_pull_request_review,mcp__github__add_comment_to_pending_review,mcp__github_inline_comment__create_inline_comment,Read,Write,Bash(ls:*),Bash(cat:*),Bash(grep:*),Bash(rg:*),Bash(find:*),Bash(test:*),Bash(echo:*),Bash(printf:*),Bash(head:*),Bash(tail:*),Bash(gh pr diff:*),Bash(gh pr view:*),Bash(gh pr review:*),Bash(gh api:*)"

    - name: Finalize review result
      id: finalize
      if: always() && steps.pre-check.outputs.should-run == 'true'
      shell: bash
      env:
        INPUT_REQUEST_CHANGES: ${{ inputs.request-changes }}
      run: |
        set -euo pipefail

        max_severity="$(tr -d '[:space:]' < "$RESULT_PATH" | tr '[:upper:]' '[:lower:]')"
        case "$max_severity" in
          none|low|medium|high|critical) ;;
          *) echo "::warning::Unexpected severity value '${max_severity}' in result file, treating as 'none'."; max_severity="none" ;;
        esac

        echo "max-severity=${max_severity}" >> "$GITHUB_OUTPUT"
        echo "Max severity from review: ${max_severity}."

        if [ "$INPUT_REQUEST_CHANGES" = "true" ] && [ "$max_severity" != "none" ]; then
          echo "::error::MongoDB index review found '${max_severity}' issues. See the inline review comments on this PR."
          exit 1
        fi