91 changes: 91 additions & 0 deletions .github/workflows/paper-review.yml
@@ -0,0 +1,91 @@
name: Auto-improvement / Daily paper review

# Reads ~10 unseen entries from tmylla/Awesome-LLM4Cybersecurity each day,
# asks Claude Haiku whether they suggest a concrete Aigis hardening, and
# files candidates as auto-improvement/pending/ stubs + one GitHub issue
# summarising the batch. Code changes are NEVER made by this workflow —
# humans promote pending entries into rule PRs.
#
# Cost guard: 10 papers × Haiku 4.5 (≈500 tokens out) ≈ a few cents/day.
# State (paper_review_state.json) is committed back to master via a bot PR
# so the next run advances; if the commit step fails the run still leaves
# the issue + pending files visible.

on:
  schedule:
    # 00:15 UTC daily, well clear of cflite/codeql peak times.
    - cron: "15 0 * * *"
  workflow_dispatch:
    inputs:
      dry_run:
        description: "Parse + pick only, no API calls or writes"
        required: false
        default: "false"
        type: choice
        options: ["false", "true"]
      max_papers:
        description: "How many unseen papers to review this run"
        required: false
        default: "10"

permissions:
  contents: write       # to commit state.json + pending stubs
  issues: write         # to file the daily summary issue
  pull-requests: write  # so the bot can open a PR with the new pending files

concurrency:
  group: paper-review
  cancel-in-progress: false

jobs:
  review:
    name: Review 10 papers and file pending candidates
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@34e114876b0b11c390a56381ad16ebd13914f8d5 # v4

      - name: Set up Python 3.11
        uses: actions/setup-python@a309ff8b426b58ec0e2a45f0f869d46889d02405 # v6.2.0
        with:
          python-version: "3.11"

      - name: Set up uv
        uses: astral-sh/setup-uv@08807647e7069bb48b6ef5acd8ec9567f424441b # v8.1.0

      - name: Install anthropic SDK
        run: uv pip install --system "anthropic>=0.40.0"

      - name: Run paper review
        env:
          ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
          GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
          GITHUB_REPOSITORY: ${{ github.repository }}
        run: |
          DRY="${{ inputs.dry_run || 'false' }}"
          MAX="${{ inputs.max_papers || '10' }}"
          ARGS="--max-papers $MAX"
          if [ "$DRY" = "true" ]; then ARGS="$ARGS --dry-run"; fi
          python auto-improvement/scripts/paper_review.py $ARGS

      - name: Commit new pending + research + state on a bot branch
        if: ${{ inputs.dry_run != 'true' }}
        env:
          GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
        run: |
          set -euo pipefail
          # Stage first: plain `git diff --quiet` ignores untracked files,
          # so brand-new pending/ and research/ files would go unnoticed.
          git add auto-improvement/
          if git diff --cached --quiet -- auto-improvement/; then
            echo "No changes under auto-improvement/; nothing to commit."
            exit 0
          fi
          DATE="$(date -u +%Y-%m-%d)"
          BRANCH="bot/paper-review/${DATE}"
          git config user.name "aigis-paper-review[bot]"
          git config user.email "aigis-paper-review@users.noreply.github.com"
          git checkout -b "$BRANCH"
          git commit -m "auto-improvement: daily paper review ${DATE}"
          git push -u origin "$BRANCH"
          gh pr create \
            --title "auto-improvement: daily paper review ${DATE}" \
            --body "Bot PR with the daily batch of pending/ stubs and updated state. See the linked issue for the candidate list." \
            --label "auto-improvement" || true
22 changes: 21 additions & 1 deletion auto-improvement/README.md
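@@ -9,9 +9,29 @@ Working area for the maintenance loop that automatically hardens aigis every 6 hours.
|------|------|
| `ROTATION.md` | Definition of the 10-domain rotation plus the current counter, which is incremented by 1 (mod 10) on every run |
| `INDEX.md` | Chronological index of all runs (one-line summary each) |
| `research/` | Research report for each run (UTC name: `YYYY-MM-DDTHH-MM_NN-<domain>.md`) |
| `research/` | Research report for each run (UTC name: `YYYY-MM-DDTHH-MM_NN-<domain>.md` or `..._paper-batch.md`) |
| `changes/` | Change record for each run (features added, test results, link to the corresponding research report) |
| `pending/` | Proposals for major changes of direction. Implementation is on hold; a human decides later whether to adopt them |
| `paper_review_state.json` | Ledger of the URLs/titles already read by the "Paper review loop" described below |
| `scripts/paper_review.py` | The paper review loop itself (launched daily from GitHub Actions) |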

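## Paper review loop (added 2026-05)

A semi-automated loop that works through [Awesome-LLM4Cybersecurity](https://github.com/tmylla/Awesome-LLM4Cybersecurity) 10 entries per day. `.github/workflows/paper-review.yml` runs at 00:15 UTC, and `scripts/paper_review.py` does the following:

1. Fetch the upstream `LITERATURES.md`
2. Exclude the URLs/titles already recorded in `paper_review_state.json`, then pick 10 unread entries, newest first
3. Pass each paper to Claude Haiku 4.5 and have it judge, as JSON, whether the paper suggests a detector candidate that Aigis could implement with a regex or substring match
4. Draft each entry with relevant=true as `pending/YYYY-MM-DD_paper_<slug>.md`
5. Write a summary of the whole batch to `research/YYYY-MM-DDTHH-MM_paper-batch.md`
6. Open one review-request issue with `gh issue create`
7. Turn the changes into a PR on a bot branch (a human reviews it, then merges to master)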
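
Steps 2 and 4 can be sketched roughly as follows. This is a minimal illustration, not the code in `scripts/paper_review.py`: the `Paper` type, the function names, the assumption that `seen` maps URL to title, and the slug rule are all hypothetical.

```python
import re
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Paper:
    """Hypothetical shape of one entry parsed from LITERATURES.md."""
    title: str
    url: str


def pick_unseen(papers, seen, max_papers=10):
    """Return up to max_papers entries whose URL and title are both unseen.

    `papers` is assumed to be ordered newest first already.
    `seen` is assumed to map URL -> title (the real state layout may differ).
    """
    seen_titles = {str(title).strip().lower() for title in seen.values()}
    picked = []
    for paper in papers:
        if paper.url in seen or paper.title.strip().lower() in seen_titles:
            continue  # already reviewed under this URL or this title
        picked.append(paper)
        if len(picked) >= max_papers:
            break
    return picked


def pending_path(paper, day):
    """Build pending/YYYY-MM-DD_paper_<slug>.md from the paper title."""
    slug = re.sub(r"[^a-z0-9]+", "-", paper.title.lower())[:60].strip("-")
    return f"pending/{day.isoformat()}_paper_{slug}.md"
```

Matching on both URL and title means a paper that moves to a new URL upstream is not reviewed twice, which is presumably why the ledger records both.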

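The loop never implements anything. Candidates queued in pending/ are promoted into `aigis/` by humans through individual PRs, following the same rule as the existing loop ([ROTATION.md](ROTATION.md)).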

**Required secrets:** `ANTHROPIC_API_KEY` (issued from the Anthropic console and registered under Settings → Secrets → Actions). If it is not set, the workflow fails, but a dry run is still possible via `workflow_dispatch` with `dry_run=true`.
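
One way the script could honour this behaviour is sketched below. The `--max-papers` and `--dry-run` flags are the ones the workflow passes; the function names and the exact error message are assumptions made for illustration.

```python
import argparse
import os


def parse_args(argv=None):
    """Parse the two flags the workflow passes to the script."""
    parser = argparse.ArgumentParser(description="Daily paper review (sketch)")
    parser.add_argument("--max-papers", type=int, default=10)
    parser.add_argument("--dry-run", action="store_true")
    return parser.parse_args(argv)


def require_api_key(dry_run, env=None):
    """Return the API key, or None in a dry run.

    A real run without ANTHROPIC_API_KEY exits with an error, which is
    what makes the workflow fail when the secret is missing.
    """
    env = os.environ if env is None else env
    if dry_run:
        return None  # parse + pick only, so no key is needed
    key = env.get("ANTHROPIC_API_KEY", "")
    if not key:
        raise SystemExit("ANTHROPIC_API_KEY is not set; use --dry-run to skip API calls")
    return key
```

Checking the key only after the dry-run test keeps `dry_run=true` usable on forks and fresh clones where no secret is configured.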

**Cost estimate:** 10 papers × Haiku 4.5 (≈500 output tokens) ≈ a few cents/day. Expected to come to just under $1 per month.
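
The arithmetic behind this estimate can be reproduced with a small helper. Only the 10 papers and ≈500 output tokens come from the text above; the input token count and both per-token prices below are placeholder assumptions, so check the current price list before relying on the result.

```python
def daily_cost_usd(papers, in_tokens, out_tokens, usd_per_mtok_in, usd_per_mtok_out):
    """Estimated daily spend: per-paper token cost times the number of papers."""
    per_paper = (in_tokens * usd_per_mtok_in + out_tokens * usd_per_mtok_out) / 1_000_000
    return papers * per_paper


# Assumed figures, not from the README: 500 input tokens per paper and
# placeholder prices of $1 / $5 per million input / output tokens.
daily = daily_cost_usd(papers=10, in_tokens=500, out_tokens=500,
                       usd_per_mtok_in=1.0, usd_per_mtok_out=5.0)
monthly = daily * 30

print(f"daily ≈ ${daily:.3f}, monthly ≈ ${monthly:.2f}")
```

Under these assumptions the output tokens dominate the bill, so raising `max_papers` scales the cost roughly linearly.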

## Operating rules (followed by the maintenance agent)

4 changes: 4 additions & 0 deletions auto-improvement/paper_review_state.json
@@ -0,0 +1,4 @@
{
"seen": {},
"runs": []
}
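
A sketch of how a run might read and advance this ledger follows. Only the top-level `seen` and `runs` keys come from the file above; the URL-to-title layout of `seen`, the fields of each `runs` entry, and the function names are assumptions.

```python
import json
from datetime import datetime, timezone
from pathlib import Path


def load_state(path):
    """Read the ledger, falling back to the empty initial state."""
    if not path.exists():
        return {"seen": {}, "runs": []}
    return json.loads(path.read_text(encoding="utf-8"))


def record_run(state, reviewed, now=None):
    """Mark reviewed papers as seen and append one run entry.

    `reviewed` is assumed to map URL -> title; the run entry fields
    ("at", "reviewed") are hypothetical.
    """
    now = now or datetime.now(timezone.utc)
    state["seen"].update(reviewed)
    state["runs"].append({
        "at": now.strftime("%Y-%m-%dT%H:%M:%SZ"),
        "reviewed": len(reviewed),
    })
    return state


def save_state(path, state):
    """Write the ledger back with stable formatting so diffs stay small."""
    text = json.dumps(state, ensure_ascii=False, indent=2) + "\n"
    path.write_text(text, encoding="utf-8")
```

Writing with a fixed indent and a trailing newline keeps the bot PR diff limited to the entries that actually changed.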