feat(ci): supply-chain & security hardening (#443, #689, #690, #691, #692, #468, #552) by dgenio · Pull Request #718 · dgenio/contextweaver

dgenio · 2026-06-22T07:37:29Z

Summary

Coordinated supply-chain & security CI hardening, delivered as one PR under the
supply-chain hardening umbrella #443. Closes its decomposed sub-issues and
adjacent supply-chain items: #689 (CodeQL + dependency scanning), #690
(release attestations), #691 (SECURITY.md alignment + ownership checks),
#692 (security-exception runbook), #468 (release-integrity gates), and
#552 (OpenSSF Scorecard + badge).

The repo had strong functional CI but no automated security tooling — no CodeQL,
Dependabot, dependency audit, OpenSSF Scorecard, or release attestations. (The
push that opened this branch surfaced the gap directly: GitHub reports 6 open
Dependabot advisories on main.)

Closes #443
Closes #689
Closes #690
Closes #691
Closes #692
Closes #468
Closes #552

Changes

.github/workflows/codeql.yml (new) — CodeQL security-extended analysis on PR, main, and weekly (Enable CodeQL and dependency vulnerability workflows #689).
.github/workflows/pip-audit.yml (new) — dependency vuln scan: gating on core deps, report-only for the dev extra (Enable CodeQL and dependency vulnerability workflows #689).
.github/workflows/ossf-scorecard.yml (new) — OpenSSF Scorecard analysis; SARIF → code scanning; publish_results for the README badge (Apply for the OpenSSF Best Practices badge and surface project-health signals #552).
.github/dependabot.yml (new) — weekly grouped pip + github-actions updates ([CI] Supply-chain and security hardening: CodeQL, Dependabot, pip-audit, OpenSSF Scorecard, release attestations #443).
.github/workflows/publish.yml (edit) — new verify job gates publish: tag↔pyproject version check, pre-publish pytest, twine check; attestations: write + actions/attest-build-provenance on built artifacts (Add release-pipeline integrity gates: tag/version check, pre-publish tests, pinned actions, version-reference drift checks #468, Release provenance and artifact attestation workflow #690).
scripts/check_security_policy.py + tests/test_check_security_policy.py (new) — gating drift guard: SECURITY.md supported series must match pyproject.toml; relative links must resolve. Wired into make ci and ci.yml (Security-policy docs alignment and ownership checks #691).
SECURITY.md (edit) — supported series 0.14.x → 0.16.x; new "Automated Security Tooling" section linking the runbook (Security-policy docs alignment and ownership checks #691).
docs/security_tooling.md (new) + mkdocs.yml nav — triage SLA, ownership, and false-positive exception process (Exception process for security tooling noise #692).
README.md — OpenSSF Scorecard badge (Apply for the OpenSSF Best Practices badge and surface project-health signals #552).
scripts/check_readme_version.py — --print-version flag (single source of truth for the release gate).
Makefile, AGENTS.md, docs/agent-context/workflows.md, CHANGELOG.md — wire/record the new security-policy-check gate.
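
The drift guard described for scripts/check_security_policy.py can be pictured roughly as follows. This is a minimal sketch, not the script itself: the helper names `series_of` and `check_drift`, the regular expression, and the message wording are assumptions made for illustration, and the real script may parse SECURITY.md differently.

```python
import re

# Hypothetical pattern: matches series labels such as "0.16.x" in SECURITY.md.
SERIES_RE = re.compile(r"\b(\d+\.\d+)\.x\b")


def series_of(version: str) -> str:
    """Map a full version such as '0.16.0' to its minor series '0.16.x'."""
    major, minor, *_ = version.split(".")
    return f"{major}.{minor}.x"


def check_drift(security_md: str, package_version: str) -> list[str]:
    """Return drift messages; an empty list means the two files are in sync."""
    expected = series_of(package_version)
    declared = {f"{match}.x" for match in SERIES_RE.findall(security_md)}
    if expected in declared:
        return []
    found = ", ".join(sorted(declared)) or "none"
    return [f"SECURITY.md lists {found} but pyproject.toml is on {expected}"]


if __name__ == "__main__":
    # In-sync case: no messages.
    print(check_drift("| 0.16.x | supported |", "0.16.0"))
    # Drift case: one message naming both series.
    print(check_drift("| 0.14.x | supported |", "0.16.0"))
```

A gating wrapper would exit non-zero whenever the returned list is non-empty, which is the behaviour the fails-without-fix proof below exercises.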

Why

Grounded in the triage of the open backlog: #689–#692 are formal sub-issues of
the #443 umbrella (Parent: #443), and #468/#552 are line-items of #443's own
proposed scope. They share one code area (.github/ + SECURITY.md) and one
implementation path, so a single focused PR is cleaner than seven.

How verified

Ran in an isolated venv (no src/ changed, so the heavy example/demo legs are deferred to CI):

ruff format --check on changed scripts/tests — 3 files already formatted
ruff check on changed scripts/tests — All checks passed!
mypy scripts/check_security_policy.py scripts/check_readme_version.py — Success: no issues found
pytest tests/test_check_security_policy.py tests/test_check_readme_version.py tests/test_check_doc_snippets.py tests/test_check_module_size.py — 31 passed
make security-policy-check — in sync (0.16.0); make readme-version-check — in sync (0.16.0)
Fails-without-fix proof: reverting SECURITY.md to 0.14.x makes check_security_policy.py exit 1 with the exact drift message; restoring exits 0.
python scripts/check_readme_version.py --print-version → 0.16.0 (drives the publish tag-gate)
All 8 workflow YAMLs + dependabot.yml + mkdocs.yml parse via yaml.safe_load.
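
The publish tag-gate driven by `--print-version` amounts to an exact string comparison between the release tag and `v<version>`. The sketch below restates that comparison in Python for clarity; the function names are hypothetical, and the actual gate is a shell step in publish.yml.

```python
def tag_matches_version(tag: str, version: str) -> bool:
    """Exact match only: the tag must be 'v' followed by the package version."""
    return tag == f"v{version}"


def gate_exit_code(tag: str, printed_version: str) -> int:
    """Return 0 when the tag matches, 1 otherwise (hypothetical wrapper).

    Trailing newlines are stripped to mirror shell command substitution,
    which drops them from the script's output.
    """
    version = printed_version.rstrip("\n")
    if tag_matches_version(tag, version):
        return 0
    print(f"tag {tag!r} does not match package version {version!r}")
    return 1


if __name__ == "__main__":
    print(gate_exit_code("v0.16.0", "0.16.0\n"))  # matching tag
    print(gate_exit_code("v0.15.0", "0.16.0\n"))  # mistagged release
```

Because the comparison is exact, any extra output from the version command other than the trailing newline would make a valid tag fail the gate.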

Checklist

Tests added or updated for every new/changed public function (tests/test_check_security_policy.py)
[~] make ci passes locally — ran fmt + lint + type + targeted tests + both policy gates; the example/demo/full-matrix legs are deferred to CI (no src/ changes)
CHANGELOG.md updated under ## [Unreleased]
Docstrings added for all new public APIs (Google-style)
N/A — Public-API change? No src/ package surface changed (scripts/ is not part of api/public_api.txt)
Every modified module stays ≤ 300 lines (new script 151, test 84)
Related issues linked above
Agent-facing docs updated (AGENTS.md, docs/agent-context/workflows.md now list the new gate)

Notes for reviewers

Action SHA-pinning (one line-item of #468, "Add release-pipeline integrity gates: tag/version check, pre-publish tests, pinned actions, version-reference drift checks"): intentionally deferred to Dependabot's github-actions updates rather than hand-pinning every workflow to commit SHAs. Hand-pinning contradicts the repo's existing @v4/@v5 tag idiom and would be a large, noisy, drift-prone diff. The new Dependabot github-actions ecosystem keeps tags current; OpenSSF Scorecard will still report its Pinned-Dependencies check as an advisory finding, and the decision can be revisited. Documented in docs/security_tooling.md.
OpenSSF Best Practices badge (#552, "Apply for the OpenSSF Best Practices badge and surface project-health signals") requires a manual application at bestpractices.dev; this is tracked as a step in the runbook. The automated Scorecard badge ships now (it resolves after the first Scorecard run on main).
Action versions used (codeql-action@v3, ossf/scorecard-action@v2, attest-build-provenance@v2, upload-artifact@v4) match current majors and the repo's tag convention.
server.json still reads 0.15.0 (separate pre-existing release-metadata drift) — left out of scope here; the new publish tag-gate covers pyproject version, not server.json.
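
For reviewers weighing the SHA-pinning note above, the difference between a tag reference and a pinned reference can be checked mechanically. The following is a rough sketch under stated assumptions: `unpinned_actions` is a hypothetical helper, the regex only handles the common `uses: owner/repo@ref` form, and the 40-character SHA in the usage example is a placeholder rather than a real commit.

```python
import re

# Matches "uses: owner/repo@ref" lines; local "./path" actions have no "@" and are skipped.
USES_RE = re.compile(
    r"^\s*(?:-\s*)?uses:\s*(?P<action>[^@\s]+)@(?P<ref>\S+)", re.MULTILINE
)
# A full-length commit SHA is 40 lowercase hex characters.
SHA_RE = re.compile(r"[0-9a-f]{40}")


def unpinned_actions(workflow_text: str) -> list[str]:
    """Return 'action@ref' strings whose ref is not a full commit SHA."""
    unpinned = []
    for match in USES_RE.finditer(workflow_text):
        if not SHA_RE.fullmatch(match.group("ref")):
            unpinned.append(f"{match.group('action')}@{match.group('ref')}")
    return unpinned


if __name__ == "__main__":
    sample = (
        "steps:\n"
        "  - uses: actions/checkout@v4\n"
        # Placeholder SHA for illustration only, followed by a version comment.
        "  - uses: actions/setup-python@0123456789abcdef0123456789abcdef01234567 # v5\n"
    )
    print(unpinned_actions(sample))
```

A check like this could run as report-only at first, consistent with treating the Scorecard finding as advisory.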

🤖 Generated with Claude Code

https://claude.ai/code/session_0195S6jDSNCgWjmmLXXiDRSH


…692, #468, #552)

Coordinated security-posture pass under the supply-chain hardening umbrella (#443), delivered as one PR:

- CodeQL code scanning with the security-extended pack on PR/main/weekly (#689)
- pip-audit dependency scanning: gating on core deps, report-only dev extra (#689)
- OpenSSF Scorecard analysis + README badge; Best Practices badge tracked (#552)
- Dependabot weekly pip + github-actions updates, grouped (#443)
- Release-integrity verify job in publish.yml: tag<->version gate, pre-publish tests, twine check before upload (#468)
- Build-provenance attestations for released artifacts (#690)
- security-policy-check gate (scripts/check_security_policy.py) wired into make ci and ci.yml; refresh SECURITY.md supported series to 0.16.x (#691)
- Security tooling runbook docs/security_tooling.md: triage SLA, ownership, false-positive exception process (#692)
- check_readme_version.py gains --print-version for the release gate

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_0195S6jDSNCgWjmmLXXiDRSH

github-advanced-security · 2026-06-22T07:38:41Z

You are seeing this message because GitHub Code Scanning has recently been set up for this repository, or this pull request contains the workflow file for the Code Scanning tool.

What Enabling Code Scanning Means:

The 'Security' tab will display more code scanning analysis results (e.g., for the default branch).
Depending on your configuration and choice of analysis tool, future pull requests will be annotated with code scanning analysis results.
You will be able to see the analysis results for the pull request's branch on this overview once the scans have completed and the checks have passed.

For more information about GitHub Code Scanning, check out the documentation.

Copilot

Pull request overview

This PR implements coordinated CI supply-chain and security hardening for contextweaver by adding automated security scanning (CodeQL, OpenSSF Scorecard, pip-audit, Dependabot), strengthening release integrity checks, and introducing a gating drift guard to keep SECURITY.md aligned with the package version and valid repo links.

Changes:

Added new security workflows: CodeQL scanning, OpenSSF Scorecard analysis (SARIF → code scanning), and pip-audit (gating core deps; report-only dev extra), plus Dependabot configuration.
Hardened the release pipeline with a pre-publish verification job and build-provenance attestations.
Added a gating security-policy-check (script + tests) and updated docs (SECURITY.md, runbook, mkdocs nav, agent/workflow docs, changelog) to reflect the new tooling.

Reviewed changes

Copilot reviewed 17 out of 17 changed files in this pull request and generated 3 comments.

Show a summary per file

File	Description
`tests/test_check_security_policy.py`	Adds unit tests for the new `SECURITY.md` drift/link guard, including a live “repo is in sync” assertion.
`SECURITY.md`	Updates supported minor series to `0.16.x` and documents automated security tooling + runbook link.
`scripts/check_security_policy.py`	Introduces the `SECURITY.md` supported-series drift check + relative link validation gate.
`scripts/check_readme_version.py`	Adds `--print-version` to expose the package version for release gating without re-parsing TOML in shell.
`README.md`	Adds the OpenSSF Scorecard badge.
`mkdocs.yml`	Adds the security tooling runbook to the docs nav.
`Makefile`	Adds `security-policy-check` target and wires it into `make ci`.
`docs/security_tooling.md`	New runbook documenting tooling, triage SLAs, and exception/suppression process.
`docs/agent-context/workflows.md`	Documents the new `make security-policy-check` gate as part of `make ci`.
`CHANGELOG.md`	Records the security hardening work under Unreleased.
`AGENTS.md`	Updates the documented `make ci` gate list to include `security-policy-check`.
`.github/workflows/publish.yml`	Adds a `verify` job (tag↔version, tests, `twine check`) and build-provenance attestation step.
`.github/workflows/pip-audit.yml`	New workflow running pip-audit with gating core deps and report-only dev extra.
`.github/workflows/ossf-scorecard.yml`	New Scorecard workflow publishing SARIF + results for the badge endpoint.
`.github/workflows/codeql.yml`	New CodeQL workflow using `security-extended` queries on PR/main/weekly schedule.
`.github/workflows/ci.yml`	Wires the new `scripts/check_security_policy.py` drift check into the gating CI workflow.
`.github/dependabot.yml`	New Dependabot config for weekly grouped pip updates and GitHub Actions updates.

…eQL label, harden link check, regen llms

Review feedback on #718:

- publish.yml: pin release-path actions (checkout, setup-python, attest-build-provenance, pypa publish) to immutable commit SHAs with `# vX` comments; Dependabot github-actions keeps them current. Addresses the #468 SHA-pinning line-item for the high-trust release job.
- docs/security_tooling.md: correct the CodeQL row. Code-scanning alerts are advisory and do not fail the PR check by default; it is not gating.
- check_security_policy.py: find_broken_links now rejects absolute paths and ../ traversal instead of letting an existing-but-non-repo-relative target pass; add tests.
- pip-audit.yml: report-only dev-extra audit uses `|| true` so it reports a green (non-blocking) check while keeping findings in the log.
- Regenerate llms-full.txt for the new/changed security docs (drift gate).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_0195S6jDSNCgWjmmLXXiDRSH
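
The hardened link check described in this commit can be sketched as below. Only the name `find_broken_links` comes from the commit message; its signature here, the `is_repo_relative` helper, and the decision to treat rejected targets as "broken" are assumptions for illustration. Anchors (`#fragment`) and URL schemes are not handled in this sketch.

```python
from pathlib import Path, PurePosixPath


def is_repo_relative(target: str) -> bool:
    """Accept only relative targets that do not climb out via '..'."""
    path = PurePosixPath(target)
    return not path.is_absolute() and ".." not in path.parts


def find_broken_links(targets: list[str], repo_root: Path) -> list[str]:
    """Return targets that are not repo-relative or do not exist under repo_root.

    Hypothetical signature: the real function may take different arguments.
    """
    broken = []
    for target in targets:
        # Reject first, then check existence, so an absolute path that happens
        # to exist on the machine still fails the gate.
        if not is_repo_relative(target) or not (repo_root / target).exists():
            broken.append(target)
    return broken


if __name__ == "__main__":
    print(find_broken_links(["/", "../outside.md", "no-such-file.md"], Path(".")))
```

Checking the shape of the path before checking existence is the point of the fix: `/` and `..` both exist on disk, yet neither is a valid repo-relative link.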

numpy>=2.5 (a transitive [dev] dep via chromadb/langgraph/crewai) ships .pyi stubs using PEP 695 `type` statements. Under the project's mypy `python_version = "3.10"` target these raise a hard syntax error ("Type statement is only supported in Python 3.12 and greater") that aborts the whole `mypy src/ examples/ scripts/` run on the 3.12/3.13 cells. This is a pre-existing dependency-drift break unrelated to contextweaver code.

Add a scoped override (`follow_imports = "skip"` + `follow_imports_for_stubs`) so mypy treats numpy as `Any` without parsing its stubs. `follow_imports_for_stubs` is the load-bearing setting: without it the skip does not apply to .pyi files and the error persists.

Validated against numpy 2.5.0 + mypy 2.1.0 on Python 3.12: the type gate goes green and numpy resolves to `Any` with no false errors.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_0195S6jDSNCgWjmmLXXiDRSH

github-actions · 2026-06-22T07:59:40Z

Benchmark delta (vs `main`)

Soft regression feedback only — this comment never blocks the PR.
Latency budget: ⚠️ when head > base × 1.3. Accuracy budget: ⚠️ when head < base - 1pp.
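
The two budget rules above reduce to simple comparisons. The sketch below is illustrative only: the function names are hypothetical, and it assumes recall@k and MRR are fractions in [0, 1], so that one percentage point equals 0.01.

```python
LATENCY_FACTOR = 1.3    # warn when head > base × 1.3
ACCURACY_MARGIN = 0.01  # 1pp, assuming metrics are fractions in [0, 1]


def exceeds_latency_budget(head_ms: float, base_ms: float) -> bool:
    """True when head latency is more than 1.3 times the base latency."""
    return head_ms > base_ms * LATENCY_FACTOR


def below_accuracy_budget(head: float, base: float) -> bool:
    """True when the head metric dropped by more than one percentage point."""
    return head < base - ACCURACY_MARGIN


if __name__ == "__main__":
    # Size-50 routing row: 1.046 ms against a 0.759 ms base.
    print(exceeds_latency_budget(1.046, 0.759))
    # embedding_hashing at size 100: 8.890 ms against a 7.225 ms base.
    print(exceeds_latency_budget(8.890, 7.225))
    # Unchanged recall@k.
    print(below_accuracy_budget(0.5649, 0.5649))
```

Applied to the tables below, only the size-50 routing row crosses the latency threshold (0.759 × 1.3 ≈ 0.987 ms), which matches the single warning marker shown.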

Routing summary (single backend × catalog sizes)

size	recall@k (head Δ vs base)	MRR (head Δ vs base)	p99 (ms)
50	✅ 0.5649 (+0.0000)	✅ 0.4978 (+0.0000)	⚠️ 1.046 (base 0.759)
83	✅ 0.3825 (+0.0000)	✅ 0.3242 (+0.0000)	✅ 0.840 (base 1.134)
1000	✅ 0.1475 (+0.0000)	✅ 0.1456 (+0.0000)	✅ 38.931 (base 41.711)

Per-backend × per-size matrix

backend	size	recall@k (Δ)	MRR (Δ)	p99 (ms)
bm25	100	✅ 0.3825 (+0.0000)	✅ 0.3399 (+0.0000)	✅ 6.557 (base 8.140)
bm25	500	✅ 0.2250 (+0.0000)	✅ 0.2165 (+0.0000)	✅ 29.691 (base 38.989)
bm25	1000	✅ 0.1575 (+0.0000)	✅ 0.1525 (+0.0000)	✅ 86.725 (base 111.716)
embedding_hashing	100	✅ 0.5175 (+0.0000)	✅ 0.4360 (+0.0000)	✅ 8.890 (base 7.225)
embedding_hashing	500	✅ 0.2700 (+0.0000)	✅ 0.2674 (+0.0000)	✅ 42.234 (base 44.182)
embedding_hashing	1000	✅ 0.2000 (+0.0000)	✅ 0.1931 (+0.0000)	✅ 99.769 (base 98.277)
embedding_st	100	skipped (skipped: missing sentence-transformers)	—	—
embedding_st	500	skipped (skipped: missing sentence-transformers)	—	—
embedding_st	1000	skipped (skipped: missing sentence-transformers)	—	—
fuzzy	100	skipped (skipped: missing rapidfuzz)	—	—
fuzzy	500	skipped (skipped: missing rapidfuzz)	—	—
fuzzy	1000	skipped (skipped: missing rapidfuzz)	—	—
tfidf	100	✅ 0.3825 (+0.0000)	✅ 0.3220 (+0.0000)	✅ 1.070 (base 1.102)
tfidf	500	✅ 0.2325 (+0.0000)	✅ 0.2314 (+0.0000)	✅ 9.595 (base 11.492)
tfidf	1000	✅ 0.1475 (+0.0000)	✅ 0.1456 (+0.0000)	✅ 37.372 (base 50.755)

Context pipeline (per scenario)

scenario	tokens	dropped	dedup
large_catalog	1480 (base 1514, Δ-34)	0 (base 0, Δ+0)	0 (base 0, Δ+0)
long_conversation	2500 (base 2548, Δ-48)	0 (base 0, Δ+0)	0 (base 0, Δ+0)
mixed_payload	488 (base 497, Δ-9)	0 (base 0, Δ+0)	0 (base 0, Δ+0)
short_conversation	487 (base 496, Δ-9)	0 (base 0, Δ+0)	0 (base 0, Δ+0)
stress_conversation	6590 (base 6651, Δ-61)	11 (base 7, Δ+4)	4 (base 4, Δ+0)
tiny_payload	256 (base 267, Δ-11)	0 (base 0, Δ+0)	0 (base 0, Δ+0)

Numbers come from make benchmark / make benchmark-matrix.
Latency is hardware-dependent — treat the markers as a rough guide.
See benchmarks/scorecard.md for the full picture.

… flag

The --print-version flag (added for the publish.yml release-integrity tag-gate, #468) had no test. A regression in its output, such as a trailing banner or extra line, would silently break the `[ "$tag" != "v$version" ]` comparison and either block a valid release or pass a mistagged one.

Add a capsys test asserting `main(["--print-version"])` prints the bare pyproject version (exactly `<version>\n`) and returns 0.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01HCfLwmKtfqovpiFDuRuKea
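
The shape of such a test might look like the following. This is a sketch under assumptions: `main` here is a hypothetical stand-in for the script's entry point, the version constant replaces reading pyproject.toml, and `contextlib.redirect_stdout` is used in place of pytest's `capsys` fixture so the example runs with the standard library alone.

```python
import contextlib
import io

PYPROJECT_VERSION = "0.16.0"  # stand-in for the value read from pyproject.toml


def main(argv: list[str]) -> int:
    """Hypothetical stand-in for check_readme_version.py's entry point."""
    if "--print-version" in argv:
        print(PYPROJECT_VERSION)  # the bare version, nothing else
        return 0
    return 1  # other modes omitted from this sketch


def test_print_version_is_bare() -> None:
    """The output must be exactly '<version>\\n' and the exit code 0."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        exit_code = main(["--print-version"])
    assert exit_code == 0
    assert buffer.getvalue() == f"{PYPROJECT_VERSION}\n"


if __name__ == "__main__":
    test_print_version_is_bare()
```

Asserting on the full captured string, rather than on a substring, is what catches a stray banner or extra line.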

Copilot AI review requested due to automatic review settings June 22, 2026 07:37

Copilot started reviewing on behalf of dgenio June 22, 2026 07:37

Copilot AI reviewed Jun 22, 2026

Comment thread scripts/check_security_policy.py Outdated

Comment thread docs/security_tooling.md Outdated

Comment thread .github/workflows/publish.yml Outdated

claude added 2 commits June 22, 2026 07:45

dgenio merged commit d043fa0 into main Jun 22, 2026
13 checks passed

dgenio deleted the claude/issue-triage-grouping-6pwo4k branch June 22, 2026 20:24

dgenio merged 4 commits into main from claude/issue-triage-grouping-6pwo4k
