feat: contributor mode, session awareness, universal RECOMMENDATION format (v0.3.11) by garrytan · Pull Request #82 · garrytan/gstack

garrytan · 2026-03-16T01:46:29Z

Summary

- Contributor mode — gstack_contributor: true auto-files field reports to ~/.gstack/contributor-logs/ when gstack misbehaves. Casual format ("hey gstack team, ran into this..."), opens report for review, max 3/session, deduped by slug.
- Concurrent session tracking — detects 3+ active gstack sessions in 2h window, enters "ELI16 mode" where every AskUserQuestion re-grounds the user on project/branch/task/question.
- Universal RECOMMENDATION format — all skills now follow: context → question → RECOMMENDATION: Choose X because ___ → options. Plan-review skills reference this baseline.
- Enum & Value Completeness — new CRITICAL review category in /review that traces new enum values through every consumer outside the diff.
- Renamed {{UPDATE_CHECK}} → {{PREAMBLE}} across all 10 templates.
- DRY'd plan-ceo-review and plan-eng-review AskUserQuestion rules.
- Rewrote CONTRIBUTING.md with contributor workflow and cross-project testing guide.
- Added vendored symlink awareness to CLAUDE.md.
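The contributor-mode limits in the summary (max 3 reports per session, deduped by slug) could be enforced roughly as follows. This is a minimal sketch, not the gstack implementation: `ReportLimiter`, `slugify`, `shouldFile`, and the slug rule itself are hypothetical names and assumptions introduced for illustration.

```typescript
// Hypothetical sketch: cap field reports per session and skip duplicate slugs.
const MAX_REPORTS_PER_SESSION = 3; // "max 3/session" from the summary

// Turn a free-form title into a filename-safe slug (assumed slug rule).
function slugify(title: string): string {
  return title
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-") // collapse anything non-alphanumeric into one dash
    .replace(/^-+|-+$/g, ""); // trim leading and trailing dashes
}

class ReportLimiter {
  private readonly seen = new Set<string>();

  // Returns the slug when the report should be filed,
  // or null when it is a duplicate or the session cap is reached.
  shouldFile(title: string): string | null {
    const slug = slugify(title);
    if (this.seen.has(slug)) return null;
    if (this.seen.size >= MAX_REPORTS_PER_SESSION) return null;
    this.seen.add(slug);
    return slug;
  }
}
```

Keeping the seen-set in memory matches a per-session cap; a persistent dedupe across sessions would need to check the log directory instead.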

Pre-Landing Review

No issues found. Changes are documentation, skill templates, generator script, and tests — no SQL, no database ops, no LLM output handling.

TODOS

No TODO items completed in this PR.

Test plan

- All static tests pass (95 tests, 642 expect() calls, 0 failures)
- Generated SKILL.md files are fresh (dry-run verified)
- No unresolved {{UPDATE_CHECK}} placeholders remain in any template
- New tests: contributor mode check, session awareness check in generated output

🤖 Generated with Claude Code

Design doc for Supabase-backed team data store and universal eval infrastructure. Covers architecture, credential storage, eval formats, YAML test case spec, Supabase schema, phased rollout, and security model. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

…e-store

…k-config

- Replace project-specific references with generic language
- Add missing fields to eval result format: prompt_sha, by_category, timestamp, response_preview
- Enrich failure format with details array, scores dict, expectation_type
- Add EVAL_JUDGE_CACHE, EVAL_VERBOSE, multiprocess worker support, dedup on push, run scopes, model aliases, judge profiles
- Restructure credential storage to 4 layers with gstack-config (v0.3.9) for user preferences (sync_enabled, sync_transcripts)
- Update integration points, observability, and reuse map

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

DRY up atomicWriteSync, readJSON, getGitInfo, getVersion, getRemoteSlug, and sanitizeForFilename from eval-store.ts, session-runner.ts, and eval-watch.ts into a shared module. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
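The commit names `atomicWriteSync` and `readJSON` but the page does not show their bodies. A minimal sketch of what such helpers commonly look like, assuming the usual write-to-temp-then-rename pattern and a null-on-failure JSON reader (both assumptions, not confirmed by the source):

```typescript
import { mkdtempSync, readFileSync, renameSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Write to a sibling temp file, then rename over the target.
// On the same filesystem a rename replaces the target in one step,
// so readers see either the old file or the new one, not a partial write.
function atomicWriteSync(path: string, contents: string): void {
  const tmpPath = `${path}.tmp`; // assumed temp-file naming
  writeFileSync(tmpPath, contents);
  renameSync(tmpPath, path);
}

// Read and parse JSON, returning null for a missing or corrupt file (assumed behavior).
function readJSON<T>(path: string): T | null {
  try {
    return JSON.parse(readFileSync(path, "utf8")) as T;
  } catch {
    return null;
  }
}

// Usage: write a result file into a scratch directory and read it back.
const scratch = mkdtempSync(join(tmpdir(), "gstack-util-"));
const target = join(scratch, "result.json");
atomicWriteSync(target, JSON.stringify({ passed: 3 }));
```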

- lib/sync-config.ts: reads .gstack-sync.json + ~/.gstack/auth.json
- lib/auth.ts: device auth flow (browser OAuth, local HTTP callback)
- lib/sync.ts: Supabase push/pull via raw fetch(), offline queue, cache
- lib/cli-sync.ts: CLI handler for gstack-sync commands
- bin/gstack-sync: bash wrapper (setup, status, push-*, pull, drain)
- .gstack-sync.json.example: template for team setup

Zero new dependencies — uses raw fetch() against PostgREST API. All sync is non-fatal with 5s timeout and offline queue fallback.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

- 001_teams.sql: teams + team_members + RLS
- 002_eval_runs.sql: eval results with universal format, indexes, upsert key
- 003_data_tables.sql: retro, QA, ship, greptile, transcripts + RLS

All tables use RLS: team members read/insert, admins delete. Transcript table has tighter policy (admin-only read).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

- eval-store.ts: import shared getGitInfo/getVersion, add pushEvalRun() hook in finalize() (non-blocking, non-fatal)
- session-runner.ts: import shared atomicWriteSync/sanitizeForFilename
- eval-store.test.ts: fix pre-existing bug in double-finalize test (was counting _partial file)
- 30 new tests for lib/util, lib/sync-config, lib/sync

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

DRY up eval I/O duplicated across scripts/eval-list.ts, eval-compare.ts, and eval-summary.ts. Adds EVAL_DIR constant, formatTimestamp(), listEvalFiles(), loadEvalResults() with --limit support. 13 new tests. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

- lib/eval-format.ts: StandardEvalResult interfaces, validateEvalResult(), normalizeFromLegacy/normalizeToLegacy round-trip converters
- lib/eval-tier.ts: EvalTier type, resolveTier/resolveJudgeTier from env, tierToModel mapping, TIER_ALIASES (haiku→fast, sonnet→standard, opus→full)
- lib/eval-cost.ts: MODEL_PRICING (last verified 2025-05-01), computeCosts(), formatCostDashboard(), aggregateCosts(), fallback for unknown models
- 42 tests across 3 test files

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
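The tier aliasing described for lib/eval-tier.ts can be pictured with a small sketch. The alias table (haiku→fast, sonnet→standard, opus→full) and the `EVAL_TIER` variable come from the commit messages on this page; the function signature, the case-insensitive matching, and the `standard` default are assumptions made for illustration.

```typescript
type EvalTier = "fast" | "standard" | "full";

// Alias table as listed in the commit message.
const TIER_ALIASES: Record<string, EvalTier> = {
  haiku: "fast",
  sonnet: "standard",
  opus: "full",
};

// Resolve a tier from an env-like object. Taking the env as a parameter
// (rather than reading process.env directly) keeps the function easy to test.
function resolveTier(
  env: Record<string, string | undefined>,
  fallback: EvalTier = "standard", // assumed default
): EvalTier {
  const raw = env["EVAL_TIER"]?.trim().toLowerCase();
  if (!raw) return fallback;
  if (raw === "fast" || raw === "standard" || raw === "full") return raw;
  return TIER_ALIASES[raw] ?? fallback; // unknown values fall back (assumed)
}
```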

Cache at ~/.gstack/eval-cache/{suite}/{sha}.json. Compute cache keys from source file contents + test input via Bun.CryptoHasher SHA256. Supports read/write/stats/clear/verify operations. EVAL_CACHE=0 skips reads for force-rerun. 16 tests including corrupt JSON handling. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
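A minimal sketch of the cache-key scheme described above: hash the source file contents plus the test input, then store under `~/.gstack/eval-cache/{suite}/{sha}.json`. The commit uses `Bun.CryptoHasher`; this sketch swaps in `node:crypto` so it runs outside Bun as well. The function names, the NUL separator, and the JSON serialization of the input are assumptions.

```typescript
import { createHash } from "node:crypto";
import { homedir } from "node:os";
import { join } from "node:path";

// SHA-256 over every source file's contents plus the serialized test input.
function cacheKey(sourceContents: string[], testInput: unknown): string {
  const hash = createHash("sha256");
  for (const contents of sourceContents) {
    hash.update(contents);
    hash.update("\0"); // separator so ["ab", "c"] and ["a", "bc"] hash differently
  }
  hash.update(JSON.stringify(testInput) ?? "null");
  return hash.digest("hex");
}

// Path layout from the commit message: ~/.gstack/eval-cache/{suite}/{sha}.json
function cachePath(suite: string, sha: string): string {
  return join(homedir(), ".gstack", "eval-cache", suite, `${sha}.json`);
}

// EVAL_CACHE=0 skips reads to force a rerun; anything else allows them.
function shouldReadCache(env: Record<string, string | undefined>): boolean {
  return env["EVAL_CACHE"] !== "0";
}
```

Because the key covers file contents rather than paths or timestamps, editing any hashed source file invalidates the entry without extra bookkeeping.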

- lib/cli-eval.ts: routes to list/compare/summary/push/cost/cache/watch subcommands. Ports logic from 4 separate scripts into unified entry. Adds ANSI color for TTY (respects NO_COLOR), --limit flag for list.
- bin/gstack-eval: bash wrapper matching bin/gstack-sync pattern
- package.json: eval:* scripts now point to lib/cli-eval.ts
- supabase/migrations/004_eval_costs.sql: per-model cost tracking + RLS
- docs/eval-result-format.md: public format spec for any language
- test/lib-eval-cli.test.ts: integration tests (spawn CLI subprocess) including 3 push failure modes (file-not-found, invalid schema, sync unavailable)

215 tests passing across 13 files.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Extract per-model token usage from resultLine.modelUsage (including cache tokens and exact API cost), flow CostEntry[] through EvalCollector, aggregate in finalize(). Extend CostEntry with cache_read_input_tokens, cache_creation_input_tokens, cost_usd. computeCosts() prefers exact cost_usd over MODEL_PRICING when available (~4x more accurate with prompt caching). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
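The "prefer exact cost over the pricing table" rule can be sketched as below. The fields `cache_read_input_tokens`, `cache_creation_input_tokens`, and `cost_usd` are named in the commit; `input_tokens`, `output_tokens`, the function names, and every price in `EXAMPLE_PRICING` are placeholders invented for this illustration and are not real model prices.

```typescript
interface CostEntry {
  model: string;
  input_tokens: number; // assumed field name
  output_tokens: number; // assumed field name
  cache_read_input_tokens?: number;
  cache_creation_input_tokens?: number;
  cost_usd?: number; // exact cost reported by the API, when available
}

// PLACEHOLDER prices in USD per million tokens; illustrative numbers only.
const EXAMPLE_PRICING: Record<string, { input: number; output: number }> = {
  "example-model": { input: 1, output: 2 },
};
const FALLBACK_PRICING = { input: 1, output: 2 }; // used for unknown models

// Prefer the exact cost; estimate from the table only when it is missing.
function entryCost(entry: CostEntry): number {
  if (entry.cost_usd !== undefined) return entry.cost_usd;
  const price = EXAMPLE_PRICING[entry.model] ?? FALLBACK_PRICING;
  return (entry.input_tokens * price.input + entry.output_tokens * price.output) / 1_000_000;
}

function totalCost(entries: CostEntry[]): number {
  return entries.reduce((sum, entry) => sum + entryCost(entry), 0);
}
```

The table-based estimate here ignores cache tokens entirely, which is one reason an exact `cost_usd` can differ substantially from it when prompt caching is active.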

callJudge/judge now return {result, meta} with SHA-based caching (~$0.18/run savings when SKILL.md unchanged) and dynamic model selection via EVAL_JUDGE_TIER env var. E2E tests pass --model from EVAL_TIER to claude -p. outcomeJudge retains simple return type. All 8 LLM eval test sites updated with real costs and costs[]. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

computeTrends() classifies tests as stable-pass/stable-fail/flaky/improving/degrading based on pass rate, flip count, and recent streak. gstack eval trend shows sparkline table with --limit, --tier, --test filters. Guard CLI main block with import.meta.main to prevent execution on import. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
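To make the five labels concrete, here is one way a classifier over pass rate, flip count, and recent streak might look. The labels and the three signals come from the commit message; the thresholds (3 flips for flaky, a streak of 2 for improving/degrading) and the name `classifyTrend` are guesses for illustration, and the real computeTrends() may weigh these differently.

```typescript
type Trend = "stable-pass" | "stable-fail" | "flaky" | "improving" | "degrading";

// history is ordered oldest to newest; true means the test passed.
function classifyTrend(history: boolean[]): Trend {
  if (history.length === 0) throw new Error("history must not be empty");

  const passRate = history.filter(Boolean).length / history.length;

  // Flip count: how many times the outcome changed between consecutive runs.
  let flips = 0;
  for (let i = 1; i < history.length; i++) {
    if (history[i] !== history[i - 1]) flips++;
  }

  // Recent streak: how many of the latest runs share the latest outcome.
  const last = history[history.length - 1];
  let streak = 1;
  for (let i = history.length - 2; i >= 0 && history[i] === last; i--) streak++;

  if (passRate === 1) return "stable-pass";
  if (passRate === 0) return "stable-fail";
  if (flips >= 3) return "flaky"; // assumed threshold
  if (streak >= 2) return last ? "improving" : "degrading"; // assumed threshold
  return "flaky";
}
```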

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Explain what team sync gives you, that it's optional, and how to set it up. Points to TEAM_COORDINATION_STORE.md for full guide. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Remove _comment hacks from JSON example file. Add short team sync section to README explaining what it is, that it's optional, and how to set it up. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

…ormat

- Rename {{UPDATE_CHECK}} → {{PREAMBLE}} across all 10 skill templates
- Add session tracking (touch ~/.gstack/sessions/$PPID, count active sessions)
- ELI16 mode when 3+ concurrent sessions detected (re-ground user on context)
- Contributor mode: auto-file field reports to ~/.gstack/contributor-logs/
- Universal AskUserQuestion format: context → question → RECOMMENDATION → options
- Update plan-ceo-review and plan-eng-review to reference preamble baseline
- Add vendored symlink awareness section to CLAUDE.md
- Rewrite CONTRIBUTING.md with contributor workflow and cross-project testing
- Add tests for contributor mode and session awareness in generated output
- Add E2E eval for contributor mode report filing

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
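The session tracking described in this commit (touch a per-session file, count the active ones, switch to ELI16 mode at 3+) happens in the skill preamble. The TypeScript below only illustrates the same logic under stated assumptions: "active" is taken to mean a file modified within the 2h window mentioned in the PR summary, and all function names are hypothetical.

```typescript
import { mkdirSync, mkdtempSync, readdirSync, statSync, utimesSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

const WINDOW_MS = 2 * 60 * 60 * 1000; // 2h window from the PR summary
const ELI16_THRESHOLD = 3; // "3+ active gstack sessions"

// Equivalent of `touch <dir>/<sessionId>`: create the file if needed and set its mtime.
function touchSession(dir: string, sessionId: string, now: Date = new Date()): void {
  mkdirSync(dir, { recursive: true });
  const file = join(dir, sessionId);
  writeFileSync(file, "", { flag: "a" }); // append mode creates without truncating
  utimesSync(file, now, now);
}

// Count session files whose mtime falls inside the window.
function countActiveSessions(dir: string, now: Date = new Date()): number {
  return readdirSync(dir).filter((name) => {
    const ageMs = now.getTime() - statSync(join(dir, name)).mtimeMs;
    return ageMs <= WINDOW_MS;
  }).length;
}

function isEli16Mode(dir: string, now: Date = new Date()): boolean {
  return countActiveSessions(dir, now) >= ELI16_THRESHOLD;
}

// Usage: simulate three concurrent sessions in a scratch directory
// (a stand-in for ~/.gstack/sessions).
const sessionsDir = mkdtempSync(join(tmpdir(), "gstack-sessions-"));
for (const id of ["101", "102", "103"]) touchSession(sessionsDir, id);
console.log(isEli16Mode(sessionsDir));
```

Using file mtimes means stale sessions age out on their own, with no cleanup step needed when a session exits uncleanly.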

…to garrytan/dev-mode

Extract pushWithSync() helper to eliminate boilerplate across 6 push functions. Add pushHeartbeat() for connectivity testing. Add push-greptile to CLI. New commands: gstack-sync test (validates full push/pull flow via sync_heartbeats table), gstack-sync show (terminal team data dashboard with summary/evals/ships/retros views). Guard main block with import.meta.main. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Add non-fatal sync steps to all 4 skill templates:
- /ship Step 8.5: write ship log JSON + push after PR creation
- /retro Step 13: push snapshot after JSON save
- /qa Phase 6.7: write qa-sync.json + push after health score
- greptile-triage: push each triage entry after history file writes

All calls use || true for zero disruption. Silent when sync not configured.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

- 005_sync_heartbeats.sql migration for connectivity testing
- eval:trend --team flag pulls team eval data (graceful fallback)
- docs/TEAM_SYNC_SETUP.md step-by-step setup guide
- Design doc status updated to Phase 2 complete
- 10 new tests for sync show formatting functions

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

New CRITICAL review category that traces new enum values, status strings, and type constants through every consumer outside the diff. Catches the class of bugs where a new value is added but not handled in all switch/case chains, allowlists, or frontend-backend contracts. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
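The bug class this category targets is easiest to see in a switch over a union. The sketch below is not how /review works (that is a review checklist applied to the diff and its consumers); it shows a complementary TypeScript pattern, an exhaustiveness check via `never`, that turns an unhandled new value into a compile error. `ShipStatus` and `statusLabel` are hypothetical names.

```typescript
// Hypothetical status union; imagine "superseded" was just added in a diff.
type ShipStatus = "draft" | "open" | "merged" | "superseded";

// If a case is missing, `value` is not narrowed to never and the call fails to compile.
function assertNever(value: never): never {
  throw new Error(`Unhandled value: ${String(value)}`);
}

function statusLabel(status: ShipStatus): string {
  switch (status) {
    case "draft":
      return "Draft";
    case "open":
      return "Open";
    case "merged":
      return "Merged";
    case "superseded":
      return "Superseded"; // deleting this case makes the default branch a type error
    default:
      return assertNever(status);
  }
}
```

The compiler only catches consumers written this way. Allowlists, string comparisons, and frontend-backend contracts still need the manual trace through every consumer that the review category describes.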

…to garrytan/dev-mode

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

…richment

Add session intelligence pipeline for team transcript sync:
- lib/transcript-sync.ts: parse history.jsonl, enrich with Claude session file data (tools_used, full turn count), sync marker management, 10-concurrent push with 5-concurrent Haiku summarization
- lib/llm-summarize.ts: raw fetch() to Anthropic Messages API (no SDK dep), retry-after on 429, exponential backoff on 5xx, SHA-based eval-cache
- lib/sync.ts: pushTranscript() and pullTranscripts() following existing patterns
- 006_transcript_sync.sql: unique index on (team_id, session_id) for idempotent upsert, RLS changed from admin-only to team-wide read

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
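The retry policy named for lib/llm-summarize.ts (honor retry-after on 429, exponential backoff on 5xx) can be sketched independently of any HTTP client. Everything below is an assumption for illustration: the function names, the 500 ms base delay, the 4-attempt cap, and treating `Retry-After` as a number of seconds (the header may also carry an HTTP date, which this sketch falls back on the base delay for).

```typescript
interface MinimalResponse {
  status: number;
  headers: { get(name: string): string | null };
}

// Decide how long to wait before retrying, or null if the response is final.
// `attempt` is zero-based: 0 for the first response received.
function retryDelayMs(
  status: number,
  retryAfter: string | null,
  attempt: number,
  baseDelayMs = 500, // assumed base delay
): number | null {
  if (status === 429) {
    const seconds = Number(retryAfter);
    const usable = retryAfter !== null && Number.isFinite(seconds) && seconds > 0;
    return usable ? seconds * 1000 : baseDelayMs;
  }
  if (status >= 500) return baseDelayMs * 2 ** attempt; // 500, 1000, 2000, ...
  return null; // success or a non-retryable client error
}

// Drive the policy with injected `send` and `sleep` so no real network or timers are required.
async function sendWithRetry(
  send: () => Promise<MinimalResponse>,
  sleep: (ms: number) => Promise<void>,
  maxAttempts = 4, // assumed cap
): Promise<MinimalResponse> {
  for (let attempt = 0; ; attempt++) {
    const response = await send();
    const delay = retryDelayMs(response.status, response.headers.get("retry-after"), attempt);
    if (delay === null || attempt + 1 >= maxAttempts) return response;
    await sleep(delay);
  }
}
```

In real use, `send` would wrap a `fetch()` call and `sleep` would wrap `setTimeout`; a `fetch` `Response` already satisfies `MinimalResponse`.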

…ests

- cli-sync.ts: push-transcript command, show sessions with formatSessionTable(), upgrade cmdSetup() to interactively create .gstack-sync.json if missing
- bin/gstack-sync: add push-transcript case and help text
- test/lib-llm-summarize.test.ts: 10 tests with mocked fetch (429 retry, 5xx backoff, malformed response, no API key, cache)
- test/lib-transcript-sync.test.ts: 22 tests for parsing, grouping, session file extraction, marker management, slug resolution
- test/lib-sync-show.test.ts: 4 tests for formatSessionTable

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

- setup-team-sync/SKILL.md.tmpl: idempotent guided setup (create config, OAuth, verify connectivity, configure settings, summary)
- ship/retro/qa SKILL.md.tmpl: add push-transcript hook after existing push-ship/push-retro/push-qa hooks (silent, non-fatal)
- scripts/gen-skill-docs.ts: add setup-team-sync to template list
- Regenerated all SKILL.md files

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

…weekly digest Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

…to garrytan/dev-mode

garrytan · 2026-03-16T05:34:50Z

Superseded by clean branch — this one carried ~8600 lines of team-supabase-store code. See new PR.

garrytan and others added 27 commits March 15, 2026 01:32

Merge remote-tracking branch 'origin/main' into garrytan/team-supabas…

8931165

…e-store

chore: update gitignore

33c9552

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

chore: bump v0.3.10, update CHANGELOG and docs

e280333

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

docs: add setup comments to .gstack-sync.json.example

eb7ef21

Explain what team sync gives you, that it's optional, and how to set it up. Points to TEAM_COORDINATION_STORE.md for full guide. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

docs: CHANGELOG covers full branch scope including team sync

1432046

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

docs: clean up sync example, add team sync section to README

704fe34

Remove _comment hacks from JSON example file. Add short team sync section to README explaining what it is, that it's optional, and how to set it up. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Merge remote-tracking branch 'origin/garrytan/team-supabase-store' in…

c11cb70

…to garrytan/dev-mode

Merge remote-tracking branch 'origin/garrytan/team-supabase-store' in…

b07e842

…to garrytan/dev-mode

chore: bump version and changelog (v0.3.11)

2d42e15

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

patovaldezf approved these changes Mar 16, 2026

garrytan and others added 2 commits March 16, 2026 00:15

garrytan and others added 3 commits March 16, 2026 00:15

docs: add team sync TODOs — streaming parser, effectiveness scoring, …

6e14689

…weekly digest Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Merge remote-tracking branch 'origin/garrytan/team-supabase-store' in…

cce407b

…to garrytan/dev-mode

garrytan changed the base branch from garrytan/team-supabase-store to main March 16, 2026 05:27

garrytan closed this Mar 16, 2026

garrytan mentioned this pull request Mar 16, 2026

feat: contributor mode, session awareness, recommendation format #90

Merged

3 tasks

garrytan wants to merge 32 commits into main from garrytan/dev-mode
