Merged
62 commits
ea40665
docs(29): capture phase context
SimplicityGuy May 16, 2026
878bc00
docs(state): record phase 29 context session
SimplicityGuy May 16, 2026
3ebe042
docs(29): UI design contract for Agents admin page
SimplicityGuy May 16, 2026
198b29e
docs(29): mark UI-SPEC approved after checker verification
SimplicityGuy May 16, 2026
6b3be89
docs(29): research deployment hardening + agents admin
SimplicityGuy May 16, 2026
857e98c
docs(29): add validation strategy
SimplicityGuy May 16, 2026
7465570
docs(29): pattern mapping for deployment hardening + agents admin
SimplicityGuy May 16, 2026
a86ba43
docs(29): create phase plan
SimplicityGuy May 16, 2026
ffdbf5f
test(29-01): add failing tests for cert_bootstrap + Postgres-free guard
SimplicityGuy May 16, 2026
5840bfe
feat(29-01): implement cert_bootstrap + entrypoint shim
SimplicityGuy May 16, 2026
57d9843
test(29-01): add TLS integration tests + fix CA chain extensions
SimplicityGuy May 16, 2026
25c4ca4
feat(29-01): wire verify= through PhazeAgentClient + AgentSettings
SimplicityGuy May 16, 2026
a5eeb56
docs(29-01): complete cert-bootstrap + agent-TLS plan
SimplicityGuy May 16, 2026
c625d22
chore(29-01): merge executor worktree (cert-bootstrap + TLS verify)
SimplicityGuy May 16, 2026
4b95029
test(29-02): add failing tests for AgentSettings agent_env + redis-pa…
SimplicityGuy May 16, 2026
a7741ff
feat(29-02): enforce passworded Redis URL on AgentSettings in product…
SimplicityGuy May 16, 2026
ffbd20b
docs(29-02): complete agent-side redis password validator plan
SimplicityGuy May 16, 2026
359b1a5
chore(29-02): merge executor worktree (redis password validator)
SimplicityGuy May 16, 2026
c560ee5
test(29-03): add failing YAML-parse tests for app-server compose isol…
SimplicityGuy May 16, 2026
149de70
feat(29-03): harden app-server compose — strip file mounts, lock down…
SimplicityGuy May 16, 2026
ca2a5c5
docs(29-03): complete app-server compose hardening plan
SimplicityGuy May 16, 2026
a79395d
chore(29-03): merge executor worktree (app-server compose hardening)
SimplicityGuy May 16, 2026
6800931
feat(29-05): add phaze.scripts.download_models helper + bash shim (D-21)
SimplicityGuy May 16, 2026
4ccd283
feat(29-05): wire ensure_models_present into agent_worker startup (D-21)
SimplicityGuy May 16, 2026
956fcf3
docs(29-05): complete models auto-download plan
SimplicityGuy May 16, 2026
ad71ab1
chore(29-05): merge executor worktree (models auto-download)
SimplicityGuy May 16, 2026
d385c5a
test(29-07): add failing tests for agent_liveness + humanize (RED)
SimplicityGuy May 16, 2026
e4bcf1b
feat(29-07): implement agent liveness classifier + humanize helper (G…
SimplicityGuy May 16, 2026
735b4e3
test(29-07): add failing tests for /admin/agents router (RED)
SimplicityGuy May 16, 2026
06cd701
feat(29-07): admin/agents router + templates + nav link (GREEN)
SimplicityGuy May 16, 2026
927ae3a
docs(29-07): complete admin/agents UI plan
SimplicityGuy May 16, 2026
a0ea852
chore(29-07): merge executor worktree (admin agents page)
SimplicityGuy May 16, 2026
a7a1270
fix(29-w1): update test_shared_agent_bootstrap for D-03 CA-file gate
SimplicityGuy May 16, 2026
768e77e
docs(phase-29): update tracking after wave 1
SimplicityGuy May 16, 2026
b1c5620
test(29-04): add failing YAML-parse tests for docker-compose.agent.ym…
SimplicityGuy May 16, 2026
48ad8c1
test(29-06): add failing tests for SAQ heartbeat cron handler
SimplicityGuy May 16, 2026
ae45925
feat(29-04): create docker-compose.agent.yml + .env.example.agent (GR…
SimplicityGuy May 16, 2026
0e78658
test(29-04): add failing workflow-tag check (WARNING-4 RED)
SimplicityGuy May 16, 2026
93e550b
feat(29-04): extend docker-publish.yml tag strategy + realign api URL…
SimplicityGuy May 16, 2026
8b206ca
docs(29-04): complete docker-compose.agent.yml + GHCR-tag verificatio…
SimplicityGuy May 16, 2026
afbf048
feat(29-06): wire SAQ 30s heartbeat cron handler (OPS-04 caller)
SimplicityGuy May 16, 2026
7b5da19
docs(29-06): summary for OPS-04 heartbeat caller plan
SimplicityGuy May 16, 2026
b785c1e
chore(29-04): merge executor worktree (agent compose template)
SimplicityGuy May 16, 2026
00d7924
chore(29-04): merge executor worktree (agent compose template)
SimplicityGuy May 16, 2026
c3e7b41
chore(29-06): merge executor worktree (heartbeat cron)
SimplicityGuy May 16, 2026
d60bde2
docs(phase-29): update tracking after wave 2
SimplicityGuy May 16, 2026
a7e9df3
docs(29-08): justfile recipes, deployment.md, PROJECT.md, update-proj…
SimplicityGuy May 17, 2026
10cd366
docs(29-08): close Task 2 with verified-docs-only signal (SUMMARY)
SimplicityGuy May 17, 2026
19fdc07
chore(29-08): merge executor worktree (docs + operator workflow)
SimplicityGuy May 17, 2026
3b37c30
docs(phase-29): update tracking after wave 3
SimplicityGuy May 17, 2026
0d309b6
test(29): persist human verification items as UAT
SimplicityGuy May 17, 2026
2fb533c
fix(29-cr-01,cr-02): production https:// guard + bind PHAZE_REDIS_URL…
SimplicityGuy May 17, 2026
970e19a
fix(29-cr-03): model_bootstrap rejects partial-download state
SimplicityGuy May 17, 2026
fa97307
docs(phase-29): complete phase execution
SimplicityGuy May 17, 2026
0555b1d
docs(milestone-v4.0): audit report — passed with documentation drift
SimplicityGuy May 17, 2026
2c4639a
docs(milestone-v4.0): close documentation drift surfaced by audit
SimplicityGuy May 17, 2026
d02cb3b
docs(phase-29): add REVIEW.md from gsd-code-reviewer audit
SimplicityGuy May 17, 2026
9429b1e
chore: archive v4.0 milestone files
SimplicityGuy May 17, 2026
d4c4d5c
chore: remove REQUIREMENTS.md for v4.0 milestone close
SimplicityGuy May 17, 2026
aa750eb
docs: update retrospective for v4.0 milestone
SimplicityGuy May 17, 2026
7724471
fix(ci): unbreak Phase 29 CI — 3 root causes
SimplicityGuy May 17, 2026
ff8d571
test(29): close codecov patch-coverage gaps on PR #63
SimplicityGuy May 17, 2026
18 changes: 18 additions & 0 deletions .env.example
@@ -26,6 +26,24 @@ DEBUG=false
API_HOST=0.0.0.0
API_PORT=8000

# =====================================================================
# Phase 29: Redis hardening (D-05)
# =====================================================================
# Required password for redis-server --requirepass. Fresh dev clones can
# use the placeholder; production MUST set a strong unique value.
REDIS_PASSWORD=changeme
# Interface to bind redis :6379 on. Dev = loopback. Production = LAN IP
# (e.g., 192.168.1.10) so agents on other hosts can reach it.
REDIS_BIND_IP=127.0.0.1

# =====================================================================
# Phase 29: HTTPS via internal CA (D-02)
# =====================================================================
# Comma-separated SAN list for the auto-generated leaf cert. Defaults
# include `api` (docker compose service-name DNS) for single-host dev.
# Production should add the app-server's LAN hostname / IP.
PHAZE_API_TLS_SANS=localhost,127.0.0.1,api

# File discovery - mounted music directory for scanning
SCAN_PATH=/data/music

77 changes: 77 additions & 0 deletions .env.example.agent
@@ -0,0 +1,77 @@
# Phaze file-server agent .env template (Phase 29 D-23)
#
# Copy this file to `.env` on the file-server host. The compose file
# (docker-compose.agent.yml) uses `${VAR:?msg}` fail-fast interpolation for
# required variables, so a missing value causes `docker compose up` to error
# at parse time -- you cannot accidentally bring up the stack misconfigured.

# =====================================================================
# Image tag (Phase 29 D-16)
# =====================================================================
# `latest` is the default for first-time setup. PRODUCTION OPERATORS SHOULD
# PIN to a specific version tag, e.g.:
# PHAZE_IMAGE_TAG=v4.0.0
# The docker-publish.yml workflow publishes both `:latest` and `:v<version>`
# tags on each tagged release, verified by an automated test in CI.
PHAZE_IMAGE_TAG=latest

# =====================================================================
# Application server URL (Phase 29 D-01)
# =====================================================================
# MUST be HTTPS -- the agent client refuses http:// URLs in production
# (AgentSettings._enforce_https_in_production guard, Phase 29 Plan 02).
# Replace <app-server-ip> with the app-server's LAN IP or hostname.
PHAZE_AGENT_API_URL=https://<app-server-ip>:8000

# =====================================================================
# Redis URL (Phase 29 AUTH-03)
# =====================================================================
# Phase 29 D-05 hardened the app-server Redis with --requirepass; agents
# MUST include the password in the URL. Production refuses passwordless
# redis_url (AgentSettings._enforce_redis_password_in_production).
# Replace <REDIS_PASSWORD> with the app-server's REDIS_PASSWORD value.
PHAZE_REDIS_URL=redis://default:<REDIS_PASSWORD>@<app-server-ip>:6379/0

# =====================================================================
# Agent identity (Phase 25 AUTH-01)
# =====================================================================
# Provisioned via psql on the app-server. The token's sha256 hash is
# stored in the `agents` table; this file holds the cleartext bearer.
# Token format: phaze_agent_<32 urlsafe-base64 bytes>.
PHAZE_AGENT_ID=fileserver-east
PHAZE_AGENT_TOKEN=phaze_agent_<32urlsafe>
PHAZE_AGENT_QUEUE=phaze-agent-fileserver-east

# =====================================================================
# CA cert (Phase 29 D-03)
# =====================================================================
# Path INSIDE the container (the bind-mount at $CA_PATH:/certs:ro makes
# the operator-copied CA cert available). After `phaze.cert_bootstrap`
# runs on the app-server, scp ./certs/phaze-ca.crt from the app-server
# to this host and place at $CA_PATH on the file-server.
PHAZE_AGENT_CA_FILE=/certs/phaze-ca.crt

# =====================================================================
# Environment posture (Phase 29 D-06)
# =====================================================================
# `production` triggers the agent-side hardening guards (TLS enforcement,
# Redis password enforcement). Use `development` only on a local test rig.
PHAZE_AGENT_ENV=production

# =====================================================================
# File-server local paths
# =====================================================================
# Music files to scan. REQUIRED -- docker compose up fails if unset.
SCAN_PATH=/data/music
# Essentia model weights. Auto-downloaded on first start by the worker
# (D-21). rw so the worker container can populate this on first boot.
MODELS_PATH=./models
# Operator-distributed CA cert directory. ro inside the container.
CA_PATH=./certs

# =====================================================================
# Scan roots (Phase 27 watcher / Phase 25 path traversal containment)
# =====================================================================
# Comma-separated list of absolute filesystem paths the agent is permitted
# to read/write. Used by execute_approved_batch for path-traversal containment.
PHAZE_AGENT_SCAN_ROOTS=/data/music,/data/concerts
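
The template above names two production guards on `AgentSettings` (HTTPS enforcement and Redis password enforcement), but their bodies are not part of this diff. A minimal sketch of the checks they describe, assuming plain environment-variable input, might look like the following. The function name `check_agent_env` and the error strings are hypothetical, and the real validators may differ.

```python
from urllib.parse import urlsplit


def check_agent_env(env: dict[str, str]) -> list[str]:
    """Return the problems found in an agent environment mapping.

    Hypothetical helper: it only mirrors the behaviour described in the
    .env.example.agent comments, not the real AgentSettings validators.
    """
    problems: list[str] = []

    # Assumption: the guards only fire when the posture is exactly "production".
    if env.get("PHAZE_AGENT_ENV") != "production":
        return problems

    # The agent client is described as refusing http:// URLs in production.
    api = urlsplit(env.get("PHAZE_AGENT_API_URL", ""))
    if api.scheme != "https":
        problems.append("PHAZE_AGENT_API_URL must use https:// in production")

    # A passwordless redis_url is described as rejected in production.
    redis = urlsplit(env.get("PHAZE_REDIS_URL", ""))
    if not redis.password:
        problems.append("PHAZE_REDIS_URL must include a password in production")

    return problems


if __name__ == "__main__":
    example = {
        "PHAZE_AGENT_ENV": "production",
        "PHAZE_AGENT_API_URL": "http://10.0.0.5:8000",
        "PHAZE_REDIS_URL": "redis://10.0.0.5:6379/0",
    }
    for problem in check_agent_env(example):
        print(problem)
```

Running a check like this before `docker compose up` would surface both misconfigurations at once, rather than one per failed container start.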
19 changes: 18 additions & 1 deletion .github/workflows/docker-publish.yml
@@ -24,18 +24,25 @@ jobs:
strategy:
matrix:
include:
# Phase 29 D-15 / D-16: the api image is pulled by docker-compose.agent.yml's
# worker + watcher services via the bare-repo URL
# ghcr.io/simplicityguy/phaze:<tag> (no /api sub-path), so we override
# image_suffix to "" for api and keep the sub-path for the sidecars.
- name: api
dockerfile: Dockerfile
context: .
use_cache: true
image_suffix: ""
- name: audfprint
dockerfile: services/audfprint/Dockerfile.audfprint
context: .
use_cache: true
image_suffix: "/audfprint"
- name: panako
dockerfile: services/panako/Dockerfile.panako
context: .
use_cache: true
image_suffix: "/panako"

steps:
- name: "\u23F1\uFE0F Start timer"
@@ -90,9 +97,19 @@ jobs:
id: meta
uses: docker/metadata-action@v6
with:
images: ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}/${{ matrix.name }}
# Phase 29 D-15: matrix.image_suffix is "" for the api image (bare-repo
# URL ghcr.io/simplicityguy/phaze pulled by docker-compose.agent.yml)
# and "/<name>" for the sidecars. This keeps the api URL aligned with
# the agent.yml image: line.
images: ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}${{ matrix.image_suffix }}
# Phase 29 D-16 + WARNING-4: tag strategy is verified by
# tests/test_deployment/test_agent_compose.py::test_docker_publish_workflow_tags_both_latest_and_version
# which asserts BOTH a `:latest` and a `:v<version>` tag are produced.
tags: |
type=raw,value=latest,enable={{is_default_branch}}
type=semver,pattern={{version}}
type=semver,pattern={{major}}.{{minor}}
type=ref,event=tag
type=ref,event=branch
type=ref,event=pr
type=schedule,pattern={{date 'YYYYMMDD'}}
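
The workflow comment above says a CI test asserts that both a `latest` rule and a version rule appear in the `tags:` block. The body of that test is not shown in this diff, so the following is only a sketch of how such a check could be written against the rule lines. The helper names `parse_tag_rules` and `has_latest_and_version` are hypothetical, and treating `type=semver` or `type=ref,event=tag` as "a version rule" is an assumption.

```python
def parse_tag_rules(tags_block: str) -> list[dict[str, str]]:
    """Split a metadata-action `tags:` block into one dict per rule line."""
    rules: list[dict[str, str]] = []
    for line in tags_block.splitlines():
        line = line.strip()
        if line:
            # Each rule is a comma-separated list of key=value pairs.
            rules.append(dict(part.split("=", 1) for part in line.split(",")))
    return rules


def has_latest_and_version(tags_block: str) -> bool:
    """True when the block has a raw `latest` rule and a version-style rule."""
    rules = parse_tag_rules(tags_block)
    has_latest = any(
        r.get("type") == "raw" and r.get("value") == "latest" for r in rules
    )
    # Assumption: a semver rule or a git-tag ref rule counts as a version tag.
    has_version = any(
        r.get("type") == "semver"
        or (r.get("type") == "ref" and r.get("event") == "tag")
        for r in rules
    )
    return has_latest and has_version


# The rule lines as they appear in the workflow diff above.
TAGS = """
type=raw,value=latest,enable={{is_default_branch}}
type=semver,pattern={{version}}
type=semver,pattern={{major}}.{{minor}}
type=ref,event=tag
type=ref,event=branch
type=ref,event=pr
type=schedule,pattern={{date 'YYYYMMDD'}}
"""

if __name__ == "__main__":
    print(has_latest_and_version(TAGS))
```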
27 changes: 24 additions & 3 deletions .github/workflows/docker-validate.yml
@@ -65,7 +65,28 @@ jobs:

- name: 🐳 Validate docker-compose.yml
run: |
echo "🔍 Validating Docker Compose configuration..."
touch .env
docker compose config --quiet
echo "🔍 Validating application-server docker-compose.yml..."
# Phase 29 added fail-fast ${VAR:?} guards on REDIS_PASSWORD.
# Supply placeholders so compose-parse can resolve interpolation.
printf '%s\n' \
'REDIS_PASSWORD=ci-validate-placeholder' \
'REDIS_BIND_IP=127.0.0.1' \
> .env
docker compose -f docker-compose.yml config --quiet
echo "✅ docker-compose.yml is valid"

- name: 🐳 Validate docker-compose.agent.yml
run: |
echo "🔍 Validating agent (file-server) docker-compose.agent.yml..."
# Phase 29 file-server compose requires SCAN_PATH on all 4 services;
# supply placeholders so compose-parse can resolve interpolation.
printf '%s\n' \
'SCAN_PATH=/tmp/phaze-ci-scan-placeholder' \
'MODELS_PATH=/tmp/phaze-ci-models-placeholder' \
'PHAZE_API_URL=https://app-server.example:8000' \
'PHAZE_REDIS_URL=redis://default:ci-placeholder@app-server.example:6379/0' \
'PHAZE_AGENT_TOKEN=phaze_agent_ci-placeholder' \
'PHAZE_AGENT_ID=ci-agent' \
> .env.agent
docker compose -f docker-compose.agent.yml --env-file .env.agent config --quiet
echo "✅ docker-compose.agent.yml is valid"
29 changes: 29 additions & 0 deletions .planning/MILESTONES.md
@@ -1,5 +1,34 @@
# Milestones

## v4.0 Distributed Agents (Shipped: 2026-05-17)

**Phases completed:** 6 phases, 47 plans

**Delivered:** Phaze is now a two-host system — an application-server control plane (API, UI, Postgres, Redis, fileless workers; no file mounts) and one or more file-server agents that own music/video files locally, pull jobs from per-agent SAQ queues, and write every state change back over authenticated HTTPS.

**Key accomplishments:**

- `agents` table + `agent_id` columns on FileRecord/ScanBatch, two-step Alembic migration (012 add+backfill, 013 NOT NULL+UQ swap) with `legacy-application-server` seed preserving v3.0 corpus end-to-end
- Internal `/api/internal/agent/*` HTTP surface (files, metadata, fingerprint, analysis, tracklists, proposals, execution-log, scan-batches, exec-batches, heartbeat, whoami) with token-hash auth deriving `agent_id` from bearer token — never from request body — and 403-before-state-machine cross-tenant guard on every multi-tenant route
- Idempotent natural-key upserts across the agent surface: `(agent_id, original_path)`, `file_id`, `proposal_id`, agent-generated log UUIDs; replays produce zero duplicate rows and zero same-state DB writes
- Task code split: `phaze.tasks.controller` (fileless: generate_proposals, tracklist scrapers, refresh cron) vs `phaze.tasks.agent_worker` (file-bound: process_file, extract_file_metadata, fingerprint_file, scan_live_set, execute_approved_batch); subprocess import-boundary test enforces no `phaze.database` in the agent chain
- `PHAZE_ROLE={control,agent}` env-driven settings split (ControlSettings vs AgentSettings via `get_settings()` factory); same Docker image for both roles; per-agent SAQ queue (`phaze-agent-<id>`); AgentTaskRouter picks queue from `FileRecord.agent_id`
- `PhazeAgentClient` with tenacity retry funnel, 4-class error hierarchy, bearer token never stored as instance attribute (lives only in httpx headers); respx contract tests across all routes
- `phaze-agent-watcher` service: watchdog observer + asyncio-owned single-loop sweep, mtime settle (10s default) + stuck-file cap (3600s); LIVE-sentinel ScanBatch per agent; admin "Trigger Scan" form with HTMX agent-roots swap + 2s/5s polling partials
- `scan_directory` agent task with chunked HTTP upserts (500/chunk), per-chunk PATCH progress, terminal PATCH; same `/files` endpoint serves bulk scans and per-file watcher events
- Distributed execution dispatch: group-by-`FileRecord.agent_id` (in-Python `defaultdict`), one `execute_approved_batch` sub-job per affected agent under shared parent `batch_id`; per-proposal terminal progress POST; SAQ-meta UUID lift for retry-safe `execution_log_id` and `progress_request_id`
- Unified SSE progress aggregating across agents (3 Jinja partials rendered via `_render_partial()` for Semgrep XSS compliance); per-agent breakdown table; revoked-agent banner
- Per-file-server fingerprint sidecars (audfprint + panako allow-list validator blocks non-localhost URLs at config load); cross-file-server fingerprint matching documented as v4.0 limitation with dismissible banner on Duplicate Resolution page
- Self-signed internal CA + leaf x509 generated on first start in the api container via `phaze.cert_bootstrap` + pre-uvicorn entrypoint shim (signals/PID-1 propagate cleanly); `PhazeAgentClient` honors `verify=` kwarg defaulting to `AgentSettings.agent_ca_file`; wrong-CA → ConnectError integration test
- Redis hardening: `requirepass` + `${REDIS_BIND_IP:-127.0.0.1}` LAN bind on app-server compose; `AgentSettings` rejects passwordless `redis_url` at boot when `PHAZE_AGENT_ENV=production`
- Application-server `docker-compose.yml` stripped of `SCAN_PATH`/`MODELS_PATH` mounts and watcher/audfprint/panako services; YAML-parse tests enforce filesystem isolation
- New `docker-compose.agent.yml` (4 services: worker, watcher, audfprint, panako) + `.env.example.agent`; `${SCAN_PATH:?...}` fail-fast on misconfigured file-server hosts; docker-publish.yml extended for both compose-file image tags
- `phaze.scripts.download_models` Python helper + `phaze.tasks._shared.model_bootstrap` wired into agent_worker/watcher startup (rejects partial-download `.part` state); `just download-models` populates per-file-server `/models` volume
- 30-second SAQ CronJob heartbeat from each agent updating `agents.last_seen_at`; Agents admin page (`/admin/agents`) with liveness classifier (alive/stale/revoked), queue depth, last-seen humanize helper; HTMX 5s auto-refresh
- Operator workflow: `just up` (app-server), `just up-agent` (each file-server), `just up-all` (single-host dev); full deployment walkthrough in `docs/deployment.md`; PROJECT.md Constraints + Deployment subsections updated
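
The heartbeat and Agents admin page bullets above describe a 30-second heartbeat feeding an alive/stale/revoked classifier and a last-seen humanize helper. Neither implementation appears in this diff. As a rough illustration only, the classification could be sketched as below. The function names, the "three missed heartbeats" stale threshold, and the output strings are all assumptions, not the project's actual values.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

HEARTBEAT_INTERVAL = timedelta(seconds=30)  # interval stated in the milestone notes
STALE_AFTER = 3 * HEARTBEAT_INTERVAL  # assumption: three missed heartbeats


def classify_agent(
    last_seen_at: Optional[datetime], revoked: bool, now: datetime
) -> str:
    """Classify an agent as 'revoked', 'stale', or 'alive' (hypothetical rule)."""
    if revoked:
        return "revoked"  # revocation wins regardless of heartbeat age
    if last_seen_at is None or now - last_seen_at > STALE_AFTER:
        return "stale"
    return "alive"


def humanize_age(last_seen_at: Optional[datetime], now: datetime) -> str:
    """Render a coarse 'N units ago' string for the last heartbeat."""
    if last_seen_at is None:
        return "never"
    seconds = int((now - last_seen_at).total_seconds())
    if seconds < 60:
        return f"{seconds}s ago"
    if seconds < 3600:
        return f"{seconds // 60}m ago"
    return f"{seconds // 3600}h ago"


if __name__ == "__main__":
    now = datetime(2026, 5, 17, 12, 0, 0, tzinfo=timezone.utc)
    seen = now - timedelta(seconds=45)
    print(classify_agent(seen, revoked=False, now=now), humanize_age(seen, now))
```

Passing `now` explicitly, rather than reading the clock inside the function, keeps the classifier deterministic and easy to test.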

---

## v3.0 Cross-Service Intelligence & File Enrichment (Shipped: 2026-04-04)

**Phases completed:** 4 phases, 11 plans, 22 tasks