Merged
19 changes: 12 additions & 7 deletions README.md
@@ -13,7 +13,8 @@ Do not publish specific throughput, latency, cache-hit, Kafka-lag, or availabili
- Evidence guide: `docs/article-evidence.md`
- Observability map: `docs/observability.md`
- Benchmark report template: `docs/benchmarks/YYYY-MM-DD-pulseops-benchmark.md`
-- Clean full local benchmark: `docs/benchmarks/2026-06-16-clean-full-benchmark.md`
+- Clean publish local benchmark: `docs/benchmarks/2026-06-16-clean-publish-benchmark.md`
+- Full local benchmark evidence: `docs/benchmarks/2026-06-16-clean-full-benchmark.md` (records `Dirty tree | yes`; rerun after committing before citing final article numbers)
- Canonical local smoke report: `docs/benchmarks/2026-06-16-final-benchmark-smoke-pulseops-benchmark.md`
- Heavier ingest-scale report: `docs/benchmarks/2026-06-16-ingest-scale-pulseops-benchmark.md`
- Synthetic skew generator: `scripts/generate-skewed-events.ts`
@@ -170,9 +171,13 @@ pnpm test:e2e # Playwright

### Load Testing
```bash
+# Publishable local evidence must start from a clean tree.
+git status --short
+
pnpm --silent benchmark:generate -- --tenants 100 --events 100000 --days 30 --hot-tenant-ratio 0.6 --late-arrival-ratio 0.05 --duplicate-ratio 0.01 --output jsonl > docs/benchmarks/evidence/events.jsonl
RUN_ID=local-smoke API_URL=http://localhost:3001 GRAPHQL_URL=http://localhost:3002/graphql API_KEY=demo_key_change_this pnpm benchmark
-pnpm benchmark:report -- --run-id local-smoke --output docs/benchmarks/local-smoke-pulseops-benchmark.md
+RUN_ID=local-smoke pnpm query-plans:capture
+pnpm benchmark:report -- --run-id local-smoke --output docs/benchmarks/local-smoke-pulseops-benchmark.md --force
RUN_ID=local-smoke pnpm validate:evidence # writes docs/benchmarks/latest-pulseops-benchmark.md
pnpm db:verify:fresh
API_URL=http://localhost:3001 API_KEY=demo_key_change_this pnpm benchmark:ingest
@@ -194,14 +199,14 @@ These are benchmark targets and measurement areas, not measured claims.

| Metric | Status | Notes |
|--------|--------|-------|
-| Ingest throughput | Measured locally | See `docs/benchmarks/2026-06-16-clean-full-benchmark.md` for the clean full local run, and `docs/benchmarks/2026-06-16-ingest-scale-pulseops-benchmark.md` for the heavier fixed-rate ingest runs. The 1000 RPS target was not sustained locally. |
-| Ingest p95 latency | Measured locally | See dated benchmark reports; request acceptance latency is not aggregate visibility latency |
-| Dashboard query p95 | Measured locally | See canonical smoke report; includes k6 dashboard smoke and cold/warm cache smoke |
-| Worker catch-up | Measured locally | 200-event local smoke run; see canonical smoke report |
+| Ingest throughput | Measured locally | Use `docs/benchmarks/2026-06-16-clean-publish-benchmark.md` for clean-tree article numbers. `docs/benchmarks/2026-06-16-ingest-scale-pulseops-benchmark.md` remains dirty-tree stress evidence. The 1000 RPS target was not sustained locally. |
+| Ingest p95 latency | Measured locally | See dated benchmark reports; request acceptance latency is not aggregate visibility latency. |
+| Dashboard query p95 | Measured locally | See dated reports; cache smoke is cold-vs-warm local evidence, not a production cache-hit-ratio benchmark. |
+| Worker catch-up | Measured locally | 200-event bounded local smoke run; cite the worker catch-up evidence file for the exact run ID. |
| Kafka lag | Measured locally | Smoke run returned lag to 0; heavier ingest-scale snapshot captured 10,254,305 queued messages. Do not claim a lag limit or freshness guarantee. |
| Tenant skew impact | Smoke measured locally | Canonical local smoke reconciled 249 persisted hot-test events with Kafka lag 0: hot 201, quiet 40, medium 8. Evidence: `docs/benchmarks/evidence/hot-tenant-db-2026-06-16-final-benchmark-smoke.json`; full long-duration skew benchmark still needed |
| Hot-tenant DB pressure | Measured locally when `benchmark:hot-db -- --require-complete` is run | Aggregate-key pressure, request/persistence/lag reconciliation, and after-run DB snapshot; not continuous lock sampling |
-| Backpressure behavior | TBD | Record rate limits, errors, queue lag, and recovery |
+| Backpressure behavior | k6 load-script evidence only | Correlate with Kafka lag, worker catch-up, and DB metrics before making stronger backpressure claims. |

## Deployment

388 changes: 191 additions & 197 deletions docs/article-evidence.md

Large diffs are not rendered by default.

6 changes: 5 additions & 1 deletion docs/benchmarks/2026-06-16-clean-full-benchmark.md
@@ -1,6 +1,8 @@
# PulseOps Benchmark Report: 2026-06-16

-Status: evidence-backed local report for run ID `2026-06-16-clean-full-benchmark`; not production-scale
+Status: evidence-backed local report for run ID `2026-06-16-clean-full-benchmark`; dirty-tree evidence; not production-scale
+
+Publishability: not final publishable article evidence because this report records `Dirty tree | yes`. Use it for review and methodology, then rerun from a clean commit before citing final numbers publicly.

## Environment

@@ -21,6 +23,8 @@ Status: evidence-backed local report for run ID `2026-06-16-clean-full-benchmark
| k6 version | k6 v2.0.0+dirty (commit/8c3be52cc1-dirty, go1.26.3, linux/arm64) (Docker fallback image grafana/k6:2.0.0) |
| Dataset | local Docker dataset at report generation time |

+The dashboard cache measurement in this run used the default demo org/project, while the hot-tenant and run-scoped raw-event query plans target a seeded benchmark org/project. Treat the cache row as same-run local cache-path evidence, not as the hot tenant's cache latency.
+
## Commands

```bash
186 changes: 186 additions & 0 deletions docs/benchmarks/2026-06-16-clean-publish-benchmark.md
@@ -0,0 +1,186 @@
# PulseOps Benchmark Report: 2026-06-16

Status: evidence-backed local report for run ID `2026-06-16-clean-publish-benchmark`; not production-scale

Publishability: candidate publishable local evidence; still not production-scale

## Environment

| Field | Value |
| --- | --- |
| Git commit | `63f9556cefad9548774c0eca17b01e558eda3d87` |
| Dirty tree | no |
| Dirty tree details | none |
| Machine | Apple M4 Pro, 12 logical CPUs, 24.00 GiB host memory |
| Docker resources | 12 CPUs, 7.65 GiB |
| OS | Darwin 25.5.0 arm64 |
| Node.js version | v25.3.0 |
| PostgreSQL version | 16.13 |
| Redis version | v=7.4.8 |
| Kafka version | 4.2.0 |
| PostgreSQL row count at report capture | 18376 raw events |
| Daily aggregate row count at report capture | 630 rows |
| Event partitions at report capture | 7 child partitions |
| k6 version | k6 v2.0.0+dirty (commit/8c3be52cc1-dirty, go1.26.3, linux/arm64) (Docker fallback image grafana/k6:2.0.0) |
| Dataset | local Docker dataset at report generation time |

Environment values come from pre-run metadata when available. PostgreSQL row counts are captured when this report is generated, after the benchmark and query-plan capture.

## Run Provenance

| Field | Value |
| --- | --- |
| Metadata file | `docs/benchmarks/evidence/run-metadata-2026-06-16-clean-publish-benchmark.json` |
| Run started | 2026-06-16T21:22:27.830Z |
| Run completed | 2026-06-16T21:24:13.432Z |
| Run status | completed |
| Branch at run start | `feat/publish-safe-evidence` |
| Suites requested | ingest, hot, hotDb, dashboard, cache, worker, backpressure |
| Suites completed | ingest, hot, hotDb, dashboard, cache, worker, backpressure |

### Dirty Tree Details

```text
none
```
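
One way the `Dirty tree` and `Dirty tree details` fields could be derived from the output of `git status --short` is sketched below. This is an illustration only: `parseDirtyTree` and `DirtyTreeInfo` are hypothetical names, and the repository's actual metadata script may work differently.

```typescript
// Sketch only: `parseDirtyTree` and `DirtyTreeInfo` are hypothetical names,
// not taken from the repository's benchmark scripts.
interface DirtyTreeInfo {
  dirty: boolean;
  details: string; // "none" when the tree is clean, mirroring the report
}

// `statusOutput` is the text printed by `git status --short`.
// Empty (or blank-only) output means there are no uncommitted changes.
function parseDirtyTree(statusOutput: string): DirtyTreeInfo {
  const lines = statusOutput
    .split("\n")
    .map((line) => line.trimEnd()) // keep leading status columns such as " M"
    .filter((line) => line.length > 0);
  return {
    dirty: lines.length > 0,
    details: lines.length > 0 ? lines.join("\n") : "none",
  };
}
```

Keeping the parser free of I/O means the same function can be fed captured output from any run, which makes the clean-tree rule easy to unit test.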

### Recorded Environment Overrides

| Name | Value |
| --- | --- |
| `API_URL` | `http://localhost:3001` |
| `BATCH_SIZE` | `20` |
| `BURST_HOLD` | `15s` |
| `BURST_RAMP` | `5s` |
| `BURST_RATE` | `20` |
| `DURATION` | `20s` |
| `EVENTS` | `200` |
| `GRAPHQL_URL` | `http://localhost:3002/graphql` |
| `HOLD_DURATION` | `15s` |
| `MAX_VUS` | `100` |
| `ORG_ID` | `00000000-0000-4000-8000-0000000f4241` |
| `PEAK_RATE` | `20` |
| `POLL_MS` | `500` |
| `PREALLOCATED_VUS` | `30` |
| `PROJECT_ID` | `00000000-0000-4000-8000-0000001e8481` |
| `RAMP_DOWN_DURATION` | `5s` |
| `RAMP_DURATION` | `5s` |
| `RATE` | `20` |
| `RECOVERY` | `10s` |
| `RECOVERY_RATE` | `5` |
| `SLEEP_SECONDS` | `0` |
| `START_RATE` | `5` |
| `TENANT_KEYS_FILE` | `tmp/clean-publish-benchmark-tenants.json` |
| `TIMEOUT_MS` | `120000` |
| `VUS` | `10` |
| `WARM_ITERATIONS` | `10` |

### Recorded Suite Commands

| Suite | Command |
| --- | --- |
| ingest | `node scripts/run-k6.js tests/load/ingest-throughput.js` |
| hot | `node scripts/run-k6.js tests/load/hot-tenant.js` |
| hotDb | `pnpm exec tsx scripts/measure-hot-tenant-db.ts` |
| dashboard | `node scripts/run-k6.js tests/load/dashboard-query.js` |
| cache | `pnpm exec tsx scripts/measure-dashboard-cache.ts` |
| worker | `pnpm exec tsx scripts/measure-worker-catchup.ts` |
| backpressure | `node scripts/run-k6.js tests/load/backpressure.js` |

## Commands

```bash
# Command matching the run-specific evidence files currently present in this report:
RUN_ID=2026-06-16-clean-publish-benchmark pnpm benchmark
RUN_ID=2026-06-16-clean-publish-benchmark pnpm benchmark:report -- --run-id 2026-06-16-clean-publish-benchmark --force

# Full-suite command, if you want every row populated:
RUN_ID=2026-06-16-clean-publish-benchmark pnpm benchmark
```

Run-specific evidence files found for this report: ingest, hot, hotDb, dashboard, cache, worker, backpressure.
If only part of the suite was run, missing evidence stays marked as `not found` below.
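
The `not found` behaviour described above can be sketched as a lookup from suite name to evidence path. The file-name stems are copied from the evidence files listed in this report; the function names and the `existing` set of paths are assumptions for illustration, not the report generator's actual code.

```typescript
// File-name stems as they appear in this report's evidence file list.
const EVIDENCE_STEMS = {
  ingest: "ingest-throughput",
  hot: "hot-tenant",
  hotDb: "hot-tenant-db",
  dashboard: "dashboard-query",
  cache: "dashboard-cache",
  worker: "worker-catchup",
  backpressure: "backpressure",
} as const;

type Suite = keyof typeof EVIDENCE_STEMS;

// Hypothetical helper: builds the run-specific evidence path for one suite.
function evidencePath(suite: Suite, runId: string): string {
  return `docs/benchmarks/evidence/${EVIDENCE_STEMS[suite]}-${runId}.json`;
}

// Hypothetical helper: `existing` stands in for the set of paths found on disk.
function evidenceStatus(
  runId: string,
  existing: ReadonlySet<string>,
): Record<Suite, string> {
  const status = {} as Record<Suite, string>;
  for (const suite of Object.keys(EVIDENCE_STEMS) as Suite[]) {
    const path = evidencePath(suite, runId);
    status[suite] = existing.has(path) ? path : "not found";
  }
  return status;
}
```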

## Results

| Test | Command | Throughput | p50 latency | p95 latency | p99 latency | Error rate | Kafka lag | DB notes | Result |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Ingest throughput | `pnpm benchmark:ingest` | 20.00 req/s | 3.34 ms | 7.19 ms | 10.55 ms | 0.00% | not measured by this k6 row | 400 requests | Measured; docs/benchmarks/evidence/ingest-throughput-2026-06-16-clean-publish-benchmark.json |
| Hot tenant | `pnpm benchmark:hot-tenant` | 16.99 req/s | 3.55 ms | 6.69 ms | 9.32 ms | 0.00% | not measured by this k6 row | 425 requests | Measured; docs/benchmarks/evidence/hot-tenant-2026-06-16-clean-publish-benchmark.json |
| Hot tenant DB evidence | `pnpm benchmark:hot-db` | 425 persisted hot-test events | n/a | n/a | n/a | 0 unmatched requests | 0 | hot raw count 0.11 ms; quiet raw count 0.06 ms; 0 waiting locks at snapshot; hot 322/425; max hot events/key 257 | Measured; docs/benchmarks/evidence/hot-tenant-db-2026-06-16-clean-publish-benchmark.json |
| Dashboard query | `pnpm benchmark:dashboard` | 2119.62 req/s | 4.24 ms | 6.60 ms | 11.48 ms | 0.00% | not measured by this k6 row | 42399 requests | Measured; docs/benchmarks/evidence/dashboard-query-2026-06-16-clean-publish-benchmark.json |
| Dashboard cache | `pnpm benchmark:cache` | n/a | 1.56 ms | 3.05 ms | not captured | 0 GraphQL errors | n/a | cold 30.97 ms, 10 warm iterations | Measured; docs/benchmarks/evidence/dashboard-cache-2026-06-16-clean-publish-benchmark.json |
| Worker catch-up | `pnpm benchmark:worker` | 111.90 persisted events/s | n/a | n/a | n/a | 0 lost in run | 0 | 200 accepted / 200 persisted | Measured; docs/benchmarks/evidence/worker-catchup-2026-06-16-clean-publish-benchmark.json |
| Backpressure | `pnpm benchmark:backpressure` | 16.28 req/s | 3.08 ms | 4.78 ms | 6.83 ms | 0.00% | not measured by this k6 row | 487 requests | Measured; docs/benchmarks/evidence/backpressure-2026-06-16-clean-publish-benchmark.json |
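
A quick arithmetic cross-check of the table: dividing each row's request count by its throughput gives the implied run duration, which can be compared with the recorded overrides. The sketch below assumes the ingest script honours `DURATION=20s` and reads `hot 322/425` as hot-tenant events out of all persisted hot-test events; both readings are assumptions, and the helper names are hypothetical.

```typescript
// Implied wall-clock duration for a row: total requests divided by throughput.
function impliedDurationSeconds(requests: number, reqPerSec: number): number {
  if (reqPerSec <= 0) throw new RangeError("reqPerSec must be positive");
  return requests / reqPerSec;
}

// Share of `part` in `total`, formatted as a percentage with one decimal.
function sharePercent(part: number, total: number): string {
  if (total <= 0) throw new RangeError("total must be positive");
  return ((part / total) * 100).toFixed(1) + "%";
}

console.log(impliedDurationSeconds(400, 20.0)); // ingest row: 20
console.log(sharePercent(322, 425)); // hot share of persisted events: "75.8%"
```

The hot-tenant row (425 requests at 16.99 req/s) implies roughly 25 s, which would match a 5 s + 15 s + 5 s ramp profile if that script uses the ramp and hold overrides listed above.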

## Run-Scoped Query Plans

| Query | Plan file | Observation |
| --- | --- | --- |
| clean-publish-benchmark-aggregate-daily-dashboard | `docs/query-plans/2026-06-16-clean-publish-benchmark-aggregate-daily-dashboard.md` | Captured for run ID 2026-06-16-clean-publish-benchmark; read file for row counts and interpretation |
| clean-publish-benchmark-graphql-cache-path | `docs/query-plans/2026-06-16-clean-publish-benchmark-graphql-cache-path.md` | Captured for run ID 2026-06-16-clean-publish-benchmark; read file for row counts and interpretation |
| clean-publish-benchmark-materialized-dashboard | `docs/query-plans/2026-06-16-clean-publish-benchmark-materialized-dashboard.md` | Captured for run ID 2026-06-16-clean-publish-benchmark; read file for row counts and interpretation |
| clean-publish-benchmark-partition-pruning-24h | `docs/query-plans/2026-06-16-clean-publish-benchmark-partition-pruning-24h.md` | Captured for run ID 2026-06-16-clean-publish-benchmark; read file for row counts and interpretation |
| clean-publish-benchmark-partition-pruning-30d | `docs/query-plans/2026-06-16-clean-publish-benchmark-partition-pruning-30d.md` | Captured for run ID 2026-06-16-clean-publish-benchmark; read file for row counts and interpretation |
| clean-publish-benchmark-tenant-dashboard-chosen-index | `docs/query-plans/2026-06-16-clean-publish-benchmark-tenant-dashboard-chosen-index.md` | Captured for run ID 2026-06-16-clean-publish-benchmark; read file for row counts and interpretation |
| clean-publish-benchmark-tenant-dashboard-index-disabled | `docs/query-plans/2026-06-16-clean-publish-benchmark-tenant-dashboard-index-disabled.md` | Captured for run ID 2026-06-16-clean-publish-benchmark; read file for row counts and interpretation |

## Reference Query Plans

These saved EXPLAIN ANALYZE files are repository evidence, not generated by this benchmark report unless they explicitly mention run ID `2026-06-16-clean-publish-benchmark`.

| Query | Plan file | Observation |
| --- | --- | --- |
| aggregate-daily-dashboard | `docs/query-plans/2026-06-16-aggregate-daily-dashboard.md` | Reference EXPLAIN ANALYZE evidence; cite separately from this benchmark run |
| clean-full-benchmark-aggregate-daily-dashboard | `docs/query-plans/2026-06-16-clean-full-benchmark-aggregate-daily-dashboard.md` | Reference EXPLAIN ANALYZE evidence; cite separately from this benchmark run |
| clean-full-benchmark-graphql-cache-path | `docs/query-plans/2026-06-16-clean-full-benchmark-graphql-cache-path.md` | Reference EXPLAIN ANALYZE evidence; cite separately from this benchmark run |
| clean-full-benchmark-materialized-dashboard | `docs/query-plans/2026-06-16-clean-full-benchmark-materialized-dashboard.md` | Reference EXPLAIN ANALYZE evidence; cite separately from this benchmark run |
| clean-full-benchmark-partition-pruning-24h | `docs/query-plans/2026-06-16-clean-full-benchmark-partition-pruning-24h.md` | Reference EXPLAIN ANALYZE evidence; cite separately from this benchmark run |
| clean-full-benchmark-partition-pruning-30d | `docs/query-plans/2026-06-16-clean-full-benchmark-partition-pruning-30d.md` | Reference EXPLAIN ANALYZE evidence; cite separately from this benchmark run |
| clean-full-benchmark-tenant-dashboard-chosen-index | `docs/query-plans/2026-06-16-clean-full-benchmark-tenant-dashboard-chosen-index.md` | Reference EXPLAIN ANALYZE evidence; cite separately from this benchmark run |
| clean-full-benchmark-tenant-dashboard-index-disabled | `docs/query-plans/2026-06-16-clean-full-benchmark-tenant-dashboard-index-disabled.md` | Reference EXPLAIN ANALYZE evidence; cite separately from this benchmark run |
| final-benchmark-smoke-aggregate-daily-dashboard | `docs/query-plans/2026-06-16-final-benchmark-smoke-aggregate-daily-dashboard.md` | Reference EXPLAIN ANALYZE evidence; cite separately from this benchmark run |
| final-benchmark-smoke-graphql-cache-path | `docs/query-plans/2026-06-16-final-benchmark-smoke-graphql-cache-path.md` | Reference EXPLAIN ANALYZE evidence; cite separately from this benchmark run |
| final-benchmark-smoke-materialized-dashboard | `docs/query-plans/2026-06-16-final-benchmark-smoke-materialized-dashboard.md` | Reference EXPLAIN ANALYZE evidence; cite separately from this benchmark run |
| final-benchmark-smoke-partition-pruning-24h | `docs/query-plans/2026-06-16-final-benchmark-smoke-partition-pruning-24h.md` | Reference EXPLAIN ANALYZE evidence; cite separately from this benchmark run |
| final-benchmark-smoke-partition-pruning-30d | `docs/query-plans/2026-06-16-final-benchmark-smoke-partition-pruning-30d.md` | Reference EXPLAIN ANALYZE evidence; cite separately from this benchmark run |
| final-benchmark-smoke-tenant-dashboard-chosen-index | `docs/query-plans/2026-06-16-final-benchmark-smoke-tenant-dashboard-chosen-index.md` | Reference EXPLAIN ANALYZE evidence; cite separately from this benchmark run |
| final-benchmark-smoke-tenant-dashboard-index-disabled | `docs/query-plans/2026-06-16-final-benchmark-smoke-tenant-dashboard-index-disabled.md` | Reference EXPLAIN ANALYZE evidence; cite separately from this benchmark run |
| materialized-dashboard | `docs/query-plans/2026-06-16-materialized-dashboard.md` | Reference EXPLAIN ANALYZE evidence; cite separately from this benchmark run |
| partition-pruning-24h | `docs/query-plans/2026-06-16-partition-pruning-24h.md` | Reference EXPLAIN ANALYZE evidence; cite separately from this benchmark run |
| partition-pruning-30d | `docs/query-plans/2026-06-16-partition-pruning-30d.md` | Reference EXPLAIN ANALYZE evidence; cite separately from this benchmark run |
| tenant-dashboard-chosen-index | `docs/query-plans/2026-06-16-tenant-dashboard-chosen-index.md` | Reference EXPLAIN ANALYZE evidence; cite separately from this benchmark run |
| tenant-dashboard-index-disabled | `docs/query-plans/2026-06-16-tenant-dashboard-index-disabled.md` | Reference EXPLAIN ANALYZE evidence; cite separately from this benchmark run |

## Evidence Files

| File | Description |
| --- | --- |
| `docs/benchmarks/evidence/run-metadata-2026-06-16-clean-publish-benchmark.json` | Pre-run benchmark metadata JSON |
| `docs/benchmarks/evidence/ingest-throughput-2026-06-16-clean-publish-benchmark.json` | Raw k6 ingest summary JSON |
| `docs/benchmarks/evidence/hot-tenant-2026-06-16-clean-publish-benchmark.json` | Raw k6 hot-tenant summary JSON |
| `docs/benchmarks/evidence/hot-tenant-db-2026-06-16-clean-publish-benchmark.json` | Hot-tenant PostgreSQL evidence JSON |
| `docs/benchmarks/evidence/dashboard-query-2026-06-16-clean-publish-benchmark.json` | Raw k6 dashboard-query summary JSON |
| `docs/benchmarks/evidence/dashboard-cache-2026-06-16-clean-publish-benchmark.json` | Cold/warm GraphQL cache JSON measurement |
| `docs/benchmarks/evidence/worker-catchup-2026-06-16-clean-publish-benchmark.json` | Worker catch-up JSON measurement |
| `docs/benchmarks/evidence/backpressure-2026-06-16-clean-publish-benchmark.json` | Raw k6 backpressure summary JSON |

## Claims Allowed From This Run

- The numbers in the table are local measurements for run ID `2026-06-16-clean-publish-benchmark` only.
- Kafka decoupling can be discussed when ingest acceptance and worker catch-up or lag evidence are both present.
- Cache claims are limited to the cold/warm GraphQL measurement if the dashboard cache evidence file exists.
- Worker throughput claims are limited to the bounded worker catch-up workload if the worker evidence file exists.
- Hot-tenant database claims are limited to the aggregate-key pressure, representative EXPLAIN timings, reconciliation status, and after-run PostgreSQL snapshot in the hot-tenant DB evidence file if present.
- Query plan claims from this run require run-scoped files above. Otherwise cite the reference query-plan files separately.
- Treat this report as article-ready only if `Dirty tree` is `no`, run metadata status is `completed`, every requested suite is completed, run-scoped query plans are listed, and every cited number comes from this run ID.
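
The mechanical parts of the article-ready rule in the last bullet can be expressed as a small gate. This is a sketch under stated assumptions: `ReportFacts` and `articleReadyBlockers` are hypothetical names, and the final condition (every cited number comes from this run ID) still needs a manual check.

```typescript
// Hypothetical shape for the facts a reviewer reads off the report.
interface ReportFacts {
  dirtyTree: "yes" | "no";
  runStatus: string;
  suitesRequested: string[];
  suitesCompleted: string[];
  runScopedQueryPlans: string[]; // plan file paths listed for this run ID
}

// Returns the reasons the report is NOT article-ready; empty means it passes
// the mechanical checks.
function articleReadyBlockers(facts: ReportFacts): string[] {
  const blockers: string[] = [];
  if (facts.dirtyTree !== "no") blockers.push("dirty tree");
  if (facts.runStatus !== "completed") {
    blockers.push("run status is not completed");
  }
  const done = new Set(facts.suitesCompleted);
  const missing = facts.suitesRequested.filter((suite) => !done.has(suite));
  if (missing.length > 0) {
    blockers.push(`suites not completed: ${missing.join(", ")}`);
  }
  if (facts.runScopedQueryPlans.length === 0) {
    blockers.push("no run-scoped query plans");
  }
  return blockers;
}
```

Returning a list of blockers rather than a boolean keeps the reason for a rejection visible, which is useful when a report is rerun after a dirty-tree failure.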

## Claims Not Supported By This Run

- Do not claim production scale, production readiness, or a fixed capacity limit.
- Do not extrapolate beyond the exact workload, machine, Docker resources, and dataset above.
- Do not claim long-duration or million-event tenant-skew behavior unless that evidence file is present.
- Do not claim realistic cache hit ratio from a cold/warm smoke measurement.
- Do not claim Kafka lag limits beyond the captured lag evidence; this run's worker final lag was 0.
- Do not claim final publishable benchmark evidence from this report if `Dirty tree` is `yes`, run metadata is missing/incomplete, or run-scoped query plans are missing.
- The fallback k6 runner is pinned to `grafana/k6:2.0.0`; record a new exact version if you override it or use a local k6 binary.