From 4a278ad17b465a44378d25aba548d9ff22df9d3b Mon Sep 17 00:00:00 2001 From: Claude Date: Fri, 5 Jun 2026 13:18:53 +0000 Subject: [PATCH 1/2] docs: add CLAUDE.md proposals for 5 repos with recent merged PRs Proposals generated from review of merged PRs by SayaliPat, shrivastavakapil2000, JoeVsVolcano, mike-brant, and nathan-resonate. Each file documents what to add or create as CLAUDE.md in the corresponding repo: - step-function-workflow-orchestrator: decommissioned pipelines (fusion-behavior-preprocess, cookiejar-sample-export), EMR 7.12/Spark 3 migration (7 pipelines), geo-location changes - batch-expression-modeling: BlockGraph delivery config, formatter path layout change, formatter metrics lambda, batch stitch throttling (CDP-118913/118857/118972) - identity-graph: new CLAUDE.md for PRISM identity system (11 Spark jobs + prism_dbt v1.0) - kshrivastava: new CLAUDE.md for V3 multi-agent orchestration framework (Temporal + roles + skills) - dos-data-pipeline: new CLAUDE.md for geo-location ETL (district_source, 4-CSV mapping, backfill rewrite) https://claude.ai/code/session_014wxhfqnUnnc6kcDYEgWDx3 --- .../batch-expression-modeling.CLAUDE.md | 50 +++++ .../dos-data-pipeline.CLAUDE.md | 112 +++++++++++ claude-md-proposals/identity-graph.CLAUDE.md | 181 ++++++++++++++++++ claude-md-proposals/kshrivastava.CLAUDE.md | 117 +++++++++++ ...p-function-workflow-orchestrator.CLAUDE.md | 51 +++++ 5 files changed, 511 insertions(+) create mode 100644 claude-md-proposals/batch-expression-modeling.CLAUDE.md create mode 100644 claude-md-proposals/dos-data-pipeline.CLAUDE.md create mode 100644 claude-md-proposals/identity-graph.CLAUDE.md create mode 100644 claude-md-proposals/kshrivastava.CLAUDE.md create mode 100644 claude-md-proposals/step-function-workflow-orchestrator.CLAUDE.md diff --git a/claude-md-proposals/batch-expression-modeling.CLAUDE.md b/claude-md-proposals/batch-expression-modeling.CLAUDE.md new file mode 100644 index 0000000..b5019df --- /dev/null +++ b/claude-md-proposals/batch-expression-modeling.CLAUDE.md @@ -0,0 +1,50 @@ +# CLAUDE.md update — resonate/batch-expression-modeling +# Append the following section to the END of the existing CLAUDE.md + +--- + +## Vendor Configuration (`vendor-config.ini`) — Recent Additions + +File: `workflows/lambdas/batch-audience-delivery-config/vendor-config.ini` + +### New Config Keys + +| Key | Default | Purpose | +|---|---|---| +| `stitch_columns` | — | Comma-separated multi-column stitch (use instead of `stitch_column` for vendors that join on multiple fields, e.g. BlockGraph address). | +| `audience_bitmap_path` | `tagav` | Bitmap source for audience evaluation. Set to `person_jar` for person-keyed (RID) vendors like BlockGraph. | + +All existing single-column vendors continue to use `stitch_column` (singular). `get_audience_bitmap_path(delivery_type)` returns the configured source (default `tagav`). + +### BlockGraph Delivery (CDP-118913) + +BlockGraph is **person-keyed (RID)**, not cookie-keyed (RCID). Two new sections in `vendor-config.ini`: + +- `[blockgraph_syndicated]` — syndication delivery via normalized RID→ADDRESS table +- `[blockgraph_custom]` — custom delivery, same stitch shape + +Key differences from cookie-keyed vendors: +- `stitch_columns = norm_address_line,norm_city,norm_state,norm_zip,zip_plus4` — normalized address stitch (5 columns) +- `stitch_table_name = person_identity_graph_beta` — beta RID→ADDRESS table +- `audience_bitmap_path = person_jar` — evaluates audiences against the personJar bitmap (not tagav) +- `person_jar_path` is supplied per-run in the delivery event (`delivery_config.person_jar_path`) + +### Formatter Output Path Layout Change (CDP-118857 / CDP-118937) + +The batch-delivery-formatter flipped partition order from `date=*/method=av/vendor=*/akey=*` to `date=*/vendor=*/method=*/akey=*`. All downstream consumer state machines were updated accordingly: +- `batch-audience-custom-delivery` (#256) +- `batch-audience-delivery-syndication` (yahoo, experian, openx/viant variants) + +If you see `No files were able to be copied` errors after a formatter run, check that the consumer ASL's `source_prefix` uses `vendor=*/method=*` order. + +### Formatter Metrics Lambda (CDP-118857) + +`batch-delivery-formatter-pipeline` now invokes `batch-delivery-formatter-publish-metrics` Lambda after each successful EMR run. Lambda ARN injected via `PublishMetricsFunctionArn` in each pipeline's `terragrunt.hcl`. + +Metrics published to InfluxDB: +- `com.resonate.delivery.format.count` — per-audience row counts +- `com.resonate.delivery.format.aggregate` — aggregate metrics (vendor + method + delivery_type tags, no audience_key) + +### Batch Stitch Throttling (CDP-118972) + +`batch-stitch-pipeline` uses `MaxConcurrency=2` on the stitch Map state to prevent bursting `AddJobFlowSteps` API calls. A stagger wait of `(Map.Item.Index × wait_seconds)` is inserted before each step submission. diff --git a/claude-md-proposals/dos-data-pipeline.CLAUDE.md b/claude-md-proposals/dos-data-pipeline.CLAUDE.md new file mode 100644 index 0000000..09b42f0 --- /dev/null +++ b/claude-md-proposals/dos-data-pipeline.CLAUDE.md @@ -0,0 +1,112 @@ +# CLAUDE.md — resonate/dos-data-pipeline +# This is a NEW file to be created at the root of the dos-data-pipeline repo + +# dos-data-pipeline + +## Project Purpose and Architecture Overview + +This repo contains the **DOS (Data Operations Systems) data pipeline** — Scala/Spark ETL jobs and configuration for geo-location enrichment, segment aggregation, and political district bitmap preparation for Resonate's audience platform. + +**Key pipelines:** +- **geo-location** — Daily enrichment of RCIDs with geographic + political district data (congress, state-senate, state-house); full backfill for historical correction. Feeds `ToBitmap` which sets marker bits. +- **segment-aggregator** — Aggregates behavioral segment data. + +**Deployment:** GitHub Actions. Config in this repo deploys directly. Other pipelines are activated by `datapipeline-trigger` in `lambda-modeling`; their config comes from `services-api-configuration`. + +--- + +## Repository Structure + +``` +src/main/scala/com/resonate/dos/ + GeoLocationDaily.scala # Daily geo enrichment: IP → zip → district + district_source + GeoLocationFullBackfill.scala # Backfills historical dates (daily-on-multi-day-pixels approach) + GeoLocationFull.scala # Merges daily+backfill into rolling-full + ToBitmap.scala # Converts geo-location output to bitmap; gates L2-confirmed bit + DistrictResolver.scala # Centralizes L2-vs-IP district coalescing + district_source derivation + +src/test/scala/com/resonate/dos/ + GeoLocationDailyTest.scala + GeoLocationFullBackfillTest.scala + ToBitmapTest.scala + ... + +config/geo-location/ + zip-district-mappings/ # 4 CSV files (one per district type) + zip-congress-mapping.csv # 34,785 rows — L2 mode-per-zip, all 51 states + zip-proposed-congress-mapping.csv # 8,362 rows — 2026 redistricted (CA/MO/NC/OH/TX/UT/VA) + zip-state-senate-mapping.csv # 33,527 rows + zip-state-house-mapping.csv # 32,620 rows +``` + +**Scala version:** 2.11.8 | **Spark version:** 2.4.3 (Spark 2 — not yet migrated to Spark 3) + +--- + +## Key Commands + +```bash +# Run all unit tests +sbt test + +# Run a specific test suite +sbt "testOnly com.resonate.dos.GeoLocationDailyTest" +sbt "testOnly com.resonate.dos.ToBitmapTest" + +# Build assembly JAR +sbt assembly + +# Deploy — push to main triggers GitHub Actions deployment pipeline +``` + +--- + +## Key Concepts + +### district_source Provenance (CDP-118946 / CDP-118947) + +Every RCID in geo-location output carries a `district_source` column set by `DistrictResolver`: + +| Value | Meaning | +|---|---| +| `L2_CONFIRMED` | L2 has a district AND `l2_party_confirmed = true`. **Only this value sets the L2-confirmed marker bit in ToBitmap.** | +| `L2_UNCONFIRMED` | L2 has a district AND `l2_party_confirmed != true`. | +| `IP_INFERRED` | No L2 district; NetAcuity IP → zip → 4-CSV-mapping found a district. | +| `null` | No district data from any source. | + +**Observed distribution (dev 2026-05-12):** IP_INFERRED ~86.8%, L2_CONFIRMED ~9.9%, L2_UNCONFIRMED ~2.9%, null ~0.4%. + +**ToBitmap gating (CDP-118947):** The L2-confirmed marker bit is set **only** when `district_source = "L2_CONFIRMED"`. Before this fix (CDP-118947) the bit was set unconditionally (broken for L2_UNCONFIRMED and IP_INFERRED rows). + +### Zip→District Mappings (4-file Shape, CDP-118946) + +The district lookup uses 4 separate CSV files (one per district type) rather than a single consolidated CSV. Schema: `zip, state, `. Files version together as a snapshot. + +Passed to Spark jobs as a single `zipDistrictMappingsBasePath` argument pointing at the directory. `DistrictResolver` reads all 4 files by convention from that path. + +Deployed to S3 via terragrunt `configurations` entries in the geo-location pipeline. After updating CSVs, deploy via the geo-location GitHub Actions workflow. + +**IP-inferred fallback flow:** `GeoLocationDaily` calls NetAcuity to resolve the RCID's IP to a zip, then joins the 4 CSVs for the district. State-leg mappings are many-to-many and collapsed to 1:1 using rank-based dedup with `sort_array` for deterministic tiebreaking. + +### GeoLocationFullBackfill Rewrite (CDP-118946) + +Rewritten as **daily-on-multi-day-pixels**: applies daily geo enrichment logic to each date in the backfill range rather than re-reading the prior rolling-full. Key changes: + +- No longer takes `DistrictColumnList` arg — always re-derives all 4 districts (prevents stale labels) +- Reads existing `zip` from `fullDf`; applies same IP-fallback as Daily (no second NetAcuity call) +- `GeoLocationFull.expectedCols` extended with `zip` and `district_source` +- Orchestrator passes the date range as a rendered glob to backfill + +### Backfill `Should Run Full` Bug Fix (CDP-118512) + +A bug in the geo-location state machine (`geo_location.asl.json` in `step-function-workflow-orchestrator`) caused a pre-existing geo full for today to override backfill output. Fixed by adding an `IsPresent` check on `$.FullInputPath` before the path-equality choice. If backfill output is not propagating to the full-build stage, check the `Should Run Full` choice state. + +--- + +## Project-Specific Rules and Gotchas + +- **Spark 2.4.3**: This repo is still on Spark 2. Do not apply Spark 3-specific optimizations (EMR 7 migration for this repo is not yet underway). +- **State-leg mappings are many-to-many**: Do not assume 1:1 zip→district. `DistrictResolver` handles collapsing with `sort_array` for deterministic hash-based tiebreaking. +- **CSV versioning**: The 4 zip-district CSVs version together as a snapshot. Updating one requires coordinated update of all 4 if the underlying data changes. +- **`GeoLocationFull.expectedCols`**: If you add a new column to `GeoLocationDaily`, also add it to `expectedCols` in `GeoLocationFull.scala` so it flows through rolling-full and backfill. +- **NetAcuity in Backfill**: `GeoLocationFullBackfill` does NOT call NetAcuity. It reads the `zip` column already resolved by `GeoLocationDaily` and applies the CSV-based district lookup. diff --git a/claude-md-proposals/identity-graph.CLAUDE.md b/claude-md-proposals/identity-graph.CLAUDE.md new file mode 100644 index 0000000..db17627 --- /dev/null +++ b/claude-md-proposals/identity-graph.CLAUDE.md @@ -0,0 +1,181 @@ +# CLAUDE.md — resonate/identity-graph +# This is a NEW file to be created at the root of the identity-graph repo + +# identity-graph + +## Project Purpose and Architecture Overview + +This repo contains the **Identity Graph** — Resonate's Person Resolution and Identity System (PRISM). It has two layers: + +1. **Scala/Spark Jobs** (`src/main/scala/`) — 11 batch ETL jobs that build the underlying identity tables (identifier index, persons) from raw sources (L2, Experian, TapAd, LiveIntent, etc.) +2. **prism_dbt** (`dbt/prism_dbt/`) — PRISM's consumer-facing dbt package v1.0. Exposes four primitives (`waterfall_match`, `identifier_expand`, `persons_project`, `lookup`) as Lambda-invokable Snowflake service models. + +**Production JAR path:** `s3://resonate-core-applications/identity-graph/jars/identity-graph-latest.jar` + +**Design docs (published to S3):** +- `https://s3.nonprod.aws.resonatedigital.net/person-identity-graph/index.html` — hub +- `docs/design/design_and_production_readiness.html` — main architecture doc +- `docs/design/consumer_interface_design.html` — consumer integration (Append/BlockGraph/GeoFix/Cortex) +- `docs/design/project_plan.html` — 6 tracks, Jira ticket DAG + +--- + +## Repository Structure + +``` +src/main/scala/com/resonate/spark/apps/ + ExperianPreprocessJob.scala # Normalizes Experian ConsumerView + digital graph + L2ConsumerPreprocessJob.scala # Preprocesses L2 consumer tab-delimited files + TapadIndexJob.scala # Experian/TapAd → identifier_index + L2IndexJob.scala # L2 voter → identifier_index + L2ConsumerIndexJob.scala # L2 consumer → identifier_index + OnboardingMatchJob.scala # 4-phase matching (blocking, corroboration, scoring, fallback) + OnboardingHouseholdLinksJob.scala # Household CTV/TTD attribution via digital graph + OnboardingPersonsJob.scala # TapAd-only person records (deprecated: BL-106) + MergeOnboardingLinksJob.scala # Multi-provider collapse + confidence scoring + PersonIdentityJob.scala # Main orchestrator: persons + identifier_index + CustomerMatchJob.scala # Ephemeral customer file matching (7 fallback tiers) + +src/main/scala/com/resonate/spark/utils/ + HashUtils.scala # Deterministic SHA-256 person_id + household_id + StagingWriter.scala # Cache/count/write/unpersist pattern + AddressNormalizer.scala # USPS Pub 28 normalization + nickname resolution (max 5 hops) + IpFilter.scala # RFC1918 + cloud IP blocklist + ScoringConfig.scala # Centralized confidence scoring + +dbt/prism_dbt/ # Consumer-facing dbt package (see dbt/prism_dbt/README.md) + macros/ # waterfall_match, identifier_expand, persons_project, lookup + models/service/ # Lambda-invokable: waterfall.sql, identifier_expand.sql, persons_project.sql + snowflake/ + function/name_address_hash/ # NAME_ADDRESS_HASH Snowflake UDF + stored_procedure/prism_lookup/ # PRISM_LOOKUP SP + stored_procedure/name_address_lookup/ # NAME_ADDRESS_LOOKUP SP + +docs/design/ # HTML design artifacts (published to S3) +docs/scripts/ # build_columns_page.py, sample_parquet.py + +.github/workflows/ + snowflake_dbt.yml # Deploy prism_dbt via snow dbt deploy + snowflake_stored_procedure.yml # Deploy PRISM_LOOKUP / NAME_ADDRESS_LOOKUP + snowflake_function.yml # Deploy NAME_ADDRESS_HASH UDF +``` + +--- + +## Key Commands + +### Scala/Spark Jobs (SBT) + +```bash +# Run all unit tests (443 tests across 20 suites) +sbt test + +# Run a specific test suite +sbt "testOnly com.resonate.spark.apps.PersonIdentityJobSpec" +sbt "testOnly com.resonate.spark.apps.CustomerMatchJobSpec" + +# Build assembly fat JAR for EMR deployment +sbt assembly + +# Upload to S3 (prod) +aws s3 cp target/identity-graph-latest.jar \ + s3://resonate-core-applications/identity-graph/jars/identity-graph-latest.jar +``` + +See the [deployment wiki](https://resonate-jira.atlassian.net/wiki/spaces/ASD/pages/3683057840/Pipeline+Scala+SBT+GitOps) for the GitOps deploy process. + +### prism_dbt (Snowflake Native dbt Package) + +```bash +cd dbt/prism_dbt + +# Install dbt dependencies +dbt deps + +# Local dev (requires ~/.dbt/profiles.yml — copy from profiles.yml.example) +dbt seed --profiles-dir +dbt run --select waterfall --profiles-dir +dbt test --profiles-dir + +# Offline unit tests only (no Snowflake needed — fast) +dbt test --select test_type:unit # 16 tests + +# All tests (includes live Snowflake singular tests against resonate-dev) +dbt test # 25 tests total +``` + +### Snowflake Deployments (GitHub Actions) + +All Snowflake deployments go through GitHub Actions. Do not run `snow dbt deploy` manually from the CLI. + +| Workflow | What it deploys | +|---|---| +| `snowflake_dbt.yml` | `prism_dbt` package via `snow dbt deploy` (runs `dbt parse` as fast-fail gate) | +| `snowflake_stored_procedure.yml` | `PRISM_LOOKUP` or `NAME_ADDRESS_LOOKUP` stored procedures | +| `snowflake_function.yml` | `NAME_ADDRESS_HASH` UDF | + +--- + +## PRISM dbt Package (prism_dbt v1.0) + +### The Four Primitives + +| Macro | Direction | Purpose | +|---|---|---| +| `waterfall_match` | identifiers → 1 person_id per row | Resolve mixed-identifier customer rows to a canonical `person_id`. First-match-wins by priority. | +| `identifier_expand` | person_id → identifiers of ONE type | Fan out person_ids to deliverable identifiers. Call once per identifier type. | +| `persons_project` | person_id → persons attributes | Project selected `persons` columns (name, address, lalvoterid, …) onto person_ids. | +| `lookup` | one identifier → one person + linked IDs | Single-row debug. Also deployed as `PRISM_LOOKUP` stored procedure. | + +### v1.0 Data Sources +- `PERSONS_LATEST` + `IDENTIFIER_INDEX_LATEST` (filtered to `identifier_rank = 1`) +- `CROSSWALK_LATEST` exists but is **NOT used** in v1.0 (missing metadata for confidence filtering) +- `ZIP11` routes automatically to `persons.zip11` (not `identifier_index`) via `_internal/lookup_strategy.sql` + +### Service Model Invocation + +```sql +EXECUTE DBT PROJECT RESONATE.PRISM.prism_dbt + ARGS = $$run --select waterfall --vars '{ + "input_relation": "MY_DB.MY_SCHEMA.my_input", + "consumer": "append", + "job_run_id": "run_001", + "waterfall_order": [ + {"identifier_type": "HEM_MD5", "stitch_label": "hem"}, + {"identifier_type": "HEM_SHA256", "stitch_label": "hem"}, + {"identifier_type": "MAID", "stitch_label": "maid"} + ] + }'$$; +``` + +### NAME_ADDRESS_HASH UDF + +Normalizes raw PII to a SHA-256 hash for name+address waterfall steps: + +```sql +SELECT NAME_ADDRESS_HASH(first_name, last_name, address_line, zip) AS NAME_ADDRESS FROM my_input; +``` + +v1.0: `LOWER + TRIM + strip non-alphanumeric + LPAD zip + pipe-concat + SHA-256`. Does not yet do canonical-first-name resolution (Bill ≠ William — planned v1.1). + +--- + +## Key Concepts and Bug Fixes in v1.0 + +- **ZIP11 routing**: `waterfall_match` auto-routes `ZIP11` steps to `persons.zip11`. Just include `{"identifier_type": "ZIP11", "stitch_label": "zip11"}` in `waterfall_order`. +- **Fan-out tiebreak**: Always deterministic. Order: confidence DESC → source_count DESC → max_effective_weight DESC → person_id ASC. +- **`L2IndexJob` HOUSEHOLD_ID null guard**: `filterNulls = true` — prevents phantom household collapse. +- **`OnboardingMatchJob` Phase 3 tiebreak**: `ORDER BY person_id ASC` as final tiebreak — fixes non-determinism. +- **`ExperianPreprocessJob` aggregation**: `min(id_value)` not `first(id_value)` — deterministic. +- **Null guards**: `identifier_value` filtered before `groupBy` in `PersonIdentityJob` and `MergeOnboardingLinksJob`. + +--- + +## Project-Specific Rules + +- **All Snowflake deployments via GitHub Actions** — never `snow dbt deploy` from the CLI. +- **`snowflake_profiles/profiles.yml` is a deploy stub** (account/user = `'_'`). Do not put real credentials there. +- **dbt unit tests are offline** — run `--select test_type:unit` first for fast feedback. +- **Singular tests require `resonate-dev` Snowflake** — live mock tables in `RESONATE.PRISM.*` (provisioned from `.context/cdp-119018-mock-tables.sql`). +- **person_id ≠ RID**: PRISM produces `person_id`. `RID` (Resonate-internal) is owned by Append's downstream pipeline. `waterfall_match` output column is `prism_matched_person_id`. +- **SemVer**: v1.0 is the Append cutover baseline. Release tagging tracked in CDP-119015. diff --git a/claude-md-proposals/kshrivastava.CLAUDE.md b/claude-md-proposals/kshrivastava.CLAUDE.md new file mode 100644 index 0000000..f473d21 --- /dev/null +++ b/claude-md-proposals/kshrivastava.CLAUDE.md @@ -0,0 +1,117 @@ +# CLAUDE.md — resonate/kshrivastava +# This is a NEW file to be created at the root of the kshrivastava repo + +# kshrivastava — Multi-Agent Orchestration V3 + +Personal repository for Kapil Shrivastava (POCs and experiments). The primary artifact is the **V3 multi-agent orchestration framework** in `multi-agent-orchestration-v3/`. + +--- + +## V3 Multi-Agent Orchestration Framework + +### What It Is + +A Temporal-based orchestration system where AI agents collaborate to plan, code, review, and retrospect on software engineering stories. Each story runs as a Temporal workflow; agents implement domain-specific **roles** and consult domain-expert **skills** mid-run. + +``` +story ──▶ Temporal Workflow ──▶ Planner ──▶ Coder ──▶ Reviewer ──▶ Retro + │ │ + └──▶ Specialist (consult-by-default) + ↑ + append-agent / cleanroom / + audience-delivery / ... +``` + +### Repository Structure + +``` +multi-agent-orchestration-v3/ +├── framework/ +│ ├── role/ +│ │ ├── runner.py # Runs a role; injects available_specialists catalog +│ │ ├── loader.py # Loads role YAML + prompt +│ │ └── schemas.py # CoderOutput, PlannerOutput, SpecialistOutput, etc. +│ ├── recipe/ +│ │ ├── interpreter.py # Runs recipe phases; Contract-4 consult hook +│ │ └── specialist_consult.py # run_specialist_consult() +│ └── runtime/ +│ └── cassettes.py # Request/response cassette recording for replay +├── roles/ +│ ├── planner.yaml + prompts/planner.md +│ ├── coder.yaml + prompts/coder.md +│ ├── reviewer.yaml + prompts/reviewer.md # claude-sonnet-4 (upgraded from Haiku) +│ ├── retro.yaml + prompts/retro.md +│ └── specialist.yaml + prompts/specialist.md # Generic; self-selects skill by description +├── skills/ +│ ├── WIRING.md # Specialist wiring docs +│ ├── append-agent/ +│ │ ├── SKILL.md # Grounding + playbooks for Append domain +│ │ └── references/ # Source map: Confluence/Jira/GitHub/AWS/Splunk +│ ├── cleanroom/ # Placeholder (routable, content pending) +│ └── audience-delivery/ # Placeholder (routable, content pending) +├── recipes/ +│ └── default.yaml # Phase sequence: Plan → Code → Review → Retro +├── tests/unit/ # 826+ unit tests +└── docs/ + ├── ENGINEER_ONBOARDING.md + └── presentations/ + └── special-agent-explainer.html +``` + +### Key Concepts + +**Roles** — YAML-defined agents with a `role: ` field and matching `prompts/.md`. Models: +- Planner: `claude-opus-4` (quality-sensitive planning) +- Coder: `claude-sonnet-4` +- Reviewer: `claude-sonnet-4` (upgraded from Haiku — review quality is critical) +- Retro: `claude-haiku-4` +- Specialist: `claude-sonnet-4` + +**Skills** — Domain knowledge bundles in `skills//`. Each needs a `SKILL.md` with a sharp `description` (the routing key) and grounding workflow. Auto-discovered by `discover_skills()` — no code wiring needed to add a domain. + +**Specialist / Consult System (Contract 4 — LIVE):** +1. Any role emits `consultations_needed: [{specialist, query}]` in its output +2. `framework/recipe/interpreter.py` (`_extract_consultations`) calls `run_specialist_consult()` +3. The generic `specialist` role self-selects the skill by description matching +4. `SpecialistOutput {answer, citations, confidence}` is threaded into caller's `prior_attempt_summaries`; caller re-runs with answer in context +5. Per-phase round cap mirrors the escalation cap + +**Consult-by-default:** +- `available_specialists` catalog (`{name, description}`) injected into EVERY role's `task_input` by `runner.py` +- Planner/coder prompts: scan catalog first; consult before guessing or escalating a domain fact +- Catalog excluded from cassette fingerprint — recordings replay regardless of deployed specialists + +**Ticket-based story lookup:** +- `get_state` / `get_history` accept a bare Jira ticket (e.g. `CDP-117874`) resolving to workflow +- Writes still need explicit `workflow_id` + +### Key Commands + +```bash +cd multi-agent-orchestration-v3 + +# Install dependencies +uv sync + +# Run unit tests +pytest tests/unit/ # ~826 tests +pytest tests/unit/test_.py # specific file + +# Start the dev worker (requires AWS SSO + Temporal cluster) +aws sso login +python framework/worker.py + +# Story CLI +python cli.py start --story "CDP-12345" --description "Fix the frobnicator" +python cli.py get-state --story CDP-12345 +python cli.py get-history --story CDP-12345 +``` + +### Rules and Gotchas + +- **This is a POC repo.** Production V3 will live elsewhere once the framework is proven. +- **Dev worker restart required** after merging branches. Reviewer→Sonnet model change takes effect only after worker restart. +- **Adding a skill**: Drop `skills//SKILL.md` with a sharp `description`. No code changes — `discover_skills()` auto-discovers it. The `description` is the routing key. +- **`workspace_isolation: per_attempt_worktree`**: Specialist uses per-attempt worktrees. Read-only (Q&A) invocations simply don't write. +- **Consult round cap**: After `MAX_CONSULTATION_ROUNDS_PER_PHASE` consults per phase the agent proceeds with what it has. Audit trail in `ctx.keys["_consultation_log"]`. +- **Cassette fingerprint**: `available_specialists` excluded — replay works regardless of which domains are deployed at playback time. diff --git a/claude-md-proposals/step-function-workflow-orchestrator.CLAUDE.md b/claude-md-proposals/step-function-workflow-orchestrator.CLAUDE.md new file mode 100644 index 0000000..0d6f526 --- /dev/null +++ b/claude-md-proposals/step-function-workflow-orchestrator.CLAUDE.md @@ -0,0 +1,51 @@ +# CLAUDE.md update — resonate/step-function-workflow-orchestrator +# Append the following two sections to the END of the existing CLAUDE.md + +--- + +## Decommissioned Pipelines (Removed from Deployment) + +These pipelines have had their Terraform infrastructure removed and **must not be redeployed**. +Pipeline code and config files are retained for historical reference only. + +| Pipeline | Reason | PR | +|---|---|---| +| `fusion-behavior-preprocess` | Dormant since 2023; inadvertently re-deployed 2026-05-28. Sovrn/Havas no longer a client. | #734 | +| `cookiejar-sample-export` | Havas no longer a client; `CookieJarSampler` Scala app marked `@deprecated` in `core-data-pipelines-spark`. | #748 | + +Both pipelines still have AWS resources (Step Functions, EventBridge rules, IAM roles) requiring manual `terragrunt destroy` or console cleanup in both dev and prod accounts. See `DEPRECATED.md` inside each pipeline directory. + +--- + +## EMR 7.12.0 / Spark 3 Migration (CDP-118269) + +These pipelines have been migrated from `emr-5.34.0` + Spark 2 to `emr-7.12.0` + Spark 3: + +| Pipeline | PR | Notes | +|---|---|---| +| `sovrn-weekly` | #729 | | +| `maid-onboarder` | #730 | Prod-volume validated end-to-end | +| `damlam-preparation` | #732 | | +| `idsync-overlap` | #733 | YARN vcore fix: `yarn.nodemanager.resource.cpu-vcores = 16` on integration | +| `behavior-stitch` | #737 | YARN vcore fix applied; use `--master yarn` for Spark 3 | +| `marketops-overlap` | #731 | | +| `topic-aggregation` | #736 | Requires `--master yarn` flag in Spark 3 spark-submit args | + +**Jar convention:** Spark 3 pipelines use `core-data-pipeline-spark3-latest.jar`; legacy pipelines use `core-data-pipeline-latest.jar` or `assembly-*.jar`. When migrating a new pipeline update both `params.json` (prod + integration) and `emr.json` (prod + integration + on-demand variants). + +**YARN memory on EMR 7 (r5.xlarge):** EMR 7 overhead factor ~18.8% (higher than EMR 5). Reduce integration executor memory (e.g. `21g → 16g`). Prod r5d.4xlarge can keep 21G. + +**Dev jar path:** Dev environments must reference the `-dev` S3 bucket (e.g. `resonate-core-datasets-dev/...`), not the prod bucket. + +--- + +## Geo-Location Pipeline Recent Changes + +### TapAd-Derived HEM→RCID Stitch (CDP-118512) +`l2-processing` pipeline now ingests TapAd Digital Graph + LiveRamp Tradedesk Agg to supplement LiveIntent-based HEM→RCID derivation (~+21% coverage lift on district-attributed RCIDs). The `dynamic-dates` Lambda resolves TapAd's flat-file dates (no date-named subdirectory) via a new flat-file sentinel mode. + +### Zip→District Mappings (CDP-118946) +Four new CSVs under `pipelines/geo-location/config/zip-district-mappings/` (one per district type: congress, proposed-congress, state-senate, state-house). Schema: `zip, state, `. Deployed to S3 via terragrunt `configurations` entries. Passed as `myZipDistrictMappingsBasePath` to both `GeoLocationDaily` and `GeoLocationFullBackfill`. + +### Backfill `Should Run Full` Bug Fix (CDP-118512) +A bug in `geo_location.asl.json` where a pre-existing geo full for today overrode the backfill output was fixed by adding an `IsPresent` check on `$.FullInputPath` before the path-equality choice. If backfill output is not propagating, check the `Should Run Full` state machine choice state. From 90fc01ed4696c79ec19954239f0b6b47e8bda784 Mon Sep 17 00:00:00 2001 From: Claude Date: Fri, 5 Jun 2026 13:21:50 +0000 Subject: [PATCH 2/2] fix: address Copilot review comments on step-function-workflow-orchestrator proposal MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit - Correct header: "two sections" → "three sections" (Decommissioned Pipelines, EMR 7.12.0/Spark 3 Migration, and Geo-Location Pipeline Recent Changes) - Fix arg name: myZipDistrictMappingsBasePath → zipDistrictMappingsBasePath (consistent with dos-data-pipeline.CLAUDE.md and PR description) https://claude.ai/code/session_014wxhfqnUnnc6kcDYEgWDx3 --- .../step-function-workflow-orchestrator.CLAUDE.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/claude-md-proposals/step-function-workflow-orchestrator.CLAUDE.md b/claude-md-proposals/step-function-workflow-orchestrator.CLAUDE.md index 0d6f526..9484883 100644 --- a/claude-md-proposals/step-function-workflow-orchestrator.CLAUDE.md +++ b/claude-md-proposals/step-function-workflow-orchestrator.CLAUDE.md @@ -1,5 +1,5 @@ # CLAUDE.md update — resonate/step-function-workflow-orchestrator -# Append the following two sections to the END of the existing CLAUDE.md +# Append the following three sections to the END of the existing CLAUDE.md --- @@ -45,7 +45,7 @@ These pipelines have been migrated from `emr-5.34.0` + Spark 2 to `emr-7.12.0` + `l2-processing` pipeline now ingests TapAd Digital Graph + LiveRamp Tradedesk Agg to supplement LiveIntent-based HEM→RCID derivation (~+21% coverage lift on district-attributed RCIDs). The `dynamic-dates` Lambda resolves TapAd's flat-file dates (no date-named subdirectory) via a new flat-file sentinel mode. ### Zip→District Mappings (CDP-118946) -Four new CSVs under `pipelines/geo-location/config/zip-district-mappings/` (one per district type: congress, proposed-congress, state-senate, state-house). Schema: `zip, state, `. Deployed to S3 via terragrunt `configurations` entries. Passed as `myZipDistrictMappingsBasePath` to both `GeoLocationDaily` and `GeoLocationFullBackfill`. +Four new CSVs under `pipelines/geo-location/config/zip-district-mappings/` (one per district type: congress, proposed-congress, state-senate, state-house). Schema: `zip, state, `. Deployed to S3 via terragrunt `configurations` entries. Passed as `zipDistrictMappingsBasePath` to both `GeoLocationDaily` and `GeoLocationFullBackfill`. ### Backfill `Should Run Full` Bug Fix (CDP-118512) A bug in `geo_location.asl.json` where a pre-existing geo full for today overrode the backfill output was fixed by adding an `IsPresent` check on `$.FullInputPath` before the path-equality choice. If backfill output is not propagating, check the `Should Run Full` state machine choice state.