chore: CLAUDE.md proposals for 5 repos with recent merged PRs (May–Jun 2026) #6

Open
shrivastavakapil2000 wants to merge 2 commits into main from claude/serene-davinci-YuIrA

@shrivastavakapil2000

Summary

Reviews merged PRs from May–June 2026 by team members SayaliPat, shrivastavakapil2000, JoeVsVolcano, mike-brant, and nathan-resonate across the organization, and produces CLAUDE.md files capturing current architecture, key concepts, and recent changes.

Since this session only has write access to resonate/.github, the files are staged here under claude-md-updates/. Each needs to be copied to the root of its target repo and a separate PR opened there.

Files to Apply

| File in this PR | Target Repo | Action |
| --- | --- | --- |
| claude-md-updates/step-function-workflow-orchestrator/CLAUDE.md | resonate/step-function-workflow-orchestrator | Create |
| claude-md-updates/batch-expression-modeling/CLAUDE.md | resonate/batch-expression-modeling | Replace existing |
| claude-md-updates/identity-graph/CLAUDE.md | resonate/identity-graph | Create |
| claude-md-updates/batch-audience-delivery-syndication/CLAUDE.md | resonate/batch-audience-delivery-syndication | Create |
| claude-md-updates/dos-data-pipeline/CLAUDE.md | resonate/dos-data-pipeline | Create |

What's Documented in Each File

step-function-workflow-orchestrator (NEW)

  • Inventory of 12 active pipelines, each with its purpose
  • EMR 5→7 migration notes (Spark 2→3, YARN vcore fix, memory tuning)
  • Integration test patterns (pytest, synthetic golden data, 90-min timeout)
  • Dynamic dates lambda: directory mode vs flat-file sentinel mode
  • Environment table (dev/qa/integration/nonprod/prod)
  • Recent: fusion-behavior-preprocess removed, geo district namespace fixes, L2 TapAd stitch wiring

batch-expression-modeling (UPDATE)

  • All existing content preserved
  • Added: formatter metrics lambda (batch-delivery-formatter-publish-metrics)
  • Added: stitch throttle fix — MaxConcurrency=2 + index-staggered Wait (CDP-118972)
  • Added: delta_with_full_fallback refresh type requirement in _annotate_previous_date_partitions
  • Added: Formatter output path partition order (vendor=*/method=*/)
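
As a rough illustration of the index-staggered Wait idea noted above, the sketch below computes a start delay from a Map iteration's index. It assumes the delay is simply index times a fixed step; `STEP_SECONDS` and `stagger_seconds` are hypothetical names, and the 30-second default is not taken from CDP-118972 or the repo.

```shell
# Assumption: each Map iteration waits (index * step) seconds before it
# calls stitch, so the requests do not all arrive at once.
# STEP_SECONDS is a hypothetical value, not taken from the PR.
STEP_SECONDS="${STEP_SECONDS:-30}"

# $1 = zero-based iteration index; prints the wait in seconds.
stagger_seconds() {
  echo $(( $1 * STEP_SECONDS ))
}
```

With the default step, `stagger_seconds 0` prints 0 and `stagger_seconds 3` prints 90. The real state machine presumably expresses this inside the Wait state itself rather than in shell.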

identity-graph (NEW)

  • 11 Spark pipeline jobs with purposes and bug-fix history
  • Shared utilities (HashUtils, StagingWriter, AddressNormalizer, IpFilter, ScoringConfig)
  • PRISM design overview (6 tracks, 20 Jira tickets)
  • All jobs use scopt CLI args (no application.conf)
  • Scala 2.12.17 / Spark 3.5.0, JDK 17 --add-opens flags

batch-audience-delivery-syndication (NEW)

  • Vendor table (OpenX, Experian, Viant, BlockGraph)
  • openx-publish-data-files: hardcoded .csv.gz extension — do not revert (CDP-118955 fix)
  • blockgraph-create-taxonomy-file: taxonomy generation, SPI=N constant, initial vs. refresh routing
  • Source path partition order: vendor=*/method=av/ (not legacy method=av/vendor=*/)
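
The partition-order note above can be checked mechanically. The following is a minimal sketch, assuming paths are plain strings with `vendor=` and `method=av` segments; `partition_order` is a hypothetical helper, not something that exists in the target repo.

```shell
# Hypothetical helper: classify a source path by its partition order.
#   current: .../vendor=<name>/method=av/...
#   legacy:  .../method=av/vendor=<name>/...
partition_order() {
  case "$1" in
    *vendor=*/method=av/*) echo current ;;
    *method=av/vendor=*/*) echo legacy ;;
    *) echo unknown ;;
  esac
}
```

For example, `partition_order "out/vendor=openx/method=av/part-0.csv.gz"` prints `current`, while `partition_order "out/method=av/vendor=openx/part-0.csv.gz"` prints `legacy`. The example paths are made up for illustration.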

dos-data-pipeline (NEW)

  • district_source provenance enum (L2_CONFIRMED, L2_UNCONFIRMED, IP_INFERRED)
  • IP-inferred district fallback via 4 ZIP→district CSVs
  • ToBitmap gating on L2_CONFIRMED
  • ZIP→district namespace requirements (L2 canonical vs floterial — critical for aggregation joins)
  • GeoLocationFullBackfill always re-derives all 4 districts

How to Apply to Target Repos

# For each target repo:
git checkout -b chore/add-claude-md
cp claude-md-updates/<repo-name>/CLAUDE.md ./CLAUDE.md
git add CLAUDE.md && git commit -m "chore: add CLAUDE.md with project guidance for Claude Code"
git push -u origin chore/add-claude-md
gh pr create --title "chore: add CLAUDE.md" --body "Adds Claude Code project guidance based on recent merged PRs."
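
The per-repo steps above could be generated for all five targets at once. The sketch below only prints the commands so they can be reviewed before anything runs; `STAGING_DIR`, `WORKSPACE`, and the function names are hypothetical and should be adjusted to wherever the repos are actually checked out.

```shell
# Sketch only. STAGING_DIR and WORKSPACE are hypothetical local checkout paths.
STAGING_DIR="${STAGING_DIR:-$HOME/src/.github/claude-md-updates}"
WORKSPACE="${WORKSPACE:-$HOME/src}"

# The five target repos from the "Files to Apply" table.
REPOS="step-function-workflow-orchestrator batch-expression-modeling identity-graph batch-audience-delivery-syndication dos-data-pipeline"

# Print the staged CLAUDE.md path for one target repo.
staged_file() {
  printf '%s/%s/CLAUDE.md\n' "$STAGING_DIR" "$1"
}

# Print, but do not run, the commands for one target repo.
plan_repo() {
  printf 'cd %s/%s\n' "$WORKSPACE" "$1"
  printf '%s\n' 'git checkout -b chore/add-claude-md'
  printf 'cp %s ./CLAUDE.md\n' "$(staged_file "$1")"
  printf '%s\n' 'git add CLAUDE.md && git commit -m "chore: add CLAUDE.md with project guidance for Claude Code"'
  printf '%s\n' 'git push -u origin chore/add-claude-md'
  printf '%s\n' 'gh pr create --title "chore: add CLAUDE.md" --body "Adds Claude Code project guidance based on recent merged PRs."'
}

# Print the plan for every repo in REPOS.
plan_all() {
  for repo in $REPOS; do
    plan_repo "$repo"
  done
}
```

Calling `plan_all` prints six commands per repo. Using an absolute staged path in the `cp` line avoids depending on the current directory, since the copy runs from inside the target repo.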

Generated by Claude Code

Reviews merged PRs (May–June 2026) from SayaliPat, shrivastavakapil2000,
JoeVsVolcano, mike-brant, and nathan-resonate across the organization and
produces CLAUDE.md files capturing current architecture, key concepts, and
recent changes for:

- step-function-workflow-orchestrator (new) — EMR 7 migration, 12 pipelines,
  dynamic dates, integration test patterns
- batch-expression-modeling (update) — formatter metrics lambda, stitch
  throttle fix, delta_with_full_fallback fix, new formatter path layout
- identity-graph (new) — PRISM design, 11 Spark jobs, shared utilities
- batch-audience-delivery-syndication (new) — OpenX/BlockGraph lambdas,
  csv.gz extension fix, source path partition order
- dos-data-pipeline (new) — district_source provenance, IP-inferred fallback,
  zip→district CSV namespace requirements

https://claude.ai/code/session_0121mjajLMWrWCnCUnjvKWZH
Copilot AI review requested due to automatic review settings June 3, 2026 13:11

Copilot AI left a comment

Pull request overview

This PR stages CLAUDE.md documentation updates (generated from recent May–June 2026 merged PR activity) for five other repositories under claude-md-updates/, along with a local README describing how to copy/apply them in their target repos.

Changes:

  • Added new CLAUDE.md drafts for: step-function-workflow-orchestrator, identity-graph, batch-audience-delivery-syndication, and dos-data-pipeline.
  • Replaced/updated the CLAUDE.md draft for batch-expression-modeling.
  • Added claude-md-updates/README.md with copy/apply instructions.

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| claude-md-updates/README.md | Documents which staged CLAUDE.md maps to which target repo and how to copy/apply them. |
| claude-md-updates/step-function-workflow-orchestrator/CLAUDE.md | Draft guidance for Step Functions + EMR pipeline orchestrator repo (pipelines, EMR7 migration notes, testing patterns). |
| claude-md-updates/batch-expression-modeling/CLAUDE.md | Draft guidance for BEM system (architecture, workflows, JSONata notes, recent operational fixes). |
| claude-md-updates/identity-graph/CLAUDE.md | Draft guidance for Identity Graph / PRISM Spark jobs and build/deploy/test workflows. |
| claude-md-updates/batch-audience-delivery-syndication/CLAUDE.md | Draft guidance for syndication workflows/lambdas and vendor-specific gotchas. |
| claude-md-updates/dos-data-pipeline/CLAUDE.md | Draft guidance for DOS geo/district pipeline concepts and recent changes. |

Comment thread claude-md-updates/README.md Outdated
@@ -0,0 +1,65 @@
# CLAUDE.md Updates

This directory contains proposed CLAUDE.md files for 5 repositories that had active PRs merged into main (May–June 2026) by team members: SayaliPat, shrivastavakapil2000, JoeVsVolcano, PallaviJagarlamudi, mike-brant, nathan-resonate.
Comment thread claude-md-updates/README.md Outdated
# Example for step-function-workflow-orchestrator
cd /path/to/step-function-workflow-orchestrator
git checkout -b chore/add-claude-md
cp /path/to/this/step-function-workflow-orchestrator/CLAUDE.md ./CLAUDE.md
- Remove PallaviJagarlamudi from author list (no merged PRs found in scope)
- Fix copy command to reference actual path in this repo
  (claude-md-updates/<repo>/CLAUDE.md) with raw GitHub URL alternative

https://claude.ai/code/session_0121mjajLMWrWCnCUnjvKWZH