Skip to content

Implement RRMS/Molnix joins for the 4-table ingest model (Follow Up to #2633) #2732

@arunissun

Description

@arunissun

Continuation of ticket #2633
@szabozoltan69 @tovari

Implement RRMS/Molnix joins for the 4-table ingest model

Summary

Implement the join model for the four RRMS/Molnix ingest tables and the optional bridge to the existing GO database deployments_personnel table.

This ticket builds on the earlier 4-table ingest proposal and adds the missing join contract:

  1. what data should be fetched from each Molnix endpoint,
  2. which fields should be stored in each table,
  3. which keys should be used to join the four tables,
  4. how the four-table model joins to the existing database,
  5. which joins are core and which fields are optional enrichment.

The key correction is that deployments_molnixappraisal.target_id is not a person key. The appraisal-to-person join requires a new Molnix-sourced field:

deployments_molnixappraisal.appraised_person_id = original.data[].appraisal.deployment.person_id

Background

The earlier proposal defined four tables:

  1. rrms_person_snapshot
  2. deployments_molnixappraisal
  3. deployments_molnixappraiser
  4. rrms_event_participation

The follow-up join analysis verified that the 4-table shape is still correct, but the initial schema needs one important fix: deployments_molnixappraisal must store the appraised person's RRMS person id as appraised_person_id.

The analysis also verified that the local GO database table deployments_personnel can be joined to RRMS people through deployments_personnel.molnix_id, but only as a partial analytical bridge. It should not be enforced directly as a hard foreign key because some historical molnix_id values no longer resolve in live RRMS.

Field Consistency Check Against Earlier Proposal

The target fields below were checked against latest_analysis/molnix_rrms_comprehensive_data_requirements_and_schema_proposal.md.

Table Earlier proposal field count This ticket field count Difference
rrms_person_snapshot 12 12 No target-field difference
deployments_molnixappraisal 18 19 Adds appraised_person_id from the later join analysis
deployments_molnixappraiser 12 12 No target-field difference
rrms_event_participation 16 16 No target-field difference

The field tables below keep the original target field names. The only target-field addition beyond the original proposal is deployments_molnixappraisal.appraised_person_id, which is required for the corrected appraisal-to-person join.

There is one source-mapping clarification for appraisers: the earlier proposal listed top-level /api/api/appraisals appraiser fields, but the later live join analysis found that the complete appraiser set is nested under appraisal.appraisers[]. This ticket keeps the same deployments_molnixappraiser target columns as the earlier proposal, while sourcing the full child set from sanitized nested appraiser rows and using top-level self-row fields only for optional enrichment.

Scope

In scope

  • Implement or update the four RRMS/Molnix ingest tables.
  • Add deployments_molnixappraisal.appraised_person_id.
  • Load person, appraisal, appraiser, and event participation data from the Molnix API endpoints listed below.
  • Add indexes and join keys needed for downstream reporting.
  • Add a safe nullable bridge from deployments_personnel to rrms_person_snapshot if database-level FK integrity is required.
  • Validate joins after ingest.

Out of scope

  • Do not collapse the four tables into one flat table.
  • Do not merge deployments_molnixappraisal and deployments_molnixappraiser.
  • Do not store appraisal PII fields such as names, emails, tokens, or free-text responses.
  • Do not treat target_id as a person id.
  • Do not add a hard FK directly from deployments_personnel.molnix_id to rrms_person_snapshot.person_id.

Target Data Sources By Table

1. rrms_person_snapshot

One row represents one RRMS/Molnix person snapshot.

Primary key:

person_id

Endpoints:

Endpoint Use
People cohort endpoint selected for RRMS and standby extraction, for example /api/api/people/... Initial person id discovery once exact cohort filters are confirmed
/api/api/people/{id} Person-level non-PII attributes
/api/api/people/{id}/tags Person tags used for RRMS member/standby classification and sector/status markers
/api/api/people/{id}/roles Person roles
/api/api/people/{id}/languages Person languages
/api/api/system/organizations Organization id-to-name enrichment
/api/api/system/organizations/{orgid} Organization detail enrichment where needed

Fields to store:

Target field Source field or rule Notes
person_id data.id RRMS person id
person_status Derived from cohort selection and /api/api/people/{id}/tags data[] Keep the original field name; status is derived rather than directly stored as one source field
sex data.sex Non-PII reporting dimension
current_availability data.current_availability Where present
outofscope data.outofscope Boolean
organization_id data.organization Partially confirmed; keep nullable
organization_name /api/api/system/organizations and /api/api/system/organizations/{orgid} Organization name enrichment as described in the proposal
roles_json /api/api/people/{id}/roles data[] Proposal-aligned person roles payload
languages_json /api/api/people/{id}/languages data[] Proposal-aligned person languages payload
tags_json /api/api/people/{id}/tags data[] Proposal-aligned person tags payload
personnel_id Local GO DB key resolved during ingest Optional local bridge, as already described in the proposal
source_updated_at People endpoint update timestamp, where present Source freshness marker

Population rule:

  • Build or refresh person snapshots for all RRMS cohort people in scope.
  • Also refresh any person ids referenced by appraisals and event participation in the same run so joins do not fail because the cached person range is stale.

2. deployments_molnixappraisal

One row represents one Molnix appraisal.

Primary keys and keys:

id                  local surrogate PK
molnix_id           unique source appraisal id

Endpoints:

Endpoint Use
/api/api/appraisals?page=n Core appraisal fields, deployment context, competencies, nested appraisers
/api/api/deployments/{deployment_id} Optional deployment detail enrichment for sending/receiving organization
/api/api/system/organizations Organization id-to-name enrichment
/api/api/system/organizations/{orgid} Organization detail enrichment where needed

Fields to store:

Target field Source field or rule Notes
id Local generated id Surrogate PK
molnix_id original.data[].appraisal.id Unique source appraisal id
target_id original.data[].appraisal.target_id Keep as raw source field; not a person key
deployment_molnix_id original.data[].appraisal.deployment.id Molnix deployment id, not GO local deployment PK
appraised_person_id original.data[].appraisal.deployment.person_id Required person join field from the later join analysis
stage original.data[].appraisal.stage Appraisal status/stage
appraisers_count original.data[].appraisal.appraisers_count Validate against nested appraiser count
score original.data[].appraisal.score Numeric score
deployment_country_id original.data[].appraisal.deployment.country_id Deployment context
deployment_start original.data[].appraisal.deployment.start Deployment context
deployment_end original.data[].appraisal.deployment.end Deployment context
deployment_title original.data[].appraisal.deployment.title Deployment context
sending_organization_id /api/api/deployments/{deployment_id} sending organization id Proposal-aligned deployment detail enrichment
receiving_organization_id /api/api/deployments/{deployment_id} receiving organization id Proposal-aligned deployment detail enrichment
deployment_tags_json original.data[].appraisal.deployment.tags[] Store sanitized JSON
competencies_json original.data[].appraisal.competencies[] Store sanitized JSON
personnel_id Local GO DB key resolved during ingest Optional local bridge, as already described in the proposal
created_at original.data[].appraisal.created_at Source timestamp
updated_at original.data[].appraisal.updated_at Source timestamp

Reference verification from the join audit:

  • Appraisal rows observed: 362
  • Duplicate appraisal ids: 0
  • Null appraised_person_id rows: 0
  • Orphan appraised person rows after same-run people refresh: 0
  • target_id vs deployment.id mismatches: 0

3. deployments_molnixappraiser

One row represents one appraiser/reviewer assignment attached to an appraisal.

Primary keys and keys:

id                   local surrogate PK
molnix_id            unique source appraiser assignment id
appraisal_molnix_id  parent source appraisal id

Endpoint:

Endpoint Use
/api/api/appraisals?page=n Proposal-aligned source endpoint; later live analysis showed the full appraiser set must be built from sanitized nested appraisal.appraisers[]

Fields to store:

Target field Source field or rule Notes
id Local generated id Surrogate PK
molnix_id original.data[].id Proposal target retained; later live analysis says populate the full set from nested sanitized appraiser rows
appraisal_molnix_id original.data[].appraisal_id Proposal target retained; later live analysis says populate from nested appraiser linkage
appraisal_id Local resolved deployments_molnixappraisal.id Optional local join id
appraiser_type original.data[].appraiser_type Proposal target retained; later live analysis says populate the full set from nested sanitized appraiser rows
person_id original.data[].person_id Proposal target retained; later live analysis showed this is only available for the visible top-level self row, so keep nullable
personnel_id Local GO DB key resolved during ingest Optional local bridge, as already described in the proposal
required original.data[].required Proposal target retained; later live analysis says populate the full set from nested sanitized appraiser rows
completed_at original.data[].completed_at Proposal target retained; later live analysis says populate the full set from nested sanitized appraiser rows
notified_at original.data[].notified_at Proposal target retained; later live analysis showed this is only available on the visible top-level self row
created_at original.data[].created_at Proposal target retained; later live analysis showed this is only available on the visible top-level self row
updated_at original.data[].updated_at Proposal target retained; later live analysis showed this is only available on the visible top-level self row

Privacy exclusions:

Do not persist these fields from appraiser payloads:

name
email
token
token_expires_at
responses
free-text response content

Reference verification from the join audit:

  • Nested appraiser rows observed: 796
  • Duplicate nested appraiser ids: 0
  • Parent appraisal mismatches: 0
  • appraisers_count mismatches: 0
  • Nested appraiser rows do not expose person_id for the full appraiser set.

Decision:

  • Keep appraisers as a separate child table.
  • Do not merge appraisers into deployments_molnixappraisal.
  • Do not model appraiser-to-person as a required core relationship.

4. rrms_event_participation

One row represents one RRMS event-person participation record.

Primary keys and keys:

id        local surrogate PK
event_id  Molnix/RRMS event id
person_id RRMS person id

Endpoint:

Endpoint Use
/api/api/events?page=n Event fields, participant rows, and participant pivot metadata

Fields to store:

Target field Source field or rule Notes
id Local generated id Surrogate PK
event_id events.data[].id Source event id
event_name events.data[].name Event name
person_id events.data[].person[].id Proposal-aligned participant person id
event_person_role events.data[].person[].pivot.role Proposal-aligned participation role
event_type events.data[].event_type Event type
event_scale_type events.data[].type Proposal-aligned scale/type field
event_from events.data[].from Event start
event_to events.data[].to Event end
participant_start events.data[].person[].pivot.start Proposal-aligned participant start
participant_end events.data[].person[].pivot.end Proposal-aligned participant end
requested events.data[].person[].pivot.requested Proposal-aligned participant requested flag
event_organization_id events.data[].organization Proposal target retained; live payload needs confirmation because organization may appear as string/null
event_organization_name /api/api/system/organizations and /api/api/system/organizations/{orgid} Proposal-aligned organization name enrichment
venue events.data[].venue Event venue
tags_json events.data[].tags Store sanitized JSON

Reference verification from the join audit:

  • Event participant rows observed: 205750
  • person[].id = person[].pivot.person_id matches: 205750
  • Null event person rows: 0
  • Orphan event person rows after same-run people refresh: 0
  • Duplicate (event_id, person_id) keys observed: 14
  • Duplicate (event_id, person_id, role) keys observed: 14
  • Duplicate strict participant keys observed: 13

Constraint decision:

  • Keep the local surrogate id.
  • Do not add a strict unique constraint on (event_id, person_id, event_person_role) in the first migration because the source currently has duplicate participant rows.
  • Add deduplication rules deliberately after reviewing duplicate source rows.

Core Join Schematic

This is the schematic that should be included directly in the GitHub issue. GitHub issues support Mermaid code blocks, so this should render as a diagram on GitHub.com.

flowchart LR
    person["rrms_person_snapshot<br/>PK: person_id<br/>Fields: person_status, sex, current_availability, outofscope, organization_id, roles_json, languages_json, tags_json"]
    appraisal["deployments_molnixappraisal<br/>UK: molnix_id<br/>Join field: appraised_person_id<br/>Context: target_id, deployment_molnix_id, stage, score, competencies_json"]
    appraiser["deployments_molnixappraiser<br/>UK: molnix_id<br/>Parent join: appraisal_molnix_id<br/>Fields: appraiser_type, required, completed_at"]
    event["rrms_event_participation<br/>Join field: person_id<br/>Fields: event_id, event_name, event_person_role, event_type, event_scale_type, participant_start, participant_end"]
    personnel["deployments_personnel<br/>Current bridge: molnix_id<br/>Local fields: deployedperson_ptr_id, deployment_id, molnix_status"]
    person -->|"person_id = appraised_person_id"| appraisal
    appraisal -->|"molnix_id = appraisal_molnix_id"| appraiser
    person -->|"person_id = person_id"| event
    person -->|"current: person_id = molnix_id (LEFT JOIN)"| personnel
Loading

Join Contract

Core joins and current GO bridge

Join Relationship Status Notes
rrms_person_snapshot.person_id = deployments_molnixappraisal.appraised_person_id Person to appraisals Supported with schema fix Requires new appraised_person_id field
deployments_molnixappraisal.molnix_id = deployments_molnixappraiser.appraisal_molnix_id Appraisal to appraisers Supported Appraisers are children of appraisal
rrms_person_snapshot.person_id = rrms_event_participation.person_id Person to event participation Supported Requires refreshed person snapshot for all referenced ids
rrms_person_snapshot.person_id = deployments_personnel.molnix_id RRMS person to current GO personnel bridge Supported with caveat Current analytical bridge only. Use LEFT JOIN. Not every RRMS person has a local personnel row, and some historical molnix_id values no longer resolve in live RRMS.

Local GO Database Bridge

Current bridge

Current analytical join:

rrms_person_snapshot.person_id = deployments_personnel.molnix_id

Use LEFT JOIN from RRMS people into deployments_personnel because:

  • Many RRMS people may not have a local deployment/personnel row.
  • Some local deployments_personnel rows have no molnix_id.
  • Some historical local molnix_id values no longer resolve in live RRMS.

Reference verification:

Check Result
Total deployments_personnel rows 4024
Rows with non-null molnix_id 3400
Distinct non-null molnix_id values 3400
Duplicate non-null molnix_id keys 0
Live RRMS people resolved from molnix_id 3192
Live RRMS people unresolved from molnix_id 208
Unresolved status counts active=187, deleted=12, hidden=9

Existing local joins

These joins were verified against the existing database and can be used when local GO context is needed:

Join Verification
deployments_personnel.deployedperson_ptr_id = deployments_deployedperson.id Verified; no orphan rows in audit
deployments_personnel.deployment_id = deployments_personneldeployment.id Verified; no orphan rows in audit
deployments_personnel.surge_alert_id = notifications_surgealert.id Verified for non-null values; optional for most rows

The database metadata queried during analysis did not expose declared FK constraints for these tables, even though the data behaves consistently for these joins.

Validation And Acceptance Criteria

Endpoint ingestion

  • Appraisals ingestion follows pagination until next_page_url is null or current_page == last_page.
  • Events ingestion follows pagination until next_page_url is null or current_page == last_page.
  • Person-scoped endpoints are called for required people and do not assume every RRMS person already exists in a cached range.
  • Organization lookup is nullable and defensive.

Table population

  • rrms_person_snapshot.person_id is populated for all referenced person ids.
  • deployments_molnixappraisal.molnix_id is unique and populated.
  • deployments_molnixappraisal.appraised_person_id is populated from appraisal.deployment.person_id.
  • deployments_molnixappraiser.appraisal_molnix_id points to a loaded deployments_molnixappraisal.molnix_id.
  • rrms_event_participation.person_id matches the event participant pivot person id.
  • Appraiser PII fields are not persisted.

Join checks

Validation must report:

  • 0 duplicate deployments_molnixappraisal.molnix_id values.
  • 0 null deployments_molnixappraisal.appraised_person_id values, unless Molnix sends malformed rows that are deliberately quarantined.
  • 0 orphan deployments_molnixappraisal.appraised_person_id values after same-run people refresh.
  • 0 parent mismatches from deployments_molnixappraiser.appraisal_molnix_id to deployments_molnixappraisal.molnix_id.
  • 0 event participant rows where person[].id != person[].pivot.person_id.
  • 0 orphan rrms_event_participation.person_id values after same-run people refresh.
  • Duplicate event participation rows are reported, not silently hidden.
  • The local DB bridge reports matched and unresolved deployments_personnel.molnix_id values separately.

Metadata

Metadata

Assignees

Labels

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions