diff --git a/docs/developer-guide/metadatakeys-module.md b/docs/developer-guide/metadatakeys-module.md index f41dbe293..9440a1c08 100644 --- a/docs/developer-guide/metadatakeys-module.md +++ b/docs/developer-guide/metadatakeys-module.md @@ -17,74 +17,179 @@ The previous implementation in the Datasets service lacked a permission-based fi - Stability: Crashes occurred when retrieval limits were missing or improperly configured. - Risks: Users could see metadata keys they did not have permissions to access. +--- + ## Module Architecture This module consists of a dedicated Controller and Service layer that implements a robust permission-aware logic. ### MetadataKeysController -Provides the API interface for searching keys. Allowed filters can be found in `src/metadata-keys/metadatakeys.service.ts` and exmaple can be find in `src/metadata-keys/types/metadatakeys-filter-content.ts` +Provides the API interface for searching metadata keys. -- `Endpoint`: GET /metadatakeys (replaces /datasets/metadataKeys) -- `Method`: findAll -- `Endpoint Access`: Endpoint can be Accessed by any users +- **Endpoint**: `GET /metadatakeys` (replaces `GET /datasets/metadataKeys`) +- **Method**: `findAll` +- **Access**: Any authenticated user (permission filtering is applied server-side) +- Allowed filter fields: see `src/metadata-keys/types/metadatakeys-lookup.ts` +- Filter examples: see `src/metadata-keys/types/metadatakeys-filter-content.ts` + +--- ### MetadataKeysService -This handles the business logic and talks to the database. It is divided into user-facing search logic and internal data synchronization. +Handles business logic and database access. Split into two concerns: + +#### 1. 
User-facing search — `findAll` + +Applies CASL permission filters before querying: + +| User type | Visible keys | +| -------------------- | -------------------------------------------------------- | +| Admin | All keys in the system | +| Authenticated user | Keys where they belong to `ownerGroup` or `accessGroups` | +| Unauthenticated user | Keys marked `isPublished: true` | + +Results default to 100 per page if no limit is provided. + +#### 2. Internal synchronization + +These methods are called internally when source documents are created, updated, or deleted. They are never called directly from the controller. + +##### `insertManyFromSource(doc)` + +Called when a dataset is **created** or **gains new metadata keys**. + +For each key in `scientificMetadata`: + +- Upserts a `MetadataKey` document identified by `${sourceType}_${key}_${humanReadableName}` +- Increments `usageCount` (total datasets referencing this key) +- Increments per-group reference counts in `userGroupCounts` +- Adds new groups to the `userGroups` query array via `$addToSet` +- Sets `isPublished: true` if the source dataset is published (never unsets inline — the cronjob handles the `true → false` transition) + +##### `deleteMany(doc)` -#### Permission Layer (Applies to findAll only): +Called when a dataset is **deleted** or **loses metadata keys**. -When a user searches for keys, the service uses accessibleBy to automatically append access filters based on CASL permissions: +Runs three sequential steps: -- `Admins`: Can search and get all metadata keys in the system. -- `Authenticated Users`: Can only get keys where they are part of the ownerGroup or accessGroups. -- `Unauthenticated Users`: Can only get keys that are marked as isPublished. +1. Decrements `usageCount` and per-group counts in `userGroupCounts` +2. Recomputes the `userGroups` array from the updated counts — drops any group whose count reached zero +3. 
Deletes `MetadataKey` documents where `usageCount <= 0` + `usageCount` is the authoritative deletion signal. A dataset with no `userGroups` and `isPublished: false` would be invisible to both `userGroupCounts` and `isPublished` checks, so neither alone can substitute for it. -#### Service Methods: +##### `replaceManyFromSource(oldDoc, newDoc)` -- `findAll`: The only public-facing method. It applies the permission layer and then uses a database aggregation pipeline to find and return the specific keys requested by the user. Every search is limited to 100 results by default, if limit is not provided. -- `insertManyFromSource`: An internal method that takes an original document (like a Dataset), extracts fields from **scientificMetadata**, **metadata**, and **customMetadata**, and creates new records in the Metadata Keys collection. -- `deleteMany`: Removes metadata key entries associated with a source document when that document is deleted from the system. -- `replaceManyFromSource`: Triggered when a source document (e.g., a Dataset or Proposal) is updated. It calls `deleteMany` and `insertManyFromSource` sequentially. +Called when a dataset is **updated**. Diffs the old and new `scientificMetadata` to produce three disjoint key sets: -## Usage Example +| Set | Keys | Action | +| ------- | ---------------- | -------------------------------------------------------------------- | +| Added | Only in `newDoc` | `insertManyFromSource` | +| Removed | Only in `oldDoc` | `deleteMany` | +| Shared | In both | `updateSharedKeys` (group / isPublished / humanReadableName changes) | -To list all metadata keys associated with a dataset, the user must provide the sourceType and sourceId. If the fields array is provided, only those specific fields will be returned: +The three sets are disjoint by `_id` so they run in parallel via `Promise.all`. 
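The three-way split described above can be sketched in TypeScript. This is a minimal illustration only: `diffMetadataKeys` and `KeyDiff` are hypothetical names, not part of the service, and the sketch compares raw key names rather than the full composite `_id`.

```typescript
// Hypothetical helper: names and shapes are illustrative, not the service's API.
type MetadataRecord = Record<string, unknown>;

interface KeyDiff {
  added: string[]; // keys present only in the new document
  removed: string[]; // keys present only in the old document
  shared: string[]; // keys present in both documents
}

function diffMetadataKeys(
  oldMeta: MetadataRecord,
  newMeta: MetadataRecord,
): KeyDiff {
  const oldKeys = Object.keys(oldMeta);
  const newKeys = Object.keys(newMeta);
  const oldSet = new Set(oldKeys);
  const newSet = new Set(newKeys);

  return {
    added: newKeys.filter((key) => !oldSet.has(key)),
    removed: oldKeys.filter((key) => !newSet.has(key)),
    shared: newKeys.filter((key) => oldSet.has(key)),
  };
}

// Example: "pressure" was dropped, "humidity" was added, "temperature" stayed.
const diff = diffMetadataKeys(
  { temperature: { value: 100 }, pressure: { value: 1 } },
  { temperature: { value: 120 }, humidity: { value: 40 } },
);
```

Each key lands in exactly one of the three lists, so the follow-up operations never target the same `MetadataKey` document. That property is what makes it safe to run them concurrently.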
+ +For shared keys, three things are handled independently: + +- **userGroups changed** — added groups are incremented, removed groups are decremented, then `userGroups` array is recomputed from the updated counts +- **isPublished flipped true** — sets `isPublished: true` inline; `false` is left to the cronjob +- **humanReadableName changed** — since `humanReadableName` is part of `_id`, this is treated as a delete of the old document + insert of a new one + +--- + +## Schema + +Each `MetadataKey` document has the following key fields: + +| Field | Type | Description | +| ------------------- | --------------------- | ---------------------------------------------------------------------------------------- | +| `_id` | `string` | Composite key: `${sourceType}_${key}_${humanReadableName}` | +| `key` | `string` | The raw metadata key name | +| `humanReadableName` | `string` | Human-readable label from `human_name`, empty string if absent | +| `sourceType` | `string` | Source collection: `Dataset`, `Proposal`, `Sample`, etc. | +| `userGroups` | `string[]` | Groups that can see this key — kept in sync with `userGroupCounts` for query performance | +| `userGroupCounts` | `Map` | Per-group reference counts — source of truth for safe group removal | +| `usageCount` | `number` | Total datasets referencing this key — authoritative deletion signal | +| `isPublished` | `boolean` | True if any contributing dataset is published | + +`userGroups` and `userGroupCounts` are intentionally redundant. `userGroupCounts` owns the truth and enables safe atomic decrements. `userGroups` is a denormalized array kept for query performance — MongoDB's multikey index on `userGroups` makes `{ userGroups: { $in: [...] } }` efficient in a way that querying Map keys directly is not. 
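The relationship between the two group fields can be sketched as follows. This is an in-memory model under stated assumptions: `GroupState` and `releaseGroups` are hypothetical names, and the real service would apply the equivalent changes as MongoDB updates rather than on plain objects.

```typescript
// Hypothetical in-memory model of the two group fields on a MetadataKey.
interface GroupState {
  userGroups: string[];
  userGroupCounts: Record<string, number>;
}

// Decrement the count of every group the removed source belonged to, then
// rebuild userGroups from the counts that are still positive.
function releaseGroups(state: GroupState, sourceGroups: string[]): GroupState {
  const counts: Record<string, number> = { ...state.userGroupCounts };

  sourceGroups.forEach((group) => {
    const current = counts[group];
    if (current !== undefined) {
      counts[group] = current - 1;
    }
  });

  const userGroups = Object.keys(counts).filter(
    (group) => (counts[group] ?? 0) > 0,
  );

  return { userGroups, userGroupCounts: counts };
}

const before: GroupState = {
  userGroups: ["group-1", "group-2"],
  userGroupCounts: { "group-1": 2, "group-2": 1 },
};

// A dataset visible to both groups is removed.
const after = releaseGroups(before, ["group-1", "group-2"]);
```

In this example `group-2` leaves `userGroups` because its count reaches zero, while `group-1` stays because another dataset still references the key. The sketch keeps the zero-count entry in `userGroupCounts`; whether the service prunes such entries is not stated here.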
+ +--- + +## Filter Examples + +List metadata keys visible to the current user for a given source type: ```json { "where": { - "sourceType": "dataset", - "sourceId": "datasetId" + "sourceType": "Dataset" }, - "fields": ["humanreadableName", "key"], + "fields": ["key", "humanReadableName"], "limits": { "limit": 10, "skip": 0, "sort": { - "createdAt": "asc | desc" + "createdAt": "desc" } } } ``` -To retrieve a specific metadata key, use the following filter: +Find a specific key by name: + +```json +{ + "where": { + "sourceType": "Dataset", + "key": "temperature" + }, + "limits": { + "limit": 1, + "skip": 0 + } +} +``` + +Partial search on `key`: ```json { "where": { - "sourceType": "dataset", - "sourceId": "datasetId", - "key": "metadata_key_name" + "sourceType": "Dataset", + "key": { "$regex": "temp", "$options": "i" } }, - "fields": ["key"], "limits": { "limit": 10, - "skip": 0, - "sort": { - "createdAt": "asc | desc" - } + "skip": 0 } } ``` + +Partial search on `humanReadableName`: + +```json +{ + "where": { + "sourceType": "Dataset", + "humanReadableName": { "$regex": "temp", "$options": "i" } + }, + "limits": { + "limit": 10, + "skip": 0 + } +} +``` + +--- + +## Initial Migration + +The `MetadataKeys` collection is populated by a migration script that must be run manually before the service is deployed for the first time. + +See: `migrations/20260420145401-sync-dataset-scientificMetadata-to-metadatakeys.js` + +Documentation: `migrations/20260420145401-sync-dataset-scientificMetadata-to-metadatakeys.md` + +> ⚠️ The application will start normally without the migration, but the MetadataKeys service will return empty results until it is run. 
diff --git a/docs/developer-guide/migrations/20260420145401-sync-dataset-scientificMetadata-to-metadatakeys.md b/docs/developer-guide/migrations/20260420145401-sync-dataset-scientificMetadata-to-metadatakeys.md new file mode 100644 index 000000000..250483db1 --- /dev/null +++ b/docs/developer-guide/migrations/20260420145401-sync-dataset-scientificMetadata-to-metadatakeys.md @@ -0,0 +1,511 @@ +# 20260420145401 — Sync Dataset scientificMetadata to MetadataKeys + +## What this migration does + +Rebuilds the `MetadataKeys` collection from scratch by scanning all `Dataset` documents and extracting every key found in `scientificMetadata`. + +Each unique `(sourceType, key, humanReadableName)` combination becomes one `MetadataKey` document. If the same key appears across multiple datasets, their `userGroups` and counts are merged into a single entry. + +--- + +## Why it exists + +The `MetadataKeys` collection powers metadata key search and access control. This migration performs the initial population of the collection and must be run once before the service can operate. 
+ +Each `MetadataKey` document tracks: + +- `userGroupCounts: Map` — how many datasets per group reference this key, enabling safe atomic group removal when a dataset is updated or deleted +- `usageCount: number` — total datasets referencing this key regardless of groups, used as the authoritative deletion signal + +--- + +## Source data shape + +```js +// Dataset document +{ + ..., + _id: "uuid-A", + ownerGroup: "group-1", // mandatory + accessGroups: ["group-2"], // optional + isPublished: true, + scientificMetadata: { + temperature: { value: 100, unit: "C", human_name: "Temperature" }, + pressure: { value: 1, unit: "bar" }, // no human_name + } +} +``` + +--- + +## MetadataKey shape + +```js +// MetadataKey document +{ + _id: "550e8400-e29b-41d4-a716-446655440000", + id: "550e8400-e29b-41d4-a716-446655440000", + key: "temperature", + humanReadableName: "Temperature", + sourceType: "Dataset", + isPublished: true, + usageCount: 2, + userGroups: ["group-1", "group-2"], + userGroupCounts: { "group-1": 2, "group-2": 1 }, + createdBy: "migration", + createdAt: ISODate("...") +} +``` + +--- + +## Migration Pipeline walkthrough + +The pipeline builds the `MetadataKeys` collection by extracting and aggregating scientific metadata keys from datasets. Each document in `MetadataKeys` represents one unique metadata key, enriched with access group membership, usage counts, and publication status. + +--- + +### Stage 1 — Flatten scientificMetadata into an array + +```js +{ + $project: { + datasetId: "$_id", + ownerGroup: 1, + accessGroups: 1, + isPublished: 1, + metaArr: { $objectToArray: "$scientificMetadata" }, + }, +} +``` + +**What it does:** Converts the scientificMetadata object into an array of {k, v} pairs so it can be unwound in the next stage. Preserves \_id as datasetId for later use in usage counting. 
+ +**Input** + +```js +{ + "_id": "ds1", + "ownerGroup": "groupA", + "accessGroups": ["groupB"], + "isPublished": false, + "scientificMetadata": { + "temperature": { "human_name": "Temperature", "value": 100 }, + "pressure": { "human_name": "Pressure", "value": 200 } + } +} +``` + +**Output** + +```js +{ + "datasetId": "ds1", + "ownerGroup": "groupA", + "accessGroups": ["groupB"], + "isPublished": false, + "metaArr": [ + { "k": "temperature", "v": { "human_name": "Temperature", "value": 100 } }, + { "k": "pressure", "v": { "human_name": "Pressure", "value": 200 } } + ] +} +``` + +--- + +### Stage 2 — One document per metadata key + +```js +{ + $unwind: "$metaArr", +} +``` + +**What it does:** Produces one document per metadata key entry. A dataset with N metadata keys becomes N documents. + +**Input (from stage1)** + +```js +{ + "datasetId": "ds1", + "metaArr": [ + { "k": "temperature", "v": { "human_name": "Temperature" } }, + { "k": "pressure", "v": { "human_name": "Pressure" } } + ] +} +``` + +**Output** + +```js +{ "datasetId": "ds1", "metaArr": { "k": "temperature", "v": { "human_name": "Temperature" } } } +{ "datasetId": "ds1", "metaArr": { "k": "pressure", "v": { "human_name": "Pressure" } } } +``` + +--- + +### Stage 3 — Shape each document (datasetId+key) with humanReadableName and userGroups + +```js +{ + $project: { + datasetId: 1, + key: "$metaArr.k", + isPublished: 1, + humanReadableName: { $ifNull: ["$metaArr.v.human_name", ""] }, + userGroups: { + $setUnion: [["$ownerGroup"], { $ifNull: ["$accessGroups", []] }], + }, + }, +} +``` + +**What it does:** Extracts the key name and human-readable name. Computes userGroups as the union of ownerGroup and accessGroups — every group that has access to this dataset. 
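For readers less familiar with aggregation operators, the Stage 3 projection can be approximated in plain TypeScript. This is only a sketch: `UnwoundEntry`, `ShapedEntry`, and `shapeEntry` are hypothetical names, and unlike `$setUnion`, which leaves element order unspecified, this version keeps insertion order.

```typescript
// Hypothetical types mirroring one Stage 2 output document.
interface UnwoundEntry {
  datasetId: string;
  ownerGroup: string;
  accessGroups?: string[] | null;
  isPublished: boolean;
  metaArr: { k: string; v: { human_name?: string | null } };
}

interface ShapedEntry {
  datasetId: string;
  key: string;
  isPublished: boolean;
  humanReadableName: string;
  userGroups: string[];
}

function shapeEntry(doc: UnwoundEntry): ShapedEntry {
  // Mirrors $ifNull: a missing or null accessGroups becomes an empty list.
  const access = doc.accessGroups ?? [];
  // Mirrors $setUnion: duplicate group names collapse into one entry.
  const userGroups = Array.from(new Set([doc.ownerGroup].concat(access)));

  return {
    datasetId: doc.datasetId,
    key: doc.metaArr.k,
    isPublished: doc.isPublished,
    // Mirrors $ifNull on human_name: fall back to an empty string.
    humanReadableName: doc.metaArr.v.human_name ?? "",
    userGroups,
  };
}

// The owner group also appears in accessGroups and must not be counted twice.
const shaped = shapeEntry({
  datasetId: "ds1",
  ownerGroup: "groupA",
  accessGroups: ["groupB", "groupA"],
  isPublished: false,
  metaArr: { k: "pressure", v: {} },
});
```

Deduplicating here matters for the later stages: the per-group counts would be inflated if a dataset listed its owner group again under `accessGroups`.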
+ +**Input (from stage2)** + +```js +{ + "datasetId": "ds1", + "ownerGroup": "groupA", + "accessGroups": ["groupB"], + "isPublished": false, + "metaArr": { "k": "temperature", "v": { "human_name": "Temperature" } } +} +``` + +**Output** + +```js +{ + "datasetId": "ds1", + "key": "temperature", + "humanReadableName": "Temperature", + "isPublished": false, + "userGroups": ["groupA", "groupB"] +} +``` + +--- + +### Stage 4 — One document per (dataset+key+group) + +```js +{ + $unwind: { + path: "$userGroups", + }, +} +``` + +**What it does:** Splits userGroups so each group gets its own document. This allows grouping by (key, group) in Stage 5. + +**Input (from stage3)** + +```js +{ + "datasetId": "ds1", + "key": "temperature", + "userGroups": ["groupA", "groupB"] +} +``` + +**Output** + +```js +{ "datasetId": "ds1", "key": "temperature", "userGroups": "groupA" } +{ "datasetId": "ds1", "key": "temperature", "userGroups": "groupB" } +``` + +--- + +### Stage 5 — Group by (metaKeyId, group) + +```js +{ + $group: { + _id: { + metaKeyId: { $concat: [`${sourceType}_`, "$key", "_", "$humanReadableName"] }, + group: "$userGroups", + }, + key: { $first: "$key" }, + humanReadableName: { $first: "$humanReadableName" }, + isPublished: { $max: "$isPublished" }, + groupCount: { $sum: 1 }, + datasetIds: { $addToSet: "$datasetId" }, + }, +} +``` + +**What it does:** Groups by (metadata key, group) pair. Computes: + +- `metaKeyId` is a stable, deterministic identifier derived from `${sourceType}_${key}_${humanReadableName}` used as the merge key in Stage 9 to prevent duplicate documents across pipeline runs. 
+- `groupCount` indicates how many datasets with this group use this key +- `datasetIds` includes distinct dataset IDs for this group, used later to count unique datasets across all groups without double-counting + +**Input (from stage4)** + +```js +{ "datasetId": "ds1", "key": "temperature", "humanReadableName": "Temperature", "userGroups": "groupA", "isPublished": false } +{ "datasetId": "ds2", "key": "temperature", "humanReadableName": "Temperature", "userGroups": "groupA", "isPublished": true } +{ "datasetId": "ds1", "key": "temperature", "humanReadableName": "Temperature", "userGroups": "groupB", "isPublished": false } +``` + +**Output** + +```js +{ + "_id": { "metaKeyId": "dataset_temperature_Temperature", "group": "groupA" }, + "key": "temperature", + "humanReadableName": "Temperature", + "isPublished": true, + "groupCount": 2, + "datasetIds": ["ds1", "ds2"] +} +{ + "_id": { "metaKeyId": "dataset_temperature_Temperature", "group": "groupB" }, + "key": "temperature", + "humanReadableName": "Temperature", + "isPublished": false, + "groupCount": 1, + "datasetIds": ["ds1"] +} +``` + +--- + +### Stage 6 — Group by metaKeyId + +```js +{ + $group: { + _id: "$_id.metaKeyId", + key: { $first: "$key" }, + humanReadableName: { $first: "$humanReadableName" }, + isPublished: { $max: "$isPublished" }, + userGroups: { $push: "$_id.group" }, + userGroupCountsArr: { $push: { k: "$_id.group", v: "$groupCount" } }, + datasetIdSets: { $push: "$datasetIds" }, + }, +} +``` + +**What it does:** Reassembles one document per metadata key by collecting all per-group data. datasetIdSets is a list of per-group dataset ID sets — merged in Stage 8 to compute total unique dataset count. 
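The reason for carrying per-group ID sets, rather than simply adding up the per-group counts, can be illustrated with a short TypeScript sketch. `countUniqueDatasets` is a hypothetical helper written for this explanation; the migration itself performs the equivalent step inside the aggregation pipeline.

```typescript
// Hypothetical helper: merges per-group dataset ID lists into one set and
// returns how many distinct datasets reference the key.
function countUniqueDatasets(datasetIdSets: string[][]): number {
  const unique = new Set<string>();
  datasetIdSets.forEach((ids) => {
    ids.forEach((id) => unique.add(id));
  });
  return unique.size;
}

// groupA sees ds1 and ds2, groupB sees only ds1.
const datasetIdSets = [["ds1", "ds2"], ["ds1"]];

// Distinct datasets across all groups.
const usageCount = countUniqueDatasets(datasetIdSets);

// Summing per-group counts instead would count ds1 once per group.
const summedGroupCounts = datasetIdSets.reduce(
  (total, ids) => total + ids.length,
  0,
);
```

Here `usageCount` is smaller than `summedGroupCounts` because `ds1` belongs to both groups, which is exactly the double-counting the set union avoids.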
+ +**Input (from stage5)** + +```js +{ "_id": { "metaKeyId": "dataset_temperature_Temperature", "group": "groupA" }, "groupCount": 2, "datasetIds": ["ds1", "ds2"] } +{ "_id": { "metaKeyId": "dataset_temperature_Temperature", "group": "groupB" }, "groupCount": 1, "datasetIds": ["ds1"] } +``` + +**Output** + +```js +{ + "_id": "dataset_temperature_Temperature", + "key": "temperature", + "humanReadableName": "Temperature", + "isPublished": true, + "userGroups": ["groupA", "groupB"], + "userGroupCountsArr": [ + { "k": "groupA", "v": 2 }, + { "k": "groupB", "v": 1 } + ], + "datasetIdSets": [["ds1", "ds2"], ["ds1"]] +} +``` + +--- + +### Stage 7 — Add generated UUID + +```js +{ + $addFields: { + metaKeyId: "$_id", + generatedId: { + $function: { + body: "function() { return UUID().toString().replace('UUID(\"', '').replace('\")', ''); }", + args: [], + lang: "js", + }, + }, + }, +} +``` + +**What it does:** Saves `_id`, which at this point is `${sourceType}_${key}_${humanReadableName}`, as `metaKeyId` before it gets replaced. Generates a UUID for \_id to support future document splitting when a document approaches MongoDB's 16MB size limit, while `metaKeyId` remains the stable merge key for Stage 9. + +**Input (from stage6)** + +```json +{ "_id": "dataset_temperature_Temperature", ... } +``` + +**Output** + +```json +{ + "_id": "dataset_temperature_Temperature", + "metaKeyId": "dataset_temperature_Temperature", + "generatedId": "550e8400-e29b-41d4-a716-446655440000", + ... 
+} +``` + +--- + +### Stage 8 — Project final document shape + +```js +{ + $project: { + _id: "$generatedId", + metaKeyId: 1, + key: 1, + sourceType: { $literal: sourceType }, + humanReadableName: 1, + isPublished: 1, + userGroups: 1, + userGroupCounts: { $arrayToObject: "$userGroupCountsArr" }, + usageCount: { + $size: { + $reduce: { + input: "$datasetIdSets", + initialValue: [], + in: { $setUnion: ["$$value", "$$this"] }, + }, + }, + }, + createdBy: { $literal: "migration" }, + createdAt: { $toDate: "$$NOW" }, + }, +} +``` + +**What it does:** Produces the final document shape for MetadataKeys. Converts userGroupCountsArr to an object map. Computes usageCount by merging all per-group datasetIdSets into a single set and counting — this avoids double-counting datasets that belong to multiple groups. + +**Why set union for usageCount:** + +```js +datasetIds: ["ds1", "ds2"]; //groupA +datasetIds: ["ds1"]; // groupB +union: ["ds1", "ds2"]; // usageCount = 2 (not 3) +``` + +**Input (from stage7)** + +```js +{ + "generatedId": "550e8400-e29b-41d4-a716-446655440000", + "metaKeyId": "dataset_temperature_Temperature", + "userGroupCountsArr": [{ "k": "groupA", "v": 2 }, { "k": "groupB", "v": 1 }], + "datasetIdSets": [["ds1", "ds2"], ["ds1"]] +} +``` + +**Output** + +```js +{ + "_id": "550e8400-e29b-41d4-a716-446655440000", + "metaKeyId": "dataset_temperature_Temperature", + "key": "temperature", + "sourceType": "dataset", + "humanReadableName": "Temperature", + "isPublished": true, + "userGroups": ["groupA", "groupB"], + "userGroupCounts": { "groupA": 2, "groupB": 1 }, + "usageCount": 2, + "createdBy": "migration", + "createdAt": "2026-05-07T00:00:00.000Z" +} +``` + +--- + +### Stage 9 — Merge into MetadataKeys + +```js +{ + $merge: { + into: "MetadataKeys", + on: "metaKeyId", + whenMatched: [ + { + $replaceWith: { + $mergeObjects: [ + "$$new", + { _id: "$_id" } + ] + } + } + ], + whenNotMatched: "insert", + }, +} +``` + +**What it does:** Upserts each document into 
`MetadataKeys` using `metaKeyId` as the match key. On match, replaces the existing document entirely with the incoming one, preserving only `_id` since MongoDB does not allow changing it. + +**Input (from stage8)** + +```json +{ + "_id": "550e8400-e29b-41d4-a716-446655440000", + "metaKeyId": "dataset_temperature_Temperature", + "userGroups": ["groupA", "groupB"], + "userGroupCounts": { "groupA": 2, "groupB": 1 }, + "usageCount": 2, + "isPublished": true +} +``` + +**Output (inserted or replaced):** + +```json +{ + "_id": "550e8400-e29b-41d4-a716-446655440000", + "metaKeyId": "dataset_temperature_Temperature", + "userGroups": ["groupA", "groupB"], + "userGroupCounts": { "groupA": 2, "groupB": 1 }, + "usageCount": 2, + "isPublished": true +} +``` + +## Running the migration + +```bash +# Run manually in production — ideally during low-traffic hours. +# The migration is slow but non-blocking: the app continues to serve +# requests while it runs. However, MetadataKeys will be unavailable +# for the duration since the collection is wiped at the start. +npm run migrate:db:up + +# verify (in mongosh) +db.MetadataKeys.countDocuments({ userGroupCounts: { $exists: true } }) +# should equal +db.MetadataKeys.countDocuments() +``` + +> ⚠️ **Do not interrupt.** The migration wipes `MetadataKeys` at the start with `deleteMany`. If interrupted, re-run `migrate:db:up` — the wipe ensures a clean slate on retry. + +--- + +## Rollback + +```bash +npm run migrate:db:down +``` + +Wipes the entire `MetadataKeys` collection. The collection will be repopulated on the next `migrate:db:up`. + +Verify the rollback succeeded by checking the migration status: + +```bash +npm run migrate:db:status +``` + +If the migration shows as `pending`, the rollback was successful and the migration has not been run yet. 
diff --git a/migrations/20260420145401-sync-dataset-scientificMetadata-to-metadatakeys.js b/migrations/20260420145401-sync-dataset-scientificMetadata-to-metadatakeys.js new file mode 100644 index 000000000..a42626c7c --- /dev/null +++ b/migrations/20260420145401-sync-dataset-scientificMetadata-to-metadatakeys.js @@ -0,0 +1,212 @@ +const SOURCE_COLLECTIONS = ["Dataset"]; +const BATCH_SIZE = 10000; + +function buildPipeline(sourceType) { + return [ + { + $project: { + datasetId: "$_id", + ownerGroup: 1, + accessGroups: 1, + isPublished: 1, + metaArr: { $objectToArray: "$scientificMetadata" }, + }, + }, + { $unwind: "$metaArr" }, + { + $project: { + datasetId: 1, + key: "$metaArr.k", + isPublished: 1, + humanReadableName: { $ifNull: ["$metaArr.v.human_name", ""] }, + userGroups: { + $setUnion: [["$ownerGroup"], { $ifNull: ["$accessGroups", []] }], + }, + }, + }, + { + $unwind: { + path: "$userGroups", + }, + }, + { + $group: { + _id: { + metaKeyId: { + $concat: [`${sourceType}_`, "$key", "_", "$humanReadableName"], + }, + group: "$userGroups", + }, + key: { $first: "$key" }, + humanReadableName: { $first: "$humanReadableName" }, + isPublished: { $max: "$isPublished" }, + groupCount: { $sum: 1 }, + datasetIds: { $addToSet: "$datasetId" }, + }, + }, + { + $group: { + _id: "$_id.metaKeyId", + key: { $first: "$key" }, + humanReadableName: { $first: "$humanReadableName" }, + isPublished: { $max: "$isPublished" }, + userGroups: { $push: "$_id.group" }, + userGroupCountsArr: { + $push: { k: "$_id.group", v: "$groupCount" }, + }, + datasetIdSets: { $push: "$datasetIds" }, + }, + }, + { + $addFields: { + metaKeyId: "$_id", + generatedId: { + $function: { + body: "function() { return UUID().toString().replace('UUID(\"', '').replace('\")', ''); }", + args: [], + lang: "js", + }, + }, + }, + }, + { + $project: { + _id: "$generatedId", + metaKeyId: 1, + key: 1, + sourceType: { $literal: sourceType }, + humanReadableName: 1, + isPublished: 1, + userGroups: 1, + userGroupCounts: { 
$arrayToObject: "$userGroupCountsArr" }, + usageCount: { + $size: { + $reduce: { + input: "$datasetIdSets", + initialValue: [], + in: { $setUnion: ["$$value", "$$this"] }, + }, + }, + }, + createdBy: { $literal: "migration" }, + createdAt: { $toDate: "$$NOW" }, + }, + }, + { + $merge: { + into: "MetadataKeys", + on: "metaKeyId", + whenMatched: [ + { + $replaceWith: { + $mergeObjects: ["$$new", { _id: "$_id" }], + }, + }, + ], + whenNotMatched: "insert", + }, + }, + ]; +} + +module.exports = { + async up(db) { + const start = Date.now(); + const elapsed = () => `${((Date.now() - start) / 1000).toFixed(1)}s`; + + // Wipe MetadataKeys collection first to ensure a clean state + const deleted = await db.collection("MetadataKeys").deleteMany({}); + + await db + .collection("MetadataKeys") + .createIndex({ metaKeyId: 1 }, { unique: true }); + + console.log( + `[${elapsed()}] Cleared ${deleted.deletedCount} existing MetadataKeys`, + ); + + for (const collection of SOURCE_COLLECTIONS) { + const total = await db.collection(collection).countDocuments({ + scientificMetadata: { $exists: true, $type: "object" }, + }); + + if (total === 0) { + console.log( + `[${elapsed()}] No documents with scientificMetadata in ${collection}, skipping...`, + ); + continue; + } + + console.log( + `[${elapsed()}] Processing ${total.toLocaleString()} documents from ${collection}...`, + ); + + let lastId = null; + let processed = 0; + + while (true) { + const match = { + scientificMetadata: { $exists: true, $type: "object" }, + ...(lastId && { _id: { $gt: lastId } }), + }; + + const batch = await db + .collection(collection) + .find(match) + .sort({ _id: 1 }) + .limit(BATCH_SIZE) + .project({ _id: 1 }) + .toArray(); + + if (batch.length === 0) break; + + const batchIds = batch.map((d) => d._id); + + await db + .collection(collection) + .aggregate( + [ + { $match: { _id: { $in: batchIds } } }, + ...buildPipeline(collection), + ], + { allowDiskUse: true, maxTimeMS: 0 }, + ) + .toArray(); + + lastId = 
batch[batch.length - 1]._id; + processed += batch.length; + + console.log( + `[${elapsed()}] ${collection}: ${processed.toLocaleString()}/${total.toLocaleString()}`, + ); + } + + console.log(`[${elapsed()}] ✅ ${collection} done`); + } + + await db.collection("MetadataKeys").dropIndex("metaKeyId_1"); + await db + .collection("MetadataKeys") + .updateMany({}, [{ $set: { id: "$_id" } }, { $unset: ["metaKeyId"] }]); + + const result = await db.collection("MetadataKeys").countDocuments(); + console.log( + `[${elapsed()}] Migration completed — Total MetadataKeys: ${result.toLocaleString()}`, + ); + }, + + async down(db) { + const start = Date.now(); + const elapsed = () => `${((Date.now() - start) / 1000).toFixed(1)}s`; + + const total = await db.collection("MetadataKeys").countDocuments(); + console.log( + `[${elapsed()}] Deleting ${total.toLocaleString()} MetadataKeys...`, + ); + + const deleted = await db.collection("MetadataKeys").deleteMany({}); + console.log( + `[${elapsed()}] Rollback completed — Deleted ${deleted.deletedCount} MetadataKeys`, + ); + }, +}; diff --git a/src/common/utils.ts b/src/common/utils.ts index 8c6b67ea0..90eb96f75 100644 --- a/src/common/utils.ts +++ b/src/common/utils.ts @@ -12,6 +12,7 @@ import { import { ScientificRelation } from "./scientific-relation.enum"; import { DatasetType } from "src/datasets/types/dataset-type.enum"; import { isPlainObject, mapValues, omit, pickBy, some } from "lodash"; +import { MetadataSourceDoc } from "src/metadata-keys/metadatakeys.service"; // add Å to mathjs accepted units as equivalent to angstrom const isAlphaOriginal = Unit.isValidAlpha; @@ -1343,3 +1344,34 @@ export function parseDate(dateString?: string): Date | undefined { const parsedDate = new Date(dateString); return isNaN(parsedDate.getTime()) ? 
undefined : parsedDate; } + +export function createMetadataKeysInstance( + sourceType: string, + doc: { + ownerGroup?: string; + accessGroups?: string[]; + isPublished?: boolean; + scientificMetadata?: Record; + metadata?: Record; + customMetadata?: Record; + sampleCharacteristics?: Record; + }, +): MetadataSourceDoc { + return { + sourceType, + userGroups: Array.from( + new Set( + [doc.ownerGroup, ...(doc.accessGroups ?? [])].filter( + Boolean, + ) as string[], + ), + ), + isPublished: doc.isPublished ?? false, + metadata: + doc.scientificMetadata ?? + doc.metadata ?? + doc.customMetadata ?? + doc.sampleCharacteristics ?? + {}, + }; +} diff --git a/src/datasets/datasets.service.spec.ts b/src/datasets/datasets.service.spec.ts index 45888e188..c8d087a53 100644 --- a/src/datasets/datasets.service.spec.ts +++ b/src/datasets/datasets.service.spec.ts @@ -98,6 +98,15 @@ const mockDataset: DatasetClass = { dataQualityMetrics: 1, }; +const mockDatasetModel = function (data: DatasetClass) { + return { + ...data, + save: jest.fn().mockResolvedValue(data), + toObject: jest.fn().mockReturnValue(data), + }; +}; +mockDatasetModel.collection = { name: "Dataset" }; + describe("DatasetsService", () => { let service: DatasetsService; let model: Model; @@ -108,13 +117,7 @@ describe("DatasetsService", () => { ConfigService, { provide: getModelToken("DatasetClass"), - useValue: function (data: DatasetClass) { - return { - ...data, - save: jest.fn().mockResolvedValue(data), - toObject: jest.fn().mockReturnValue(data), - }; - }, + useValue: mockDatasetModel, }, DatasetsService, DatasetsAccessService, @@ -175,11 +178,11 @@ describe("DatasetsService", () => { ).toBe("Already Encoded"); }); - it("should throw NotFoundException if no document is found for update and no unmodifiedSince is provided", async () => { + it("should throw NotFoundException if no document is found", async () => { const updateDto = { datasetName: "Updated Name" }; - model.findOneAndUpdate = jest + model.findOne = jest 
.fn() - .mockReturnValue({ exec: jest.fn().mockReturnValue(null) }); + .mockReturnValue({ exec: jest.fn().mockResolvedValue(null) }); await expect( service.findByIdAndUpdate("testId", updateDto), ).rejects.toThrow(NotFoundException); @@ -188,6 +191,9 @@ describe("DatasetsService", () => { it("should throw PreconditionedFailed if no patched dataset is returned (indicating a concurrent modification)", async () => { const updateDto = { datasetName: "Updated Name" }; const unmodifiedSince = new Date("2021-11-11T12:29:02.083Z"); + model.findOne = jest + .fn() + .mockReturnValue({ exec: jest.fn().mockResolvedValue(mockDataset) }); model.findOneAndUpdate = jest .fn() .mockReturnValue({ exec: jest.fn().mockReturnValue(null) }); diff --git a/src/datasets/datasets.service.ts b/src/datasets/datasets.service.ts index ab124ecb9..2b44783e4 100644 --- a/src/datasets/datasets.service.ts +++ b/src/datasets/datasets.service.ts @@ -36,6 +36,7 @@ import { parsePipelineProjection, parsePipelineSort, decodeMetadataKeyStrings, + createMetadataKeysInstance, } from "src/common/utils"; import { DatasetsAccessService } from "./datasets-access.service"; import { CreateDatasetDto } from "./dto/create-dataset.dto"; @@ -62,19 +63,15 @@ import { DatasetLookupKeysEnum, } from "./types/dataset-lookup"; import { ProposalsService } from "src/proposals/proposals.service"; -import { - MetadataKeysService, - MetadataSourceDoc, -} from "src/metadata-keys/metadatakeys.service"; +import { MetadataKeysService } from "src/metadata-keys/metadatakeys.service"; import { OpensearchService } from "src/opensearch/opensearch.service"; -import type { IndexSettings } from "@opensearch-project/opensearch/api/_types/indices._common"; -import type { TypeMapping } from "@opensearch-project/opensearch/api/_types/_common.mapping"; -import { BulkStats } from "@opensearch-project/opensearch/lib/Helpers"; +import { BulkStats } from "@opensearch-project/opensearch/lib/Helpers.js"; +import { IndexSettings } from 
"@opensearch-project/opensearch/api/_types/indices._common.js"; +import { TypeMapping } from "@opensearch-project/opensearch/api/_types/_common.mapping.js"; import { DatasetOpenSearchDto } from "src/opensearch/dto/dataset-opensearch.dto"; import { plainToInstance } from "class-transformer"; import { DATASET_OPENSEARCH_PROJECTION } from "../opensearch/utils/dataset-opensearch.utils"; import { withOCCFilter } from "./utils/occ-util"; - @Injectable({ scope: Scope.REQUEST }) export class DatasetsService { private readonly osDefaultIndex: string; @@ -100,20 +97,6 @@ export class DatasetsService { this.configService.get("opensearch.dataSyncBatchSize") || 1000; } - private createMetadataKeysInstance( - doc: UpdateQuery, - ): MetadataSourceDoc { - const source: MetadataSourceDoc = { - sourceType: "dataset", - sourceId: doc.pid, - ownerGroup: doc.owner, - accessGroups: doc.accessGroups || [], - isPublished: doc.isPublished || false, - metadata: doc.scientificMetadata ?? {}, - }; - return source; - } - addLookupFields( pipeline: PipelineStage[], datasetLookupFields?: (DatasetLookupKeysEnum | IDatasetRelation)[], @@ -218,7 +201,10 @@ export class DatasetsService { } this.metadataKeysService.insertManyFromSource( - this.createMetadataKeysInstance(savedDataset), + createMetadataKeysInstance( + this.datasetModel.collection.name, + savedDataset, + ), ); return savedDataset; @@ -503,7 +489,14 @@ export class DatasetsService { } await this.metadataKeysService.replaceManyFromSource( - this.createMetadataKeysInstance(updatedDataset), + createMetadataKeysInstance( + this.datasetModel.collection.name, + existingDataset, + ), + createMetadataKeysInstance( + this.datasetModel.collection.name, + updatedDataset, + ), ); // we were able to find the dataset and update it return updatedDataset; @@ -521,6 +514,11 @@ export class DatasetsService { ): Promise { const username = (this.request.user as JWTUser).username; + const existingDataset = await this.datasetModel.findOne({ pid: id }).exec(); 
+ if (!existingDataset) { + throw new NotFoundException(`Dataset #${id} not found`); + } + // NOTE: When doing findByIdAndUpdate in mongoose it does reset the subdocuments to default values if no value is provided // https://stackoverflow.com/questions/57324321/mongoose-overwriting-data-in-mongodb-with-default-values-in-subdocuments let queryFilter: FilterQuery = { pid: id }; @@ -539,7 +537,7 @@ export class DatasetsService { // check if we were able to find the dataset (matching the precondition, if supplied) and update it if (!patchedDataset) { if (!unmodifiedSince) { - throw new NotFoundException(`Dataset #${id} not found`); + throw new NotFoundException(`Dataset #${id} failed to update.`); } throw new PreconditionFailedException( `Dataset #${id} has been modified on the server since ${unmodifiedSince.toUTCString()}.`, @@ -555,7 +553,14 @@ export class DatasetsService { } await this.metadataKeysService.replaceManyFromSource( - this.createMetadataKeysInstance(patchedDataset), + createMetadataKeysInstance( + this.datasetModel.collection.name, + existingDataset, + ), + createMetadataKeysInstance( + this.datasetModel.collection.name, + patchedDataset, + ), ); // we were able to find the dataset and update it return patchedDataset; @@ -584,10 +589,12 @@ export class DatasetsService { } // delete metadata keys associated with this dataset - await this.metadataKeysService.deleteMany({ - sourceId: id, - sourceType: "dataset", - }); + await this.metadataKeysService.deleteMany( + createMetadataKeysInstance( + this.datasetModel.collection.name, + deletedDataset, + ), + ); return deletedDataset; } diff --git a/src/instruments/instruments.service.ts b/src/instruments/instruments.service.ts index 7d961ced6..8b410e9cb 100644 --- a/src/instruments/instruments.service.ts +++ b/src/instruments/instruments.service.ts @@ -6,13 +6,14 @@ import { PreconditionFailedException, } from "@nestjs/common"; import { InjectModel } from "@nestjs/mongoose"; -import { FilterQuery, Model, 
UpdateQuery } from "mongoose"; +import { FilterQuery, Model } from "mongoose"; import { IFilters } from "src/common/interfaces/common.interface"; import { CountApiResponse } from "src/common/types"; import { parseLimitFilters, addCreatedByFields, addUpdatedByField, + createMetadataKeysInstance, } from "src/common/utils"; import { CreateInstrumentDto } from "./dto/create-instrument.dto"; import { PartialUpdateInstrumentDto } from "./dto/update-instrument.dto"; @@ -20,10 +21,7 @@ import { Instrument, InstrumentDocument } from "./schemas/instrument.schema"; import { JWTUser } from "src/auth/interfaces/jwt-user.interface"; import { REQUEST } from "@nestjs/core"; import { Request } from "express"; -import { - MetadataKeysService, - MetadataSourceDoc, -} from "src/metadata-keys/metadatakeys.service"; +import { MetadataKeysService } from "src/metadata-keys/metadatakeys.service"; import { withOCCFilter } from "src/datasets/utils/occ-util"; @Injectable({ scope: Scope.REQUEST }) @@ -35,29 +33,17 @@ export class InstrumentsService { @Inject(REQUEST) private request: Request, ) {} - private createMetadataKeysInstance( - doc: UpdateQuery, - ): MetadataSourceDoc { - const source: MetadataSourceDoc = { - sourceType: "instrument", - sourceId: doc.pid, - ownerGroup: doc.ownerGroup, - accessGroups: doc.accessGroups || [], - isPublished: doc.isPublished || false, - metadata: doc.customMetadata ?? 
{}, - }; - return source; - } - async create(createInstrumentDto: CreateInstrumentDto): Promise { const username = (this.request.user as JWTUser).username; const createdInstrument = new this.instrumentModel( addCreatedByFields(createInstrumentDto, username), ); const savedInstrument = await createdInstrument.save(); - await this.metadataKeysService.insertManyFromSource( - this.createMetadataKeysInstance(savedInstrument), + createMetadataKeysInstance(this.instrumentModel.collection.name, { + ...savedInstrument.toObject(), + isPublished: true, + }), ); return savedInstrument; @@ -104,6 +90,15 @@ export class InstrumentsService { unmodifiedSince?: Date, ): Promise { const username = (this.request.user as JWTUser).username; + const existingInstrument = await this.instrumentModel + .findOne(filter) + .exec(); + + if (!existingInstrument) { + throw new NotFoundException( + `Instrument not found with filter: ${JSON.stringify(filter)}`, + ); + } const queryFilter = withOCCFilter(filter, unmodifiedSince); @@ -132,7 +127,14 @@ export class InstrumentsService { } await this.metadataKeysService.replaceManyFromSource( - this.createMetadataKeysInstance(updatedInstrument), + createMetadataKeysInstance(this.instrumentModel.collection.name, { + ...existingInstrument.toObject(), + isPublished: true, + }), + createMetadataKeysInstance(this.instrumentModel.collection.name, { + ...updatedInstrument.toObject(), + isPublished: true, + }), ); return updatedInstrument; @@ -150,7 +152,10 @@ export class InstrumentsService { } await this.metadataKeysService.deleteMany( - this.createMetadataKeysInstance(deletedInstrument), + createMetadataKeysInstance( + this.instrumentModel.collection.name, + deletedInstrument, + ), ); return deletedInstrument; diff --git a/src/metadata-keys/dto/create-metadata-key.dto.ts b/src/metadata-keys/dto/create-metadata-key.dto.ts index 4e5ca5b2f..cd71b7365 100644 --- a/src/metadata-keys/dto/create-metadata-key.dto.ts +++ 
b/src/metadata-keys/dto/create-metadata-key.dto.ts @@ -11,12 +11,4 @@ export class CreateMetadataKeyDto extends UpdateMetadataKeyDto { }) @IsString() sourceType: string; - - @ApiProperty({ - type: String, - required: true, - description: "Unique identifier of the source item this key is linked to.", - }) - @IsString() - sourceId: string; } diff --git a/src/metadata-keys/dto/output-metadata-key.dto.ts b/src/metadata-keys/dto/output-metadata-key.dto.ts index ad561ddc3..bf242f8c6 100644 --- a/src/metadata-keys/dto/output-metadata-key.dto.ts +++ b/src/metadata-keys/dto/output-metadata-key.dto.ts @@ -3,6 +3,7 @@ import { ArrayNotEmpty, IsArray, IsBoolean, + IsNumber, IsOptional, IsString, } from "class-validator"; @@ -56,12 +57,13 @@ export class OutputMetadataKeyDto extends QueryableClass { sourceType: string; @ApiProperty({ - type: String, + type: Number, required: true, - description: "Unique identifier of the source item this key is linked to.", + description: + "Tracks how many sources are using this metadata key. 
Managed internally.", }) - @IsString() - sourceId: string; + @IsNumber() + usageCount: number; @ApiProperty({ type: Boolean, diff --git a/src/metadata-keys/metadatakeys.service.spec.ts b/src/metadata-keys/metadatakeys.service.spec.ts index 152b96727..812be350a 100644 --- a/src/metadata-keys/metadatakeys.service.spec.ts +++ b/src/metadata-keys/metadatakeys.service.spec.ts @@ -1,141 +1,368 @@ -import { Logger } from "@nestjs/common"; +/* eslint-disable @typescript-eslint/no-explicit-any */ import { getModelToken } from "@nestjs/mongoose"; import { Test, TestingModule } from "@nestjs/testing"; import { MetadataKeysService, MetadataSourceDoc } from "./metadatakeys.service"; import { MetadataKeyClass } from "./schemas/metadatakey.schema"; -class MetadataKeyModelMock { - aggregate = jest.fn().mockReturnValue({ - exec: jest.fn().mockResolvedValue([{ key: "k1" }]), - }); - deleteMany = jest.fn().mockReturnValue({ - exec: jest.fn().mockResolvedValue({ deletedCount: 2 }), - }); - insertMany = jest.fn().mockReturnValue([{ _id: "id1" }]); -} +const modelMock = { + aggregate: jest + .fn() + .mockReturnValue({ exec: jest.fn().mockResolvedValue([]) }), + bulkWrite: jest.fn().mockResolvedValue({}), + updateMany: jest.fn().mockResolvedValue({}), + deleteMany: jest.fn().mockResolvedValue({}), +}; + +const BASE_DOC: MetadataSourceDoc = { + sourceType: "Dataset", + userGroups: ["group-1", "group-2"], + isPublished: false, + metadata: { + temperature: { human_name: "Temperature" }, + pressure: {}, + }, +}; describe("MetadataKeysService", () => { let service: MetadataKeysService; - let model: MetadataKeyModelMock; beforeEach(async () => { + jest.clearAllMocks(); + const module: TestingModule = await Test.createTestingModule({ providers: [ MetadataKeysService, { provide: getModelToken(MetadataKeyClass.name), - useClass: MetadataKeyModelMock, + useValue: modelMock, }, ], }).compile(); service = await module.resolve(MetadataKeysService); - model = module.get( - 
getModelToken(MetadataKeyClass.name), - ) as unknown as MetadataKeyModelMock; }); it("should be defined", () => { expect(service).toBeDefined(); }); - it("findAll builds aggregation pipeline and executes it", async () => { - const filter = { - where: { sourceType: "dataset" }, - fields: ["key"], - limits: { limit: 10, skip: 0, sort: { createdAt: "asc" } }, - }; + // ------------------------------------------------------------------------- + // findAll + // ------------------------------------------------------------------------- - const accessFilter = { userGroups: { $in: ["ess"] } }; + describe("findAll", () => { + it("builds a pipeline with match, project, sort, skip, limit", async () => { + const accessFilter = { userGroups: { $in: ["group-1"] } }; + const filter = { + where: { sourceType: "Dataset" }, + fields: ["key", "humanReadableName"], + limits: { limit: 10, skip: 5, sort: { createdAt: "asc" } }, + }; - const res = await service.findAll(filter, accessFilter); + await service.findAll(filter, accessFilter); - expect(model.aggregate).toHaveBeenCalledTimes(1); + const [pipeline] = modelMock.aggregate.mock.calls[0]; + expect(pipeline[0]).toEqual({ + $match: { $and: [accessFilter, filter.where] }, + }); + expect(pipeline).toEqual( + expect.arrayContaining([ + expect.objectContaining({ $project: expect.any(Object) }), + expect.objectContaining({ $sort: expect.any(Object) }), + expect.objectContaining({ $skip: 5 }), + expect.objectContaining({ $limit: 10 }), + ]), + ); + }); - const pipeline = model.aggregate.mock.calls[0][0]; - expect(pipeline[0]).toEqual({ - $match: { $and: [accessFilter, filter.where] }, + it("applies default limits when none provided", async () => { + await service.findAll({}, {}); + + const [pipeline] = modelMock.aggregate.mock.calls[0]; + expect(pipeline).toEqual( + expect.arrayContaining([ + expect.objectContaining({ $skip: 0 }), + expect.objectContaining({ $limit: 100 }), + ]), + ); }); - expect(res).toEqual([{ key: "k1" }]); - }); + 
it("applies default sort when no limits provided", async () => { + await service.findAll({}, {}); + + const [pipeline] = modelMock.aggregate.mock.calls[0]; + expect(pipeline).toEqual( + expect.arrayContaining([ + expect.objectContaining({ $sort: { createdAt: -1 } }), + ]), + ); + }); + + it("applies default sort when limits provided without sort", async () => { + await service.findAll({ limits: { limit: 10, skip: 0 } }, {}); + + const [pipeline] = modelMock.aggregate.mock.calls[0]; + expect(pipeline).toEqual( + expect.arrayContaining([ + expect.objectContaining({ $sort: { createdAt: -1 } }), + ]), + ); + }); + + it("uses provided sort when specified", async () => { + await service.findAll( + { limits: { limit: 10, skip: 0, sort: { key: "asc" } } }, + {}, + ); + + const [pipeline] = modelMock.aggregate.mock.calls[0]; + expect(pipeline).toEqual( + expect.arrayContaining([ + expect.objectContaining({ + $sort: expect.objectContaining({ key: 1 }), + }), + ]), + ); + }); + + it("omits $project stage when no fields specified", async () => { + await service.findAll({}, {}); + + const [pipeline] = modelMock.aggregate.mock.calls[0]; + const hasProject = pipeline.some((s: object) => "$project" in s); + expect(hasProject).toBe(false); + }); + + it("sort stage comes before skip and limit", async () => { + await service.findAll({}, {}); - it("deleteMany deletes and logs result", async () => { - const logSpy = jest.spyOn(Logger, "log").mockImplementation(); + const [pipeline] = modelMock.aggregate.mock.calls[0]; + const sortIndex = pipeline.findIndex((s: object) => "$sort" in s); + const skipIndex = pipeline.findIndex((s: object) => "$skip" in s); + const limitIndex = pipeline.findIndex((s: object) => "$limit" in s); + expect(sortIndex).toBeLessThan(skipIndex); + expect(sortIndex).toBeLessThan(limitIndex); + }); - const filter = { sourceType: "dataset", sourceId: "sid" }; - const result = await service.deleteMany(filter); + it("returns aggregation results", async () => { + const 
mockData = [{ key: "temperature" }, { key: "pressure" }]; + modelMock.aggregate.mockReturnValueOnce({ + exec: jest.fn().mockResolvedValue(mockData), + }); - expect(model.deleteMany).toHaveBeenCalledWith(filter); - expect(result.deletedCount).toBe(2); - expect(logSpy).toHaveBeenCalled(); + const result = await service.findAll({}, {}); + + expect(result).toEqual(mockData); + }); }); - it("insertManyFromSource does nothing when metadata is empty", async () => { - const doc: MetadataSourceDoc = { - sourceId: "sid", - sourceType: "dataset", - ownerGroup: "ess", - accessGroups: ["ess"], - isPublished: false, - metadata: {}, - }; + // ------------------------------------------------------------------------- + // insertManyFromSource + // ------------------------------------------------------------------------- + + describe("insertManyFromSource", () => { + it("does nothing when metadata is empty", async () => { + await service.insertManyFromSource({ ...BASE_DOC, metadata: {} }); + + expect(modelMock.bulkWrite).not.toHaveBeenCalled(); + }); + + it("calls bulkWrite with one op per metadata key", async () => { + await service.insertManyFromSource(BASE_DOC); + + const [ops] = modelMock.bulkWrite.mock.calls[0]; + expect(ops).toHaveLength(2); + }); + + it("builds correct filter from sourceType + key + humanReadableName", async () => { + await service.insertManyFromSource(BASE_DOC); + + const [ops] = modelMock.bulkWrite.mock.calls[0]; + const filters = ops.map((op: any) => op.updateOne.filter); + expect(filters).toContainEqual({ + sourceType: "Dataset", + key: "temperature", + humanReadableName: "Temperature", + }); + expect(filters).toContainEqual({ + sourceType: "Dataset", + key: "pressure", + humanReadableName: "", + }); + }); + + it("increments usageCount and per-group counts", async () => { + await service.insertManyFromSource(BASE_DOC); + + const [ops] = modelMock.bulkWrite.mock.calls[0]; + const { update } = ops[0].updateOne; + expect(update.$inc.usageCount).toBe(1); + 
expect(update.$inc["userGroupCounts.group-1"]).toBe(1); + expect(update.$inc["userGroupCounts.group-2"]).toBe(1); + }); + + it("adds userGroups via $addToSet", async () => { + await service.insertManyFromSource(BASE_DOC); + + const [ops] = modelMock.bulkWrite.mock.calls[0]; + const { update } = ops[0].updateOne; + expect(update.$addToSet.userGroups.$each).toEqual(["group-1", "group-2"]); + }); + + it("sets isPublished when dataset is published", async () => { + await service.insertManyFromSource({ ...BASE_DOC, isPublished: true }); + + const [ops] = modelMock.bulkWrite.mock.calls[0]; + const { update } = ops[0].updateOne; + expect(update.$max.isPublished).toBe(true); + }); + + it("does not set isPublished when dataset is not published", async () => { + await service.insertManyFromSource({ ...BASE_DOC, isPublished: false }); + + const [ops] = modelMock.bulkWrite.mock.calls[0]; + const { update } = ops[0].updateOne; + expect(update.$max?.isPublished).toBeFalsy(); + }); - const res = await service.insertManyFromSource(doc); + it("sets upsert: true", async () => { + await service.insertManyFromSource(BASE_DOC); - expect(res).toBeUndefined(); - expect(model.insertMany).not.toHaveBeenCalled(); + const [ops] = modelMock.bulkWrite.mock.calls[0]; + expect(ops[0].updateOne.upsert).toBe(true); + }); }); - it("insertManyFromSource creates metadata keys", async () => { - const doc: MetadataSourceDoc = { - sourceId: "sid", - sourceType: "dataset", - ownerGroup: "ess", - accessGroups: ["swap", "ess"], - isPublished: false, - metadata: { - key1: { human_name: "Key One" }, - key2: {}, - }, - }; + // ------------------------------------------------------------------------- + // deleteMany + // ------------------------------------------------------------------------- + + describe("deleteMany", () => { + it("does nothing when metadata is empty", async () => { + await service.deleteMany({ ...BASE_DOC, metadata: {} }); + + expect(modelMock.bulkWrite).not.toHaveBeenCalled(); + 
expect(modelMock.updateMany).not.toHaveBeenCalled(); + expect(modelMock.deleteMany).not.toHaveBeenCalled(); + }); + + it("runs operations in order: decrement → recompute → delete", async () => { + const callOrder: string[] = []; + modelMock.bulkWrite.mockImplementation(() => { + callOrder.push("bulkWrite"); + return Promise.resolve({}); + }); + modelMock.updateMany.mockImplementation(() => { + callOrder.push("updateMany"); + return Promise.resolve({}); + }); + modelMock.deleteMany.mockImplementation(() => { + callOrder.push("deleteMany"); + return Promise.resolve({}); + }); + + await service.deleteMany(BASE_DOC); + + expect(callOrder).toEqual(["bulkWrite", "updateMany", "deleteMany"]); + }); + + it("decrements usageCount and per-group counts", async () => { + await service.deleteMany(BASE_DOC); - const res = await service.insertManyFromSource(doc); + const [ops] = modelMock.bulkWrite.mock.calls[0]; + const { update } = ops[0].updateOne; + expect(update.$inc.usageCount).toBe(-1); + expect(update.$inc["userGroupCounts.group-1"]).toBe(-1); + expect(update.$inc["userGroupCounts.group-2"]).toBe(-1); + }); - expect(model.insertMany).toHaveBeenCalledTimes(1); + it("targets correct filter based on metadata keys and humanReadableName", async () => { + await service.deleteMany(BASE_DOC); - const insertedDocs = model.insertMany.mock.calls[0][0]; - expect(insertedDocs).toHaveLength(2); + const [firstFilter] = modelMock.updateMany.mock.calls[0]; + expect(firstFilter.$or).toContainEqual({ + sourceType: "Dataset", + key: "temperature", + humanReadableName: "Temperature", + }); + expect(firstFilter.$or).toContainEqual({ + sourceType: "Dataset", + key: "pressure", + humanReadableName: "", + }); + }); - expect(insertedDocs[0]).toMatchObject({ - key: "key1", - humanReadableName: "Key One", - sourceType: "dataset", - sourceId: "sid", + it("deletes documents where usageCount <= 0", async () => { + await service.deleteMany(BASE_DOC); + + const [deleteFilter] = 
modelMock.deleteMany.mock.calls[0]; + expect(deleteFilter.$and[1].usageCount).toEqual({ $lte: 0 }); }); - expect(res).toEqual([{ _id: "id1" }]); + it("recompute stage uses $set with userGroups defined", async () => { + await service.deleteMany(BASE_DOC); + + const [, recomputeStage] = modelMock.updateMany.mock.calls[0]; + expect(recomputeStage[1].$set.userGroups).toBeDefined(); + }); + + it("does not set upsert on decrement ops", async () => { + await service.deleteMany(BASE_DOC); + + const [ops] = modelMock.bulkWrite.mock.calls[0]; + expect(ops[0].updateOne.upsert).toBe(false); + }); }); - it("replaceManyFromSource deletes then inserts", async () => { - const doc: MetadataSourceDoc = { - sourceId: "sid", - sourceType: "dataset", - ownerGroup: "ess", - accessGroups: ["swap"], - isPublished: false, - metadata: { key1: {} }, - }; + // ------------------------------------------------------------------------- + // replaceManyFromSource + // ------------------------------------------------------------------------- + + describe("replaceManyFromSource", () => { + it("calls deleteMany with oldDoc then insertManyFromSource with newDoc", async () => { + const deleteSpy = jest.spyOn(service, "deleteMany").mockResolvedValue(); + const insertSpy = jest + .spyOn(service, "insertManyFromSource") + .mockResolvedValue(); + + await service.replaceManyFromSource(BASE_DOC, BASE_DOC); + + expect(deleteSpy).toHaveBeenCalledWith(BASE_DOC); + expect(insertSpy).toHaveBeenCalledWith(BASE_DOC); + }); + + it("calls deleteMany before insertManyFromSource", async () => { + const callOrder: string[] = []; + jest.spyOn(service, "deleteMany").mockImplementation(async () => { + callOrder.push("deleteMany"); + }); + jest + .spyOn(service, "insertManyFromSource") + .mockImplementation(async () => { + callOrder.push("insertManyFromSource"); + }); + + await service.replaceManyFromSource(BASE_DOC, BASE_DOC); + + expect(callOrder).toEqual(["deleteMany", "insertManyFromSource"]); + }); - const deleteSpy = 
jest.spyOn(service, "deleteMany"); - const insertSpy = jest.spyOn(service, "insertManyFromSource"); + it("net usageCount is zero for unchanged keys", async () => { + const doc = { + ...BASE_DOC, + metadata: { temperature: { human_name: "Temperature" } }, + }; - await service.replaceManyFromSource(doc); + await service.replaceManyFromSource(doc, doc); - expect(deleteSpy).toHaveBeenCalledWith({ - sourceId: "sid", - sourceType: "dataset", + const allOps = modelMock.bulkWrite.mock.calls.flatMap( + ([ops]: any) => ops, + ); + const totalUsageCountDelta = allOps.reduce( + (sum: number, op: any) => sum + op.updateOne.update.$inc.usageCount, + 0, + ); + expect(totalUsageCountDelta).toBe(0); }); - expect(insertSpy).toHaveBeenCalledWith(doc); }); }); diff --git a/src/metadata-keys/metadatakeys.service.ts b/src/metadata-keys/metadatakeys.service.ts index fc1b06076..ce2da4f5d 100644 --- a/src/metadata-keys/metadatakeys.service.ts +++ b/src/metadata-keys/metadatakeys.service.ts @@ -4,14 +4,7 @@ import { MetadataKeyClass, MetadataKeyDocument, } from "./schemas/metadatakey.schema"; -import { - DeleteResult, - FilterQuery, - HydratedDocument, - Model, - PipelineStage, - QueryOptions, -} from "mongoose"; +import { FilterQuery, Model, PipelineStage, QueryOptions } from "mongoose"; import { isEmpty } from "lodash"; import { addCreatedByFields, @@ -24,14 +17,40 @@ type ScientificMetadataEntry = { }; export type MetadataSourceDoc = { - sourceId: string; sourceType: string; - ownerGroup: string; - accessGroups: string[]; + userGroups: string[]; isPublished: boolean; metadata: Record; }; +// Recomputes the userGroups string array from userGroupCounts after a decrement. +// Retains only group names whose reference count is still above zero. 
+const RECOMPUTE_USER_GROUPS_STAGE = [ + { + $set: { + userGroupCounts: { + $arrayToObject: { + $filter: { + input: { $objectToArray: "$userGroupCounts" }, + cond: { $gt: ["$$this.v", 0] }, + }, + }, + }, + }, + }, + { + $set: { + userGroups: { + $map: { + input: { $objectToArray: "$userGroupCounts" }, + as: "entry", + in: "$$entry.k", + }, + }, + }, + }, +]; + @Injectable({ scope: Scope.REQUEST }) export class MetadataKeysService { constructor( @@ -45,10 +64,11 @@ export class MetadataKeysService { ): Promise { const whereFilter: FilterQuery = filter.where ?? {}; const fieldsProjection: string[] = filter.fields ?? {}; - const limits: QueryOptions = filter.limits ?? { - limit: 100, - skip: 0, - sort: { createdAt: "desc" }, + + const limits: QueryOptions = { + limit: filter.limits?.limit ?? 100, + skip: filter.limits?.skip ?? 0, + sort: filter.limits?.sort ?? { createdAt: "desc" }, }; const pipeline: PipelineStage[] = [ @@ -58,6 +78,7 @@ export class MetadataKeysService { }, }, ]; + if (!isEmpty(fieldsProjection)) { const projection = parsePipelineProjection(fieldsProjection); pipeline.push({ $project: projection }); @@ -79,59 +100,85 @@ export class MetadataKeysService { return data; } - async deleteMany( - filter: FilterQuery, - ): Promise<DeleteResult> { - const result = await this.metadataKeyModel.deleteMany(filter).exec(); + async insertManyFromSource(doc: MetadataSourceDoc): Promise<void> { + await this.adjustCounts(doc, 1); + } - Logger.log( - `MetadataKeys deleted: ${result.deletedCount ?? 
0} With filter:`, - { filter }, - ); + async deleteMany(doc: MetadataSourceDoc): Promise<void> { + await this.adjustCounts(doc, -1); + } - return result; + async replaceManyFromSource( + oldDoc: MetadataSourceDoc, + newDoc: MetadataSourceDoc, + ): Promise<void> { + await this.deleteMany(oldDoc); + await this.insertManyFromSource(newDoc); } - async insertManyFromSource( + private async adjustCounts( doc: MetadataSourceDoc, - ): Promise[] | void> { - if (isEmpty(doc.metadata)) { - return; - } - const userGroups = Array.from( - new Set([doc.ownerGroup, ...(doc.accessGroups ?? [])].filter(Boolean)), - ) as string[]; - - const isPublished = doc.isPublished; - - const metadata = doc.metadata ?? {}; - - const docs = Object.entries(metadata).map(([key, entry]) => { - const createMetadataKeyDto = { - _id: `${doc.sourceType}_${doc.sourceId}_${key}`, - id: `${doc.sourceType}_${doc.sourceId}_${key}`, - sourceType: doc.sourceType, - sourceId: doc.sourceId, - key, - userGroups, - isPublished, - humanReadableName: (entry as ScientificMetadataEntry).human_name ?? "", - }; - return addCreatedByFields(createMetadataKeyDto, "system"); + delta: 1 | -1, + ): Promise<void> { + if (isEmpty(doc.metadata)) return; + + const { sourceType, userGroups, isPublished, metadata } = doc; + + const filters = Object.entries(metadata).map(([key, entry]) => { + const humanReadableName = + (entry as ScientificMetadataEntry).human_name ?? 
""; + return { sourceType, key, humanReadableName }; }); + const queryFilter = { $or: filters }; + + const ops = filters.map(({ sourceType, key, humanReadableName }) => ({ + updateOne: { + filter: { sourceType, key, humanReadableName }, + update: { + $set: { + updatedAt: new Date(), + }, + $inc: { + usageCount: delta, + ...this.groupCountDeltas(userGroups, delta), + }, + ...(delta === 1 && { + $max: { isPublished }, + $addToSet: { userGroups: { $each: userGroups } }, + $setOnInsert: addCreatedByFields({}, "system"), + }), + }, + upsert: delta === 1, + }, + })); + + await this.metadataKeyModel.bulkWrite(ops); + + if (delta === -1) { + // UpdateMany is necessary here because the bulkWrite above only decrements userGroupCounts + // but cannot remove zero-count groups from userGroups in the same operation. + await this.metadataKeyModel.updateMany( + queryFilter, + RECOMPUTE_USER_GROUPS_STAGE, + ); + + await this.metadataKeyModel.deleteMany({ + $and: [queryFilter, { usageCount: { $lte: 0 } }], + }); + } + Logger.log( - `Created ${docs.length} MetadataKeys from source ${doc.sourceType} with ID ${doc.sourceId}`, + `${delta === 1 ? 
"Upserted" : "Decremented or deleted"} MetadataKeys for ${sourceType}: ${Object.keys(metadata).join(", ")}`, ); - - return await this.metadataKeyModel.insertMany(docs); } - async replaceManyFromSource(doc: MetadataSourceDoc): Promise { - await this.deleteMany({ - sourceId: doc.sourceId, - sourceType: doc.sourceType, - }); - await this.insertManyFromSource(doc); + private groupCountDeltas( + groups: string[], + delta: 1 | -1, + ): Record { + return Object.fromEntries( + groups.map((g) => [`userGroupCounts.${g}`, delta]), + ); } } diff --git a/src/metadata-keys/schemas/metadatakey.schema.ts b/src/metadata-keys/schemas/metadatakey.schema.ts index b4c6b6d39..bbf602d32 100644 --- a/src/metadata-keys/schemas/metadatakey.schema.ts +++ b/src/metadata-keys/schemas/metadatakey.schema.ts @@ -76,28 +76,40 @@ export class MetadataKeyClass extends QueryableClass { sourceType: string; @ApiProperty({ - type: String, + type: Boolean, required: true, - description: "Unique identifier of the source item this key is linked to.", + description: "Flag is true when data are made publicly available.", }) - @Prop({ - type: String, - required: true, - index: true, + @Prop({ type: Boolean, required: true, default: false }) + isPublished: boolean; + + @ApiProperty({ + type: Number, + description: + "Tracks how many sources are using this metadata key. Managed internally.", }) - sourceId: string; + @Prop({ type: Number, default: 0 }) + usageCount: number; @ApiProperty({ - type: Boolean, - required: true, - description: "Flag is true when data are made publicly available.", + type: Object, + description: + "Tracks how many datasets per user group reference this metadata key. " + + "Used to safely remove groups from userGroups when the last dataset " + + "contributing that group is deleted or updated. " + + "e.g. 
{ 'groupA': 3, 'groupB': 1 } means 3 datasets with groupA and 1 with groupB use this key.", }) - @Prop({ type: Boolean, required: true }) - isPublished: boolean; + @Prop({ type: Map, of: Number, default: {} }) + userGroupCounts: Map<string, number>; } export const MetadataKeySchema = SchemaFactory.createForClass(MetadataKeyClass); -MetadataKeySchema.index({ sourceType: 1, sourceId: 1 }); +MetadataKeySchema.index({ sourceType: 1, isPublished: 1, key: 1 }); +MetadataKeySchema.index({ + sourceType: 1, + isPublished: 1, + humanReadableName: 1, +}); MetadataKeySchema.index({ sourceType: 1, userGroups: 1, key: 1 }); MetadataKeySchema.index({ sourceType: 1, userGroups: 1, humanReadableName: 1 }); diff --git a/src/metadata-keys/types/metadatakeys-filter-content.ts b/src/metadata-keys/types/metadatakeys-filter-content.ts index b7a38afab..e9804e0b5 100644 --- a/src/metadata-keys/types/metadatakeys-filter-content.ts +++ b/src/metadata-keys/types/metadatakeys-filter-content.ts @@ -9,8 +9,8 @@ const FILTERS: Record<"limits" | "fields" | "where" | "include", object> = { type: "object", example: { sourceType: "dataset", - sourceId: "datasetId", key: "metadata_key_name", + humanReadableName: "Metadata Key Name", }, }, include: {}, diff --git a/src/proposals/proposals.service.ts b/src/proposals/proposals.service.ts index ec1f705c8..6dcdf02bd 100644 --- a/src/proposals/proposals.service.ts +++ b/src/proposals/proposals.service.ts @@ -9,13 +9,7 @@ import { import { REQUEST } from "@nestjs/core"; import { Request } from "express"; import { InjectModel } from "@nestjs/mongoose"; -import { - FilterQuery, - Model, - PipelineStage, - QueryOptions, - UpdateQuery, -} from "mongoose"; +import { FilterQuery, Model, PipelineStage, QueryOptions } from "mongoose"; import { IFacets, IFilters } from "src/common/interfaces/common.interface"; import { createFullfacetPipeline, @@ -26,6 +20,7 @@ import { parsePipelineSort, addCreatedByFields, addUpdatedByField, + createMetadataKeysInstance, } from "src/common/utils"; 
 import { isEmpty } from "lodash";
 import {
@@ -43,10 +38,7 @@ import { IProposalFields } from "./interfaces/proposal-filters.interface";
 import { ProposalClass, ProposalDocument } from "./schemas/proposal.schema";
 import { JWTUser } from "src/auth/interfaces/jwt-user.interface";
 import { CreateMeasurementPeriodDto } from "./dto/create-measurement-period.dto";
-import {
-  MetadataKeysService,
-  MetadataSourceDoc,
-} from "src/metadata-keys/metadatakeys.service";
+import { MetadataKeysService } from "src/metadata-keys/metadatakeys.service";
 import { withOCCFilter } from "src/datasets/utils/occ-util";
 
 @Injectable({ scope: Scope.REQUEST })
@@ -177,20 +169,6 @@ export class ProposalsService {
     }
   }
 
-  private createMetadataKeysInstance(
-    doc: UpdateQuery,
-  ): MetadataSourceDoc {
-    const source: MetadataSourceDoc = {
-      sourceType: "proposal",
-      sourceId: doc.proposalId,
-      ownerGroup: doc.ownerGroup,
-      accessGroups: doc.accessGroups || [],
-      isPublished: doc.isPublished || false,
-      metadata: doc.metadata ?? {},
-    };
-    return source;
-  }
-
   async create(createProposalDto: CreateProposalDto): Promise {
     const username = (this.request.user as JWTUser).username;
     if (createProposalDto.MeasurementPeriodList) {
@@ -208,7 +186,10 @@ export class ProposalsService {
     const savedProposal = await createdProposal.save();
 
     this.metadataKeysService.insertManyFromSource(
-      this.createMetadataKeysInstance(savedProposal),
+      createMetadataKeysInstance(
+        this.proposalModel.collection.name,
+        savedProposal,
+      ),
     );
     return savedProposal;
   }
@@ -289,6 +270,13 @@ export class ProposalsService {
     unmodifiedSince?: Date,
   ): Promise {
     const username = (this.request.user as JWTUser).username;
+    const existingProposal = await this.proposalModel.findOne(filter).exec();
+
+    if (!existingProposal) {
+      throw new NotFoundException(
+        `Proposal not found with filter: ${JSON.stringify(filter)}`,
+      );
+    }
 
     const filterQuery = withOCCFilter(filter, unmodifiedSince);
 
@@ -319,7 +307,14 @@ export class ProposalsService {
     }
 
     await this.metadataKeysService.replaceManyFromSource(
-      this.createMetadataKeysInstance(updatedProposal),
+      createMetadataKeysInstance(
+        this.proposalModel.collection.name,
+        existingProposal,
+      ),
+      createMetadataKeysInstance(
+        this.proposalModel.collection.name,
+        updatedProposal,
+      ),
     );
 
     return updatedProposal;
@@ -336,10 +331,12 @@ export class ProposalsService {
       );
     }
 
-    this.metadataKeysService.deleteMany({
-      sourceType: "proposal",
-      sourceId: deletedProposal.proposalId,
-    });
+    await this.metadataKeysService.deleteMany(
+      createMetadataKeysInstance(
+        this.proposalModel.collection.name,
+        deletedProposal,
+      ),
+    );
 
     return deletedProposal;
   }
diff --git a/src/samples/samples.service.ts b/src/samples/samples.service.ts
index f212ef2c9..4bc6d6580 100644
--- a/src/samples/samples.service.ts
+++ b/src/samples/samples.service.ts
@@ -9,7 +9,7 @@ import { ConfigService } from "@nestjs/config";
 import { REQUEST } from "@nestjs/core";
 import { Request } from "express";
 import { InjectModel } from "@nestjs/mongoose";
-import { FilterQuery, Model, QueryOptions, UpdateQuery } from "mongoose";
+import { FilterQuery, Model, QueryOptions } from "mongoose";
 import { JWTUser } from "src/auth/interfaces/jwt-user.interface";
 import { IFilters } from "src/common/interfaces/common.interface";
 import {
@@ -19,6 +19,7 @@ import {
   extractMetadataKeys,
   parseLimitFilters,
   decodeMetadataKeyStrings,
+  createMetadataKeysInstance,
 } from "src/common/utils";
 import { CreateSampleDto } from "./dto/create-sample.dto";
 import { PartialUpdateSampleDto } from "./dto/update-sample.dto";
@@ -26,10 +27,7 @@ import { ISampleFields } from "./interfaces/sample-filters.interface";
 import { SampleClass, SampleDocument } from "./schemas/sample.schema";
 import { CountApiResponse } from "src/common/types";
 import { OutputSampleDto } from "./dto/output-sample.dto";
-import {
-  MetadataKeysService,
-  MetadataSourceDoc,
-} from "src/metadata-keys/metadatakeys.service";
+import { MetadataKeysService } from "src/metadata-keys/metadatakeys.service";
 import { withOCCFilter } from "src/datasets/utils/occ-util";
 
 @Injectable({ scope: Scope.REQUEST })
@@ -41,20 +39,6 @@ export class SamplesService {
     @Inject(REQUEST) private request: Request,
   ) {}
 
-  private createMetadataKeysInstance(
-    doc: UpdateQuery,
-  ): MetadataSourceDoc {
-    const source: MetadataSourceDoc = {
-      sourceType: "sample",
-      sourceId: doc.sampleId,
-      ownerGroup: doc.ownerGroup,
-      accessGroups: doc.accessGroups || [],
-      isPublished: doc.isPublished || false,
-      metadata: doc.sampleCharacteristics ?? {},
-    };
-    return source;
-  }
-
   async create(createSampleDto: CreateSampleDto): Promise {
     const username = (this.request.user as JWTUser).username;
     const createdSample = new this.sampleModel(
@@ -63,7 +47,7 @@ export class SamplesService {
     const savedSample = await createdSample.save();
 
     this.metadataKeysService.insertManyFromSource(
-      this.createMetadataKeysInstance(savedSample),
+      createMetadataKeysInstance(this.sampleModel.collection.name, savedSample),
     );
 
     return savedSample;
@@ -185,6 +169,14 @@ export class SamplesService {
     unmodifiedSince?: Date,
   ): Promise {
     const username = (this.request.user as JWTUser).username;
+    const existingSample = await this.sampleModel.findOne(filter).exec();
+
+    if (!existingSample) {
+      throw new NotFoundException(
+        `Sample not found with filter: ${JSON.stringify(filter)}`,
+      );
+    }
+
     const updateData = addUpdatedByField(updateSampleDto, username);
 
     const updateDataMongoose = {
@@ -215,7 +207,14 @@ export class SamplesService {
     }
 
     await this.metadataKeysService.replaceManyFromSource(
-      this.createMetadataKeysInstance(updatedSample),
+      createMetadataKeysInstance(
+        this.sampleModel.collection.name,
+        existingSample,
+      ),
+      createMetadataKeysInstance(
+        this.sampleModel.collection.name,
+        updatedSample,
+      ),
     );
 
     return updatedSample;
@@ -232,10 +231,12 @@ export class SamplesService {
       );
     }
 
-    this.metadataKeysService.deleteMany({
-      sourceType: "sample",
-      sourceId: deletedSample.sampleId,
-    });
+    this.metadataKeysService.deleteMany(
+      createMetadataKeysInstance(
+        this.sampleModel.collection.name,
+        deletedSample,
+      ),
+    );
     return deletedSample;
   }
 }
diff --git a/test/AttachmentV4.js b/test/AttachmentV4.js
index 33980efab..0cd93da1a 100644
--- a/test/AttachmentV4.js
+++ b/test/AttachmentV4.js
@@ -3,7 +3,6 @@ const assert = require("node:assert");
 const utils = require("./LoginUtils");
 const { TestData } = require("./TestData");
 const { v4: uuidv4 } = require("uuid");
-const request = require("supertest");
 
 let accessTokenAdminIngestor = null,
   accessTokenUser1 = null,
diff --git a/test/DatasetV4.js b/test/DatasetV4.js
index 28ef2000c..76ef9089d 100644
--- a/test/DatasetV4.js
+++ b/test/DatasetV4.js
@@ -1332,10 +1332,12 @@ describe("2500: Datasets v4 tests", () => {
             unit: "mg",
             valueSI: 0.0006,
             unitSI: "kg",
+            human_name: "Pressure SI",
           });
           res.body.scientificMetadata.with_number.should.deep.eq({
             value: 111,
             unit: "",
+            human_name: "Sample Number",
           });
           res.body.datasetlifecycle.should.have
             .property("storageLocation")
@@ -1349,6 +1351,7 @@ describe("2500: Datasets v4 tests", () => {
           with_unit_and_value_si: {
             value: -2,
             unit: "km",
+            human_name: "new human name",
           },
         },
       };
@@ -1370,6 +1373,7 @@ describe("2500: Datasets v4 tests", () => {
             unit: "km",
             valueSI: -2000,
             unitSI: "m",
+            human_name: "new human name",
           });
         });
     });
@@ -1382,6 +1386,7 @@ describe("2500: Datasets v4 tests", () => {
             unit: "cm",
             valueSI: null,
             unitSI: null,
+            human_name: "new human name",
           },
           with_number: null,
         },
@@ -1408,6 +1413,7 @@ describe("2500: Datasets v4 tests", () => {
             unit: "cm",
             valueSI: -0.02,
             unitSI: "m",
+            human_name: "new human name",
           });
           res.body.scientificMetadata.should.not.have.property("with_number");
         });
@@ -1591,6 +1597,7 @@ describe("2500: Datasets v4 tests", () => {
             unit: "cm",
             valueSI: 555,
             unitSI: "cmcm",
+            human_name: "new human name",
           },
         },
       };
@@ -1611,6 +1618,7 @@ describe("2500: Datasets v4 tests", () => {
             unit: "cm",
             valueSI: 0.22,
             unitSI: "m",
+            human_name: "new human name",
           });
         });
     });
diff --git a/test/MetadataKeys.js b/test/MetadataKeys.js
index c62e117f0..e8e7ef0f7 100644
--- a/test/MetadataKeys.js
+++ b/test/MetadataKeys.js
@@ -2,173 +2,586 @@ const utils = require("./LoginUtils");
 const { TestData } = require("./TestData");
 
-let accessTokenAdminIngestor = null;
+let accessTokenAdmin = null;
+let accessTokenArchiveManager = null;
 let accessTokenUser1 = null;
-let datasetIdPrivate = null;
-let datasetIdUser1 = null;
 
-describe("MetadataKeys v4 ACL", () => {
+let pidGroup1 = null; // ownerGroup: group1, not published
+let pidGroup2 = null; // ownerGroup: group2, not published
+let pidPublished = null; // ownerGroup: group2, isPublished: true
+
+// ---------------------------------------------------------------------------
+// Dataset fixtures — controlled ownerGroup for predictable access control
+// ---------------------------------------------------------------------------
+const datasetGroup1 = {
+  ...TestData.DatasetWithScientificMetadataV4,
+  ownerGroup: "group1",
+  accessGroups: [],
+  isPublished: false,
+};
+
+const datasetGroup2 = {
+  ...TestData.DatasetWithScientificMetadataV4,
+  ownerGroup: "group2",
+  accessGroups: ["group2"],
+  isPublished: false,
+  // extra key exclusive to group2 so access control tests are unambiguous
+  scientificMetadata: {
+    ...TestData.DatasetWithScientificMetadataV4.scientificMetadata,
+    group2_only_key: { value: "exclusive to group2" },
+  },
+};
+
+const datasetPublished = {
+  ...TestData.DatasetWithScientificMetadataV4,
+  ownerGroup: "group2",
+  accessGroups: ["group2"],
+  isPublished: true,
+};
+
+// ---------------------------------------------------------------------------
+// Helpers
+// ---------------------------------------------------------------------------
+async function getMetadataKeys(filter, token) {
+  const req = request(appUrl)
+    .get(
+      `/api/v4/metadatakeys?filter=${encodeURIComponent(JSON.stringify(filter))}`,
+    )
+    .set("Accept", "application/json");
+
+  if (token) {
+    req.set({ Authorization: `Bearer ${token}` });
+  }
+
+  return req
+    .expect(TestData.SuccessfulGetStatusCode)
+    .expect("Content-Type", /json/);
+}
+
+async function createDataset(dataset) {
+  const res = await request(appUrl)
+    .post("/api/v4/datasets")
+    .send(dataset)
+    .set("Accept", "application/json")
+    .set({ Authorization: `Bearer ${accessTokenAdmin}` })
+    .expect(TestData.EntryCreatedStatusCode);
+  return res.body.pid;
+}
+
+async function patchDataset(pid, payload) {
+  return request(appUrl)
+    .patch("/api/v4/datasets/" + encodeURIComponent(pid))
+    .send(payload)
+    .set("Accept", "application/json")
+    .set({ Authorization: `Bearer ${accessTokenAdmin}` })
+    .expect(TestData.SuccessfulGetStatusCode);
+}
+
+async function deleteDataset(pid) {
+  return request(appUrl)
+    .delete("/api/v4/datasets/" + encodeURIComponent(pid))
+    .set("Accept", "application/json")
+    .set({ Authorization: `Bearer ${accessTokenArchiveManager}` })
+    .expect(TestData.SuccessfulDeleteStatusCode);
+}
+
+describe("2000: MetadataKeys: Access Control and Search", () => {
   before(async () => {
-    db.collection("MetadataKeys").deleteMany({});
+    await db.collection("Dataset").deleteMany({});
+    await db.collection("MetadataKeys").deleteMany({});
 
-    accessTokenAdminIngestor = await utils.getToken(appUrl, {
+    accessTokenAdmin = await utils.getToken(appUrl, {
       username: "adminIngestor",
       password: TestData.Accounts["adminIngestor"]["password"],
     });
 
+    accessTokenArchiveManager = await utils.getToken(appUrl, {
+      username: "archiveManager",
+      password: TestData.Accounts["archiveManager"]["password"],
+    });
+
     accessTokenUser1 = await utils.getToken(appUrl, {
       username: "user1",
       password: TestData.Accounts["user1"]["password"],
     });
+
+    pidGroup1 = await createDataset(datasetGroup1);
+    pidGroup2 = await createDataset(datasetGroup2);
+    pidPublished = await createDataset(datasetPublished);
   });
 
-  it("0000: create a private dataset v3 has scientific metadata for admin", async () => {
-    return request(appUrl)
-      .post("/api/v3/Datasets")
-      .send(TestData.RawCorrectRandom)
-      .set("Accept", "application/json")
-      .set({ Authorization: `Bearer ${accessTokenAdminIngestor}` })
-      .expect("Content-Type", /json/)
-      .expect(TestData.EntryCreatedStatusCode)
-      .then((res) => {
-        res.body.should.have
-          .property("datasetName")
-          .and.equal(TestData.RawCorrectRandom.datasetName);
-
-        datasetIdPrivate = encodeURIComponent(res.body["pid"]);
-      });
-  });
-
-  it("0001: create a public dataset v4 has scientific metadata for unauthenticated user", async () => {
-    const publicDataset = { ...TestData.RawCorrectV4, isPublished: true };
-    return request(appUrl)
-      .post("/api/v4/Datasets")
-      .send(publicDataset)
-      .set("Accept", "application/json")
-      .set({ Authorization: `Bearer ${accessTokenAdminIngestor}` })
-      .expect("Content-Type", /json/)
-      .expect(TestData.EntryCreatedStatusCode)
-      .then((res) => {
-        res.body.should.have
-          .property("datasetName")
-          .and.equal(publicDataset.datasetName);
-      });
-  });
-
-  it("0002: create a private dataset v4 has scientific metadata for user1", async () => {
-    const user1Dataset = { ...TestData.RawCorrectV4, accessGroups: ["group1"] };
-    return request(appUrl)
-      .post("/api/v4/Datasets")
-      .send(user1Dataset)
-      .set("Accept", "application/json")
-      .set({ Authorization: `Bearer ${accessTokenAdminIngestor}` })
-      .expect("Content-Type", /json/)
-      .expect(TestData.EntryCreatedStatusCode)
-      .then((res) => {
-        res.body.should.have
-          .property("datasetName")
-          .and.equal(user1Dataset.datasetName);
-
-        datasetIdUser1 = encodeURIComponent(res.body["pid"]);
-      });
-  });
-
-  it("0010: should allow admin to list all metadata keys", async () => {
-    const filter = {
-      limits: { limit: 10, skip: 0, sort: { createdAt: "desc" } },
-    };
+  after(async () => {
+    for (const pid of [pidGroup1, pidGroup2, pidPublished]) {
+      if (pid) await deleteDataset(pid);
+    }
+  });
+
+  // -------------------------------------------------------------------------
+  // Basic access
+  // -------------------------------------------------------------------------
+
+  it("0100: admin can fetch metadata keys and gets results from all groups", async () => {
+    const res = await getMetadataKeys(
+      { where: { sourceType: "Dataset" } },
+      accessTokenAdmin,
+    );
+
+    res.body.should.be.an("array").and.have.length.greaterThan(0);
+
+    const allGroups = res.body.flatMap((k) => k.userGroups);
+    allGroups.should.include("group1");
+    allGroups.should.include("group2");
+  });
+
+  it("0110: user1 can fetch metadata keys and only sees group1 keys", async () => {
+    const res = await getMetadataKeys(
+      { where: { sourceType: "Dataset" } },
+      accessTokenUser1,
+    );
+
+    res.body.should.be.an("array").and.have.length.greaterThan(0);
+    res.body.forEach((k) => {
+      k.userGroups.should.include("group1");
+    });
+  });
+
+  it("0120: user1 cannot see keys exclusive to group2", async () => {
+    const res = await getMetadataKeys(
+      { where: { sourceType: "Dataset", key: "group2_only_key" } },
+      accessTokenUser1,
+    );
+
+    res.body.should.be.an("array").and.have.lengthOf(0);
+  });
+
+  it("0130: unauthenticated user can only see published metadata keys", async () => {
+    const res = await getMetadataKeys(
+      { where: { sourceType: "Dataset" } },
+      null,
+    );
+
+    res.body.should.be.an("array");
+    res.body.forEach((k) => {
+      k.isPublished.should.equal(true);
+    });
+  });
+
+  it("0140: unauthenticated user cannot see unpublished keys", async () => {
+    const res = await getMetadataKeys(
+      { where: { sourceType: "Dataset", isPublished: false } },
+      null,
+    );
+
+    res.body.should.be.an("array").and.have.lengthOf(0);
+  });
+
+  // -------------------------------------------------------------------------
+  // Search — key
+  // -------------------------------------------------------------------------
+
+  it("0200: admin can search keys by exact key name", async () => {
+    const res = await getMetadataKeys(
+      { where: { sourceType: "Dataset", key: "with_number" } },
+      accessTokenAdmin,
+    );
+
+    res.body.should.be.an("array");
+    res.body.forEach((k) => k.key.should.equal("with_number"));
+  });
+
+  it("0210: admin can search keys by partial key name using $regex", async () => {
+    const res = await getMetadataKeys(
+      {
+        where: {
+          sourceType: "Dataset",
+          key: { $regex: "with_", $options: "i" },
+        },
+      },
+      accessTokenAdmin,
+    );
+
+    res.body.should.be.an("array").and.have.length.greaterThan(0);
+    res.body.forEach((k) => k.key.should.match(/with_/i));
+  });
+
+  it("0220: regex search returns no results for non-matching pattern", async () => {
+    const res = await getMetadataKeys(
+      {
+        where: {
+          sourceType: "Dataset",
+          key: { $regex: "nonexistent_xyz", $options: "i" },
+        },
+      },
+      accessTokenAdmin,
+    );
+
+    res.body.should.be.an("array").and.have.lengthOf(0);
+  });
+
+  it("0230: user1 regex search only returns keys from accessible groups", async () => {
+    const res = await getMetadataKeys(
+      {
+        where: {
+          sourceType: "Dataset",
+          key: { $regex: "with_", $options: "i" },
+        },
+      },
+      accessTokenUser1,
+    );
+
+    res.body.should.be.an("array");
+    res.body.forEach((k) => k.userGroups.should.include("group1"));
+  });
+
+  // -------------------------------------------------------------------------
+  // Search — humanReadableName
+  // -------------------------------------------------------------------------
+
+  it("0240: admin can search by humanReadableName using $regex", async () => {
+    const res = await getMetadataKeys(
+      {
+        where: {
+          sourceType: "Dataset",
+          humanReadableName: { $regex: "pressure", $options: "i" },
+        },
+      },
+      accessTokenAdmin,
+    );
+
+    res.body.should.be.an("array").and.have.length.greaterThan(0);
+    res.body.forEach((k) => k.humanReadableName.should.match(/pressure/i));
+  });
+
+  it("0250: humanReadableName regex search returns empty for no match", async () => {
+    const res = await getMetadataKeys(
+      {
+        where: {
+          sourceType: "Dataset",
+          humanReadableName: { $regex: "nonexistent_xyz", $options: "i" },
+        },
+      },
+      accessTokenAdmin,
+    );
+
+    res.body.should.be.an("array").and.have.lengthOf(0);
+  });
+
+  it("0260: keys without human_name have empty humanReadableName", async () => {
+    const res = await getMetadataKeys(
+      { where: { sourceType: "Dataset", key: "with_key_value" } },
+      accessTokenAdmin,
+    );
+
+    res.body.should.be.an("array").and.have.length.greaterThan(0);
+    res.body[0].humanReadableName.should.equal("");
+  });
+
+  // -------------------------------------------------------------------------
+  // Pagination
+  // -------------------------------------------------------------------------
+
+  it("0300: limit is respected", async () => {
+    const res = await getMetadataKeys(
+      { where: { sourceType: "Dataset" }, limits: { limit: 2, skip: 0 } },
+      accessTokenAdmin,
+    );
+
+    res.body.should.be.an("array").and.have.length.at.most(2);
+  });
+
+  it("0310: skip is respected", async () => {
+    const resAll = await getMetadataKeys(
+      { where: { sourceType: "Dataset" }, limits: { limit: 100, skip: 0 } },
+      accessTokenAdmin,
+    );
+    const resSkipped = await getMetadataKeys(
+      { where: { sourceType: "Dataset" }, limits: { limit: 100, skip: 1 } },
+      accessTokenAdmin,
+    );
+
+    resSkipped.body.length.should.equal(resAll.body.length - 1);
+  });
+
+  // -------------------------------------------------------------------------
+  // Fields projection
+  // -------------------------------------------------------------------------
 
-    return request(appUrl)
-      .get(
-        `/api/v4/metadatakeys?filter=${encodeURIComponent(JSON.stringify(filter))}`,
-      )
-      .set("Accept", "application/json")
-      .set({ Authorization: `Bearer ${accessTokenAdminIngestor}` })
-      .expect(TestData.SuccessfulGetStatusCode)
-      .expect("Content-Type", /json/)
-      .then((res) => {
-        res.body.should.be.an("array").that.satisfies((arr) => {
-          const values = arr.map((item) => item.isPublished);
-          return values.includes(true) && values.includes(false);
-        });
-      });
-  });
-
-  it("0020: should allow unauthenticated user to list only published metadata keys", async () => {
-    const filter = {
-      where: { sourceType: "dataset" },
-      limits: { limit: 10, skip: 0, sort: { createdAt: "desc" } },
+  it("0400: fields projection returns only requested fields", async () => {
+    const res = await getMetadataKeys(
+      {
+        where: { sourceType: "Dataset" },
+        fields: ["key", "sourceType"],
+        limits: { limit: 5, skip: 0 },
+      },
+      accessTokenAdmin,
+    );
+
+    res.body.should.be.an("array").and.have.length.greaterThan(0);
+    res.body.forEach((k) => {
+      k.should.have.property("key");
+      k.should.have.property("sourceType");
+      k.should.not.have.property("userGroups");
+      k.should.not.have.property("userGroupCounts");
+      k.should.not.have.property("usageCount");
+    });
+  });
+
+  // -------------------------------------------------------------------------
+  // Response shape
+  // -------------------------------------------------------------------------
+
+  it("0500: response documents have expected shape", async () => {
+    const res = await getMetadataKeys(
+      { where: { sourceType: "Dataset" }, limits: { limit: 1, skip: 0 } },
+      accessTokenAdmin,
+    );
+
+    res.body.should.be.an("array").and.have.length.greaterThan(0);
+
+    const doc = res.body[0];
+    doc.should.have.property("key").and.be.a("string");
+    doc.should.have.property("sourceType").and.equal("Dataset");
+    doc.should.have.property("humanReadableName").and.be.a("string");
+    doc.should.have.property("userGroups").and.be.an("array");
+    doc.should.have.property("isPublished").and.be.a("boolean");
+    doc.should.have.property("usageCount").and.be.a("number");
+  });
+
+  it("0510: usageCount reflects number of datasets using the key", async () => {
+    // with_number exists in all 3 datasets created in before()
+    const res = await getMetadataKeys(
+      { where: { sourceType: "Dataset", key: "with_number" } },
+      accessTokenAdmin,
+    );
+
+    res.body.should.be.an("array").and.have.length.greaterThan(0);
+    res.body[0].usageCount.should.equal(3);
+  });
+
+  // -------------------------------------------------------------------------
+  // Mutation tests
+  // -------------------------------------------------------------------------
+
+  it("0600: updating human_name renames the MetadataKey (delete old + insert new)", async () => {
+    const pid = await createDataset({
+      ...TestData.DatasetWithScientificMetadataV4,
+      ownerGroup: "group1",
+      accessGroups: [],
+      isPublished: false,
+      scientificMetadata: {
+        rename_test_key: { value: 1, human_name: "Original Name" },
+      },
+    });
+
+    // Old key exists
+    const before = await getMetadataKeys(
+      {
+        where: {
+          sourceType: "Dataset",
+          key: "rename_test_key",
+          humanReadableName: "Original Name",
+        },
+      },
+      accessTokenAdmin,
+    );
+    before.body.should.have.lengthOf(1);
+
+    await patchDataset(pid, {
+      scientificMetadata: {
+        rename_test_key: { value: 1, human_name: "Renamed Name" },
+      },
+    });
+
+    // Old key gone
+    const afterOld = await getMetadataKeys(
+      {
+        where: {
+          sourceType: "Dataset",
+          key: "rename_test_key",
+          humanReadableName: "Original Name",
+        },
+      },
+      accessTokenAdmin,
+    );
+    afterOld.body.should.have.lengthOf(0);
+
+    // New key exists
+    const afterNew = await getMetadataKeys(
+      {
+        where: {
+          sourceType: "Dataset",
+          key: "rename_test_key",
+          humanReadableName: "Renamed Name",
+        },
+      },
+      accessTokenAdmin,
+    );
+    afterNew.body.should.have.lengthOf(1);
+    afterNew.body[0].usageCount.should.equal(1);
+
+    await deleteDataset(pid);
+  });
+
+  it("0610: removing a key from scientificMetadata decrements usageCount", async () => {
+    const pidA = await createDataset({
+      ...TestData.DatasetWithScientificMetadataV4,
+      ownerGroup: "group1",
+      accessGroups: [],
+      isPublished: false,
+      scientificMetadata: { shared_key: { value: 1 } },
+    });
+
+    const pidB = await createDataset({
+      ...TestData.DatasetWithScientificMetadataV4,
+      ownerGroup: "group1",
+      accessGroups: [],
+      isPublished: false,
+      scientificMetadata: { shared_key: { value: 2 } },
+    });
+
+    const filterKey = { where: { sourceType: "Dataset", key: "shared_key" } };
+
+    // usageCount starts at 2
+    const before = await getMetadataKeys(filterKey, accessTokenAdmin);
+    before.body.should.have.lengthOf(1);
+    before.body[0].usageCount.should.equal(2);
+
+    // Remove key from datasetA
+    await patchDataset(pidA, { scientificMetadata: {} });
+
+    // usageCount drops to 1, key still exists
+    const after = await getMetadataKeys(filterKey, accessTokenAdmin);
+    after.body.should.have.lengthOf(1);
+    after.body[0].usageCount.should.equal(1);
+
+    await deleteDataset(pidA);
+    await deleteDataset(pidB);
+  });
+
+  it("0620: removing a key from the last dataset deletes the MetadataKey entirely", async () => {
+    const pid = await createDataset({
+      ...TestData.DatasetWithScientificMetadataV4,
+      ownerGroup: "group1",
+      accessGroups: [],
+      isPublished: false,
+      scientificMetadata: { sole_key: { value: 42 } },
+    });
+
+    const filterKey = { where: { sourceType: "Dataset", key: "sole_key" } };
+
+    const before = await getMetadataKeys(filterKey, accessTokenAdmin);
+    before.body.should.have.lengthOf(1);
+    before.body[0].usageCount.should.equal(1);
+
+    await patchDataset(pid, { scientificMetadata: {} });
+
+    const after = await getMetadataKeys(filterKey, accessTokenAdmin);
+    after.body.should.have.lengthOf(0);
+
+    await deleteDataset(pid);
+  });
+
+  it("0630: adding a new key to scientificMetadata creates a new MetadataKey", async () => {
+    const pid = await createDataset({
+      ...TestData.DatasetWithScientificMetadataV4,
+      ownerGroup: "group1",
+      accessGroups: [],
+      isPublished: false,
+      scientificMetadata: { original_key: { value: 1 } },
+    });
+
+    const filterNewKey = {
+      where: { sourceType: "Dataset", key: "brand_new_key" },
     };
 
-    return request(appUrl)
-      .get(
-        `/api/v4/metadatakeys?filter=${encodeURIComponent(JSON.stringify(filter))}`,
-      )
-      .set("Accept", "application/json")
-      .expect(TestData.SuccessfulGetStatusCode)
-      .expect("Content-Type", /json/)
-      .then((res) => {
-        res.body.should.be.an("array").that.satisfies((arr) => {
-          return arr.every((item) => item.isPublished === true);
-        });
-      });
-  });
-
-  it("0030: should allow authenticated user to list metadata keys they have access", async () => {
-    const filter = { limits: { limit: 1, skip: 0 } };
-
-    return request(appUrl)
-      .get(
-        `/api/v4/metadatakeys?filter=${encodeURIComponent(JSON.stringify(filter))}`,
-      )
-      .set("Accept", "application/json")
-      .set({ Authorization: `Bearer ${accessTokenUser1}` })
-      .expect(TestData.SuccessfulGetStatusCode)
-      .expect("Content-Type", /json/)
-      .then((res) => {
-        res.body.should.be.an("array").that.satisfies((arr) => {
-          return arr.every((item) => {
-            return (
-              item.isPublished === true || item.userGroups.includes("group1")
-            );
-          });
-        });
-      });
-  });
-
-  it("0040: should return empty array when user queries keys they don't have access to", async () => {
-    const filter = {
-      where: { sourceType: "dataset", sourceId: datasetIdPrivate },
-      limits: { limit: 10, skip: 0 },
+    // Key does not exist yet
+    const before = await getMetadataKeys(filterNewKey, accessTokenAdmin);
+    before.body.should.have.lengthOf(0);
+
+    await patchDataset(pid, {
+      scientificMetadata: {
+        original_key: { value: 1 },
+        brand_new_key: { value: 99, human_name: "Brand New Key" },
+      },
+    });
+
+    const after = await getMetadataKeys(filterNewKey, accessTokenAdmin);
+    after.body.should.have.lengthOf(1);
+    after.body[0].key.should.equal("brand_new_key");
+    after.body[0].humanReadableName.should.equal("Brand New Key");
+    after.body[0].usageCount.should.equal(1);
+
+    await deleteDataset(pid);
+  });
+
+  it("0640: changing ownerGroup updates userGroups — old group removed, new group added", async () => {
+    const pid = await createDataset({
+      ...TestData.DatasetWithScientificMetadataV4,
+      ownerGroup: "group1",
+      accessGroups: [],
+      isPublished: false,
+      scientificMetadata: { group_change_key: { value: 1 } },
+    });
+
+    const filterKey = {
+      where: { sourceType: "Dataset", key: "group_change_key" },
     };
 
-    return request(appUrl)
-      .get(
-        `/api/v4/metadatakeys?filter=${encodeURIComponent(JSON.stringify(filter))}`,
-      )
-      .set("Accept", "application/json")
-      .set({ Authorization: `Bearer ${accessTokenUser1}` })
-      .expect(TestData.SuccessfulGetStatusCode)
-      .expect("Content-Type", /json/)
-      .then((res) => {
-        res.body.should.be.an("array").and.have.length(0);
-      });
-  });
-
-  it("0040: should return metadatakeys with correct access for user1", async () => {
-    const filter = {
-      where: { sourceType: "dataset", sourceId: `${datasetIdUser1}` },
-      limits: { limit: 10, skip: 0 },
+    const before = await getMetadataKeys(filterKey, accessTokenAdmin);
+    before.body.should.have.lengthOf(1);
+    before.body[0].userGroups.should.include("group1");
+    before.body[0].userGroups.should.not.include("group2");
+
+    await patchDataset(pid, { ownerGroup: "group2" });
+
+    const after = await getMetadataKeys(filterKey, accessTokenAdmin);
+    after.body.should.have.lengthOf(1);
+    after.body[0].userGroups.should.include("group2");
+    after.body[0].userGroups.should.not.include("group1");
+
+    // user1 (group1) can no longer see this key
+    const user1Res = await getMetadataKeys(filterKey, accessTokenUser1);
+    user1Res.body.should.have.lengthOf(0);
+
+    await deleteDataset(pid);
+  });
+
+  it("0650: deleting a dataset decrements usageCount — deletes MetadataKey when it reaches 0", async () => {
+    const pidA = await createDataset({
+      ...TestData.DatasetWithScientificMetadataV4,
+      ownerGroup: "group1",
+      accessGroups: [],
+      isPublished: false,
+      scientificMetadata: { delete_test_key: { value: 1 } },
+    });
+
+    const pidB = await createDataset({
+      ...TestData.DatasetWithScientificMetadataV4,
+      ownerGroup: "group1",
+      accessGroups: [],
+      isPublished: false,
+      scientificMetadata: { delete_test_key: { value: 2 } },
+    });
+
+    const filterKey = {
+      where: { sourceType: "Dataset", key: "delete_test_key" },
     };
 
-    return request(appUrl)
-      .get(`/api/v4/metadatakeys?filter=${JSON.stringify(filter)}`)
-      .set("Accept", "application/json")
-      .set({ Authorization: `Bearer ${accessTokenUser1}` })
-      .expect(TestData.SuccessfulGetStatusCode)
-      .expect("Content-Type", /json/)
-      .then((res) => {
-        res.body.should.be.an("array").and.have.length.of.at.least(1);
-      });
+    const before = await getMetadataKeys(filterKey, accessTokenAdmin);
+    before.body.should.have.lengthOf(1);
+    before.body[0].usageCount.should.equal(2);
+
+    // Delete first dataset — usageCount drops to 1, key still exists
+    await deleteDataset(pidA);
+
+    const afterFirst = await getMetadataKeys(filterKey, accessTokenAdmin);
+    afterFirst.body.should.have.lengthOf(1);
+    afterFirst.body[0].usageCount.should.equal(1);
+
+    // Delete second dataset — usageCount hits 0, MetadataKey removed
+    await deleteDataset(pidB);
+
+    const afterSecond = await getMetadataKeys(filterKey, accessTokenAdmin);
+    afterSecond.body.should.have.lengthOf(0);
   });
 });
diff --git a/test/TestData.js b/test/TestData.js
index 204c8b1be..3c6d9a823 100644
--- a/test/TestData.js
+++ b/test/TestData.js
@@ -1437,10 +1437,12 @@ const TestData = {
         unit: "mbar l/s/cm^2",
         valueSI: 100000,
         unitSI: "kg / s^3",
+        human_name: "Pressure SI",
       },
       with_number: {
         value: 111,
         unit: "",
+        human_name: "Sample Number",
       },
       with_string: {
         value: "222",
@@ -1464,12 +1466,12 @@ const TestData = {
       with_unit_and_value_si: {
         value: 100,
         unit: "mbar l/s/cm^2",
-        valueSI: 100000,
-        unitSI: "kg / s^3",
+        human_name: "Pressure SI",
       },
       with_number: {
         value: 111,
         unit: "",
+        human_name: "Sample Number",
       },
       with_string: {
         value: "222",