feat: implement nuget registry worker [CM-1276] #4265
Draft
mbani01 wants to merge 3 commits into
Signed-off-by: Mouad BANI <mouad-mb@outlook.com>
Contributor
Pull request overview
This PR adds first-class NuGet package enrichment to the packages_worker service, including a new Temporal worker entrypoint, NuGet API client + normalization pipeline, DAL upsert helpers, and a schema change to store NuGet total download counts.
Changes:
- Added a NuGet worker (schedule, workflow, activities, CLI triggers) to ingest NuGet package metadata into `packages-db`.
- Implemented NuGet registry client + normalization and wired it into a batch enrichment loop with maintainers, versions, and download snapshot handling.
- Added `packages.total_downloads` via a migration and exposed NuGet DAL helpers via the data-access-layer index.
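To make the "batch enrichment loop" item concrete, here is a minimal sketch of one way such a loop could bound its parallelism. It is not the PR's implementation: `chunk` and `processInGroups` are hypothetical names, and only the parameter names `concurrency` and `groupDelayMs` echo the configuration keys shown later in this review.

```typescript
/** Split `items` into consecutive groups of at most `size` elements. */
function chunk<T>(items: readonly T[], size: number): T[][] {
  if (size < 1 || Math.floor(size) !== size) {
    throw new RangeError(`size must be a positive integer, got ${size}`)
  }
  const groups: T[][] = []
  for (let i = 0; i < items.length; i += size) {
    groups.push(items.slice(i, i + size))
  }
  return groups
}

const sleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms))

/**
 * Run `handler` over `items`, at most `concurrency` at a time, pausing
 * `groupDelayMs` between groups. Results keep the input order.
 * Note: Promise.all rejects as soon as any handler rejects, so a real loop
 * would catch per-item errors inside `handler`.
 */
async function processInGroups<T, R>(
  items: readonly T[],
  concurrency: number,
  groupDelayMs: number,
  handler: (item: T) => Promise<R>,
): Promise<R[]> {
  const results: R[] = []
  const groups = chunk(items, concurrency)
  let index = 0
  for (const group of groups) {
    // Items within a group run in parallel; the next group waits for this one.
    const groupResults = await Promise.all(group.map(handler))
    results.push(...groupResults)
    index++
    if (groupDelayMs > 0 && index < groups.length) {
      await sleep(groupDelayMs)
    }
  }
  return results
}
```

With a group size of 20 and a delay of 0, this would issue up to 20 registry requests at once and start the next group as soon as the slowest request in the current group finishes.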
Reviewed changes
Copilot reviewed 18 out of 19 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| services/libs/data-access-layer/src/osspckgs/nuget.ts | New DAL helpers for listing NuGet packages to sync, upserting package/version metadata, and recording download snapshots. |
| services/libs/data-access-layer/src/index.ts | Exports NuGet DAL module. |
| services/apps/packages_worker/src/workflows/index.ts | Exposes NuGet ingestion workflow from the worker’s workflow index. |
| services/apps/packages_worker/src/scripts/triggerOsvSync.ts | Adds a manual script to trigger the OSV sync workflow. |
| services/apps/packages_worker/src/scripts/triggerNuGetSync.ts | Adds a manual script to trigger NuGet ingestion. |
| services/apps/packages_worker/src/nuget/workflows.ts | NuGet Temporal workflow that continues-as-new while work remains. |
| services/apps/packages_worker/src/nuget/types.ts | NuGet config + API/normalization types and helpers. |
| services/apps/packages_worker/src/nuget/schedule.ts | Registers a daily Temporal Schedule for NuGet ingestion. |
| services/apps/packages_worker/src/nuget/runNuGetEnrichmentLoop.ts | Implements the NuGet batch processing/enrichment loop. |
| services/apps/packages_worker/src/nuget/normalize.ts | Normalizes NuGet search/registration responses into internal package/version shapes. |
| services/apps/packages_worker/src/nuget/client.ts | Resolves NuGet service index endpoints and fetches search/registration data. |
| services/apps/packages_worker/src/nuget/activities.ts | Temporal activity wrapper for batch processing. |
| services/apps/packages_worker/src/config.ts | Adds NuGet worker configuration (batch size, concurrency, delays, UA, critical-only). |
| services/apps/packages_worker/src/bin/nuget-worker.ts | Worker entrypoint to init service, register schedule, and start. |
| services/apps/packages_worker/src/activities.ts | Exports processNuGetBatch activity. |
| services/apps/packages_worker/package.json | Adds start/dev scripts for the new nuget-worker. |
| scripts/services/nuget-worker.yaml | Docker compose service definitions for NuGet worker (prod/dev). |
| scripts/builders/packages.env | Adds nuget-worker to packaged services list. |
| backend/src/osspckgs/migrations/V1782345600__nuget_total_downloads.sql | Adds packages.total_downloads column for NuGet download counts. |
Comment on lines +88 to +91

    if (registrationResult.kind === 'RATE_LIMIT') {
      log.warn({ purl: pkg.purl }, 'Rate limited by NuGet registry — will retry next pass')
      return 'error'
    }

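The snippet above branches on a `kind` field. As a sketch of how such a result could be modeled, the following uses a discriminated union and maps an HTTP status to a kind. Only the `'RATE_LIMIT'` kind appears in the PR; the other kinds, the `FetchResult` type, and `classifyStatus` are hypothetical, and only the delta-seconds form of `Retry-After` is handled.

```typescript
// Hypothetical result type: only 'RATE_LIMIT' is taken from the PR snippet.
type FetchResult<T> =
  | { kind: 'OK'; data: T }
  | { kind: 'NOT_FOUND' }
  | { kind: 'RATE_LIMIT'; retryAfterMs?: number }
  | { kind: 'ERROR'; message: string }

/**
 * Map an HTTP status (plus optional body and Retry-After header) to a result.
 * HTTP 429 (Too Many Requests) is treated as a rate limit.
 */
function classifyStatus<T>(
  status: number,
  data: T | undefined,
  retryAfterHeader?: string,
): FetchResult<T> {
  if (status >= 200 && status < 300 && data !== undefined) {
    return { kind: 'OK', data }
  }
  if (status === 404) {
    return { kind: 'NOT_FOUND' }
  }
  if (status === 429) {
    // Retry-After may also be an HTTP date; this sketch ignores that form.
    const seconds = parseInt(retryAfterHeader ?? '', 10)
    return isNaN(seconds)
      ? { kind: 'RATE_LIMIT' }
      : { kind: 'RATE_LIMIT', retryAfterMs: seconds * 1000 }
  }
  return { kind: 'ERROR', message: `unexpected HTTP status ${status}` }
}
```

Keeping the rate-limit case as its own kind lets the caller decide separately whether to count it as an error, skip it, or back off before the next pass.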
Comment on lines +224 to +228

    } catch (err) {
      const message = err instanceof Error ? err.message : String(err)
      log.error({ purl: pkg.purl, error: message }, 'Unexpected error processing package')
      counts.error++
    }

Comment on lines +73 to +80

    export function getNuGetConfig() {
      return {
        batchSize: parseInt(process.env.NUGET_FETCHER_BATCH_SIZE ?? '1000', 10),
        concurrency: parseInt(process.env.NUGET_FETCHER_CONCURRENCY ?? '20', 10),
        groupDelayMs: parseInt(process.env.NUGET_FETCHER_GROUP_DELAY_MS ?? '0', 10),
        isCritical: (process.env.NUGET_FETCHER_IS_CRITICAL ?? 'false') === 'true',
      }
    }

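One thing to keep in mind with the config reader above is that `parseInt` returns `NaN` for a non-numeric value such as `abc`, and `??` only falls back when the variable is unset. The sketch below shows one possible guard. `parseIntOr` and `readNuGetConfig` are hypothetical names, and passing the environment in as a parameter is an assumption made so the function can be exercised without touching `process.env`.

```typescript
/**
 * Parse a base-10 integer from a possibly missing or malformed string.
 * Returns `fallback` when parsing yields NaN or a value below `min`.
 */
function parseIntOr(raw: string | undefined, fallback: number, min = 0): number {
  const parsed = parseInt(raw ?? '', 10)
  return !isNaN(parsed) && parsed >= min ? parsed : fallback
}

// Hypothetical variant of the config reader using the guard above.
// The defaults (1000, 20, 0, false) mirror the PR snippet.
function readNuGetConfig(env: Record<string, string | undefined>) {
  return {
    batchSize: parseIntOr(env['NUGET_FETCHER_BATCH_SIZE'], 1000, 1),
    concurrency: parseIntOr(env['NUGET_FETCHER_CONCURRENCY'], 20, 1),
    groupDelayMs: parseIntOr(env['NUGET_FETCHER_GROUP_DELAY_MS'], 0, 0),
    isCritical: (env['NUGET_FETCHER_IS_CRITICAL'] ?? 'false') === 'true',
  }
}
```

Under this sketch, a malformed batch size or a concurrency of `0` falls back to the default instead of propagating `NaN` or a non-positive value into the batch loop.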
Comment on lines +82 to +85

    headers: {
      'Accept-Encoding': 'gzip',
    },
    timeout: 15000,

Comment on lines +101 to +104

    const resp = await axios.get<NuGetRegistrationPage>(pageId, {
      headers: { 'Accept-Encoding': 'gzip' },
      timeout: 15000,
    })

Comment on lines +117 to +120

    {
      headers: { 'Accept-Encoding': 'gzip' },
      timeout: 15000,
    },

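The three snippets above repeat the same headers and timeout. As a sketch of how those options could be built in one place, the helper below returns a plain object with `headers` and `timeout` fields, which are standard axios request options. `buildRequestOptions`, `RequestOptions`, and the optional `userAgent` parameter are hypothetical. The file summary mentions a configurable UA, but how the PR applies it is not shown here.

```typescript
// Hypothetical shape covering only the two options used in the snippets.
interface RequestOptions {
  headers: Record<string, string>
  timeout: number
}

// Timeout value taken from the snippets above (milliseconds).
const NUGET_TIMEOUT_MS = 15000

/** Build the shared request options, adding a User-Agent only when given. */
function buildRequestOptions(userAgent?: string): RequestOptions {
  const headers: Record<string, string> = { 'Accept-Encoding': 'gzip' }
  if (userAgent) {
    headers['User-Agent'] = userAgent
  }
  return { headers, timeout: NUGET_TIMEOUT_MS }
}
```

Each call returns a fresh object, so a caller can add per-request headers without affecting other requests.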
Comment on lines +45 to +49

    export function normalizeNuGetPackage(
      packageId: string,
      searchResult: NuGetSearchItem | null,
      registration: NuGetRegistrationIndex,
    ): NormalizedNuGetPackage {

This pull request introduces support for NuGet package enrichment in the `packages_worker` service. It adds a new NuGet worker with all necessary scripts, configuration, and logic to fetch, normalize, and store NuGet package metadata, including download counts and maintainers. Additionally, it updates the database schema to support tracking total downloads for packages.

NuGet worker integration and orchestration:
- Added the `nuget-worker` service, including Docker Compose configuration (`scripts/services/nuget-worker.yaml`), and updated the build and service scripts to include the NuGet worker (`scripts/builders/packages.env`, `services/apps/packages_worker/package.json`).
- Included the `processNuGetBatch` activity in the worker's activities export (`services/apps/packages_worker/src/activities.ts`).

NuGet enrichment logic:
- Batch processing/enrichment loop and its Temporal activity wrapper (`services/apps/packages_worker/src/nuget/runNuGetEnrichmentLoop.ts`, `services/apps/packages_worker/src/nuget/activities.ts`).
- NuGet service index resolution and search/registration fetching (`services/apps/packages_worker/src/nuget/client.ts`).
- Normalization of NuGet search/registration responses into internal package/version shapes (`services/apps/packages_worker/src/nuget/normalize.ts`).

Database schema changes:
- Added a `total_downloads` column to the `packages` table to track NuGet download counts (`backend/src/osspckgs/migrations/V1782345600__nuget_total_downloads.sql`).

Configuration:
- NuGet worker configuration: batch size, concurrency, delays, UA, critical-only (`services/apps/packages_worker/src/config.ts`).