feat: implement nuget registry worker [CM-1276] #4265

Draft
mbani01 wants to merge 3 commits into main from feat/nuget-worker

Conversation

mbani01 commented Jun 25, 2026

This pull request introduces support for NuGet package enrichment in the packages_worker service. It adds a new NuGet worker with all necessary scripts, configuration, and logic to fetch, normalize, and store NuGet package metadata, including download counts and maintainers. Additionally, it updates the database schema to support tracking total downloads for packages.

NuGet worker integration and orchestration:

  • Added a new nuget-worker service with its Docker Compose configuration (scripts/services/nuget-worker.yaml), and updated the build and service scripts to include the NuGet worker (scripts/builders/packages.env, services/apps/packages_worker/package.json).
  • Registered the processNuGetBatch activity in the worker's activities export (services/apps/packages_worker/src/activities.ts).

NuGet enrichment logic:

  • Implemented the NuGet batch enrichment loop, which fetches package data from NuGet, normalizes it, and upserts it into the database, with handling for maintainers, downloads, and error cases (services/apps/packages_worker/src/nuget/runNuGetEnrichmentLoop.ts, services/apps/packages_worker/src/nuget/activities.ts).
  • Added a NuGet API client for resolving endpoints, fetching search and registration data, and handling errors (services/apps/packages_worker/src/nuget/client.ts).
  • Added normalization logic to convert NuGet registry data into the internal package format (services/apps/packages_worker/src/nuget/normalize.ts).
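
To make the "resolving endpoints" step concrete, here is a minimal sketch of picking an endpoint out of a NuGet V3 service index. It is not the code in client.ts: the interface names, the resolveEndpoint helper, and the preference order passed to it are assumptions for illustration. The only thing taken from the public NuGet API is that the service index lists resources carrying an `@id` URL and an `@type` string.

```typescript
// Hypothetical, trimmed-down shape of a NuGet V3 service index document.
interface ServiceIndexResource {
  '@id': string
  '@type': string
}

interface ServiceIndex {
  version: string
  resources: ServiceIndexResource[]
}

// Walks the caller's preferred @type values in order and returns the URL of
// the first resource that matches one of them, or null when none is present.
function resolveEndpoint(index: ServiceIndex, preferredTypes: string[]): string | null {
  for (const wanted of preferredTypes) {
    for (const resource of index.resources) {
      if (resource['@type'] === wanted) {
        return resource['@id']
      }
    }
  }
  return null
}
```

A caller might ask for a versioned registration resource first and fall back to the unversioned one, for example `resolveEndpoint(index, ['RegistrationsBaseUrl/3.6.0', 'RegistrationsBaseUrl'])`.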

Database schema changes:

  • Added a total_downloads column to the packages table to track NuGet download counts (backend/src/osspckgs/migrations/V1782345600__nuget_total_downloads.sql).

Configuration:

  • Added NuGet-specific configuration options for batch size, concurrency, delays, and user agent (services/apps/packages_worker/src/config.ts).

Signed-off-by: Mouad BANI <mouad-mb@outlook.com>
@mbani01 mbani01 self-assigned this Jun 25, 2026
Copilot AI review requested due to automatic review settings June 25, 2026 12:33
@CLAassistant

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.
You have signed the CLA already but the status is still pending? Let us recheck it.

Copilot AI left a comment

Pull request overview

This PR adds first-class NuGet package enrichment to the packages_worker service, including a new Temporal worker entrypoint, NuGet API client + normalization pipeline, DAL upsert helpers, and a schema change to store NuGet total download counts.

Changes:

  • Added a NuGet worker (schedule, workflow, activities, CLI triggers) to ingest NuGet package metadata into packages-db.
  • Implemented NuGet registry client + normalization and wired it into a batch enrichment loop with maintainers, versions, and download snapshot handling.
  • Added packages.total_downloads via a migration and exposed NuGet DAL helpers via the data-access-layer index.

Reviewed changes

Copilot reviewed 18 out of 19 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| services/libs/data-access-layer/src/osspckgs/nuget.ts | New DAL helpers for listing NuGet packages to sync, upserting package/version metadata, and recording download snapshots. |
| services/libs/data-access-layer/src/index.ts | Exports NuGet DAL module. |
| services/apps/packages_worker/src/workflows/index.ts | Exposes NuGet ingestion workflow from the worker’s workflow index. |
| services/apps/packages_worker/src/scripts/triggerOsvSync.ts | Adds a manual script to trigger the OSV sync workflow. |
| services/apps/packages_worker/src/scripts/triggerNuGetSync.ts | Adds a manual script to trigger NuGet ingestion. |
| services/apps/packages_worker/src/nuget/workflows.ts | NuGet Temporal workflow that continues-as-new while work remains. |
| services/apps/packages_worker/src/nuget/types.ts | NuGet config + API/normalization types and helpers. |
| services/apps/packages_worker/src/nuget/schedule.ts | Registers a daily Temporal Schedule for NuGet ingestion. |
| services/apps/packages_worker/src/nuget/runNuGetEnrichmentLoop.ts | Implements the NuGet batch processing/enrichment loop. |
| services/apps/packages_worker/src/nuget/normalize.ts | Normalizes NuGet search/registration responses into internal package/version shapes. |
| services/apps/packages_worker/src/nuget/client.ts | Resolves NuGet service index endpoints and fetches search/registration data. |
| services/apps/packages_worker/src/nuget/activities.ts | Temporal activity wrapper for batch processing. |
| services/apps/packages_worker/src/config.ts | Adds NuGet worker configuration (batch size, concurrency, delays, UA, critical-only). |
| services/apps/packages_worker/src/bin/nuget-worker.ts | Worker entrypoint to init service, register schedule, and start. |
| services/apps/packages_worker/src/activities.ts | Exports processNuGetBatch activity. |
| services/apps/packages_worker/package.json | Adds start/dev scripts for the new nuget-worker. |
| scripts/services/nuget-worker.yaml | Docker compose service definitions for NuGet worker (prod/dev). |
| scripts/builders/packages.env | Adds nuget-worker to packaged services list. |
| backend/src/osspckgs/migrations/V1782345600__nuget_total_downloads.sql | Adds packages.total_downloads column for NuGet download counts. |

Comment thread services/libs/data-access-layer/src/osspckgs/nuget.ts
Comment thread services/libs/data-access-layer/src/osspckgs/nuget.ts
Comment thread services/libs/data-access-layer/src/osspckgs/nuget.ts
Comment thread services/libs/data-access-layer/src/osspckgs/nuget.ts
mbani01 added 2 commits June 25, 2026 14:31
Signed-off-by: Mouad BANI <mouad-mb@outlook.com>
Signed-off-by: Mouad BANI <mouad-mb@outlook.com>
Copilot AI review requested due to automatic review settings June 25, 2026 13:41

Copilot AI left a comment

Pull request overview

Copilot reviewed 18 out of 19 changed files in this pull request and generated 7 comments.

Comment on lines +88 to +91
if (registrationResult.kind === 'RATE_LIMIT') {
  log.warn({ purl: pkg.purl }, 'Rate limited by NuGet registry — will retry next pass')
  return 'error'
}
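
The hunk above branches on a result whose kind is 'RATE_LIMIT'. As a rough illustration of how such a tagged result could be produced from an HTTP response, here is a sketch. Only the 'RATE_LIMIT' tag appears in the PR; the other kinds, the FetchResult type, the classifyResponse helper, and the Retry-After handling are assumptions and not the code in client.ts.

```typescript
// Hypothetical result union; only 'RATE_LIMIT' is taken from the quoted hunk.
type FetchResult<T> =
  | { kind: 'OK'; data: T }
  | { kind: 'NOT_FOUND' }
  | { kind: 'RATE_LIMIT'; retryAfterMs: number | null }
  | { kind: 'ERROR'; status: number }

// Maps an HTTP status (plus an optional Retry-After header value) to a tagged
// result, so callers can branch on `kind` instead of catching exceptions.
function classifyResponse<T>(
  status: number,
  data: T | null,
  retryAfterHeader?: string,
): FetchResult<T> {
  if (status >= 200 && status < 300 && data !== null) {
    return { kind: 'OK', data }
  }
  if (status === 404) {
    return { kind: 'NOT_FOUND' }
  }
  if (status === 429) {
    // Retry-After may be a number of seconds or an HTTP date; only the
    // numeric form is handled here, anything else yields null.
    let retryAfterMs: number | null = null
    if (retryAfterHeader !== undefined && retryAfterHeader.trim() !== '') {
      const seconds = Number(retryAfterHeader)
      if (Number.isFinite(seconds) && seconds >= 0) {
        retryAfterMs = seconds * 1000
      }
    }
    return { kind: 'RATE_LIMIT', retryAfterMs }
  }
  return { kind: 'ERROR', status }
}
```

Keeping the rate-limit case as its own tag lets a caller decide separately whether to count it as an error, as the quoted hunk does, or to track it under a distinct outcome.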
Comment on lines +224 to +228
} catch (err) {
  const message = err instanceof Error ? err.message : String(err)
  log.error({ purl: pkg.purl, error: message }, 'Unexpected error processing package')
  counts.error++
}
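
For context on where a catch block like this sits, the sketch below shows one way a batch can be processed in fixed-size groups while tallying outcomes. It is an illustration under assumptions, not the loop in runNuGetEnrichmentLoop.ts: the Outcome values other than 'error', the chunk and processInGroups helpers, and the choice to pause only between groups are all hypothetical.

```typescript
// Hypothetical outcome labels; only 'error' appears in the quoted hunks.
type Outcome = 'ok' | 'skipped' | 'error'

// Splits a list into consecutive groups of at most `size` items.
function chunk<T>(items: T[], size: number): T[][] {
  if (!Number.isInteger(size) || size < 1) {
    throw new RangeError('size must be a positive integer')
  }
  const groups: T[][] = []
  for (let i = 0; i < items.length; i += size) {
    groups.push(items.slice(i, i + size))
  }
  return groups
}

// Runs `handler` over each group concurrently, counts outcomes, and treats a
// thrown error as one 'error' so a single bad package cannot fail the batch.
async function processInGroups<T>(
  items: T[],
  concurrency: number,
  groupDelayMs: number,
  handler: (item: T) => Promise<Outcome>,
): Promise<Record<Outcome, number>> {
  const counts: Record<Outcome, number> = { ok: 0, skipped: 0, error: 0 }
  const groups = chunk(items, concurrency)
  for (let g = 0; g < groups.length; g++) {
    const outcomes = await Promise.all(
      groups[g].map(async (item): Promise<Outcome> => {
        try {
          return await handler(item)
        } catch {
          return 'error'
        }
      }),
    )
    for (const outcome of outcomes) {
      counts[outcome]++
    }
    // Pause between groups only, not after the last one.
    if (groupDelayMs > 0 && g < groups.length - 1) {
      await new Promise<void>((resolve) => setTimeout(resolve, groupDelayMs))
    }
  }
  return counts
}
```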
Comment on lines +73 to +80
export function getNuGetConfig() {
  return {
    batchSize: parseInt(process.env.NUGET_FETCHER_BATCH_SIZE ?? '1000', 10),
    concurrency: parseInt(process.env.NUGET_FETCHER_CONCURRENCY ?? '20', 10),
    groupDelayMs: parseInt(process.env.NUGET_FETCHER_GROUP_DELAY_MS ?? '0', 10),
    isCritical: (process.env.NUGET_FETCHER_IS_CRITICAL ?? 'false') === 'true',
  }
}
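
One property of the hunk above is that parseInt returns NaN when a variable is set to a non-numeric string, because the `??` default only applies when the variable is missing. The sketch below shows one possible way to fall back to the defaults in that case. It is not a change proposed in this PR: the Env type, the intFromEnv helper, the getNuGetConfigSafe name, and the minimum bounds are assumptions. The variable names and default values are copied from the hunk.

```typescript
// Environment passed in explicitly so the function is easy to exercise.
type Env = Record<string, string | undefined>

// Parses an integer variable, returning `fallback` when it is missing, blank,
// not a number, or below `min`.
function intFromEnv(env: Env, name: string, fallback: number, min: number): number {
  const raw = env[name]
  if (raw === undefined || raw.trim() === '') {
    return fallback
  }
  const parsed = Number.parseInt(raw, 10)
  return Number.isNaN(parsed) || parsed < min ? fallback : parsed
}

// Hypothetical variant of getNuGetConfig with the same keys and defaults.
function getNuGetConfigSafe(env: Env) {
  return {
    batchSize: intFromEnv(env, 'NUGET_FETCHER_BATCH_SIZE', 1000, 1),
    concurrency: intFromEnv(env, 'NUGET_FETCHER_CONCURRENCY', 20, 1),
    groupDelayMs: intFromEnv(env, 'NUGET_FETCHER_GROUP_DELAY_MS', 0, 0),
    isCritical: (env['NUGET_FETCHER_IS_CRITICAL'] ?? 'false') === 'true',
  }
}
```

In the worker this would presumably be called as `getNuGetConfigSafe(process.env)`.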
Comment on lines +82 to +85
headers: {
  'Accept-Encoding': 'gzip',
},
timeout: 15000,
Comment on lines +101 to +104
const resp = await axios.get<NuGetRegistrationPage>(pageId, {
  headers: { 'Accept-Encoding': 'gzip' },
  timeout: 15000,
})
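
The call above fetches a registration page by its id. In the public NuGet registration API, a registration index can either inline a page's leaves in an `items` array or omit them, in which case the page has to be fetched from its `@id` URL. The sketch below separates the two cases. The interfaces are trimmed-down assumptions and splitPages is a hypothetical helper, not part of client.ts.

```typescript
// Hypothetical, trimmed-down registration shapes.
interface RegistrationLeaf {
  '@id': string
}

interface RegistrationPage {
  '@id': string
  count: number
  items?: RegistrationLeaf[]
}

interface RegistrationIndex {
  count: number
  items: RegistrationPage[]
}

// Separates pages whose leaves are already inlined from pages that still
// need a follow-up request to their '@id' URL.
function splitPages(index: RegistrationIndex): {
  inline: RegistrationPage[]
  toFetch: string[]
} {
  const inline: RegistrationPage[] = []
  const toFetch: string[] = []
  for (const page of index.items) {
    if (Array.isArray(page.items)) {
      inline.push(page)
    } else {
      toFetch.push(page['@id'])
    }
  }
  return { inline, toFetch }
}
```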
Comment on lines +117 to +120
{
  headers: { 'Accept-Encoding': 'gzip' },
  timeout: 15000,
},
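
The same headers and timeout appear in each of the three quoted call sites. One way to keep them in a single place is a small options builder, sketched below. The RequestOptions shape, the buildRequestOptions helper, and the optional User-Agent parameter are assumptions. The PR description mentions a user agent setting, but none of the quoted hunks shows how it is applied.

```typescript
// Hypothetical options shape covering only the fields in the quoted hunks.
interface RequestOptions {
  headers: Record<string, string>
  timeout: number
}

// Timeout value copied from the quoted hunks.
const DEFAULT_TIMEOUT_MS = 15000

// Builds the shared options once; adds a User-Agent header only when a
// non-blank value is supplied.
function buildRequestOptions(
  userAgent?: string,
  timeoutMs: number = DEFAULT_TIMEOUT_MS,
): RequestOptions {
  const headers: Record<string, string> = { 'Accept-Encoding': 'gzip' }
  if (userAgent !== undefined && userAgent.trim() !== '') {
    headers['User-Agent'] = userAgent.trim()
  }
  return { headers, timeout: timeoutMs }
}
```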
Comment on lines +45 to +49
export function normalizeNuGetPackage(
  packageId: string,
  searchResult: NuGetSearchItem | null,
  registration: NuGetRegistrationIndex,
): NormalizedNuGetPackage {
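
Only the signature of normalizeNuGetPackage is quoted, so the body is not visible here. The sketch below shows what a normalization step with the same three inputs could look like. It is a guess at the general shape, not the code in normalize.ts: every interface, the normalizePackage name, and the output fields are hypothetical. The input field names (totalDownloads, owners, catalogEntry, listed, published) follow the public NuGet search and registration responses.

```typescript
// Hypothetical, trimmed-down input shapes.
interface SearchItem {
  id: string
  totalDownloads?: number
  owners?: string | string[]
}

interface CatalogEntry {
  id: string
  version: string
  listed?: boolean
  published?: string
}

interface RegLeaf {
  catalogEntry: CatalogEntry
}

interface RegPage {
  items?: RegLeaf[]
}

interface RegIndex {
  items: RegPage[]
}

// Hypothetical output shapes.
interface NormalizedVersion {
  version: string
  publishedAt: string | null
  listed: boolean
}

interface NormalizedPackage {
  name: string
  normalizedId: string
  totalDownloads: number | null
  maintainers: string[]
  versions: NormalizedVersion[]
}

function normalizePackage(
  packageId: string,
  search: SearchItem | null,
  registration: RegIndex,
): NormalizedPackage {
  // `owners` may be a single string or an array; a missing search hit
  // yields no maintainers and no download count.
  const owners = search === null ? undefined : search.owners
  const maintainers = owners === undefined ? [] : Array.isArray(owners) ? owners : [owners]

  const versions: NormalizedVersion[] = []
  for (const page of registration.items) {
    // Pages without inlined leaves are skipped; this sketch assumes they
    // were fetched and merged beforehand.
    for (const leaf of page.items ?? []) {
      const entry = leaf.catalogEntry
      versions.push({
        version: entry.version,
        publishedAt: entry.published ?? null,
        // An absent `listed` flag is treated as listed.
        listed: entry.listed ?? true,
      })
    }
  }

  return {
    name: packageId,
    // Package IDs are case-insensitive, so a lowercased form is kept too.
    normalizedId: packageId.toLowerCase(),
    totalDownloads:
      search !== null && typeof search.totalDownloads === 'number'
        ? search.totalDownloads
        : null,
    maintainers,
    versions,
  }
}
```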
3 participants