Make RestoreTask resettable across builds (NuGet/Home#14958 item 1)#7507
Open
JanProvaznik wants to merge 9 commits into
Open
Make RestoreTask resettable across builds (NuGet/Home#14958 item 1)#7507JanProvaznik wants to merge 9 commits into
JanProvaznik wants to merge 9 commits into
Conversation
A small public keyed registry: RegisterResetAction(ResetKey, Action) accumulates resets under a key and Reset(ResetKey) runs them (best-effort, isolated). Keys: StartRestore (refresh env-derived caches) and EndRestore (tear down live resources). This is the only new public API; each cache/resource registers its own internal reset, so nothing else becomes public. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
RestoreTask.Execute calls Reset(StartRestore) before restore and Reset(EndRestore) in a finally. RestoreTask runs once per restore on the entry node, so these are direct calls with no guard. Makes a process reused across builds (MSBuild Server) behave as if started fresh. No-op until caches register (following commits). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
ExceptionLogger (NUGET_SHOW_STACK), ConcurrencyUtilities (lock base path) and NuGetEnvironment (home/temp/folder path caches) self-register a reset from their static constructors. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
ProxyCache.Instance is rebuilt so a reused process re-reads the http_proxy family of env vars / nuget.config and drops cached proxy credentials. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
PreviewFeatureSettings re-reads its env flag; DefaultCredentialServiceUtility discards the pinned credential service so a reused process rebuilds it for the next restore instead of reusing the first build's interactivity/providers/cached credentials. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Re-reads its env-derived hash-function flag. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
NuGetFeatureFlags, NuGetTestMode and PackageIdValidator re-read their env flags; HttpSourceResourceProvider.Throttle (the per-restore concurrency semaphore) is cleared so it does not leak into the next restore. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…ore) Recreates the NUGET_CONCURRENCY_LIMIT-sized semaphore from the current environment. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
PluginManager registers a private ResetSharedInstance under EndRestore that disposes the shared instance, killing credential/plugin child processes and their timers that the per-build process exit used to reclaim. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Closed
4 tasks
Author
|
@OvesN @AR-May — requesting your review on this one (item 1 of NuGet/Home#14958: make |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Bug
Progress: NuGet/Home#14958 (Plan item 1 — Make RestoreTask resettable)
Description
MSBuild is moving to an execution model where the entry-point process persists across builds (MSBuild Server, and later the multithreaded server).
RestoreTaskhas always relied on the process dying after each build to drop its cached environment-variable-derived state and to reclaim the credential/plugin child processes. In a reused process that state would leak between restores — stale env-derived caches, a credential service pinned to the first build's credentials, and lingering plugin processes.This PR implements item 1 of NuGet/Home#14958: make
RestoreTaskexplicitly reset that state — refresh env-derived caches at the start of restore and tear down live resources (plugins) at the end — so a reused process behaves as if it had started fresh.Approach — small public registry, internal self-registration
The only new public API is a tiny keyed registry in NuGet.Common:
Each cache/resource self-registers its own internal reset from its static constructor, so the individual resets stay
internal/privateand nothing else becomes public.RestoreTask.Executedrives the lifecycle:Reset(StartRestore)before restore,Reset(EndRestore)in afinally.RestoreTaskruns once per restore on the entry node, so these are direct calls with no guard.Reviewable commit-by-commit
NuGetProcessStateregistry (NuGet.Common) — the infrastructure + unit tests + the single PublicAPI.Unshipped entry.RestoreTask—Reset(StartRestore)at start,Reset(EndRestore)infinally. (No-op until caches register, below.)ExceptionLogger(NUGET_SHOW_STACK),ConcurrencyUtilities(lock base path),NuGetEnvironment(home/temp/folder-path caches).ProxyCache(http_proxyfamily + nuget.config; drops cached proxy creds).PreviewFeatureSettings;DefaultCredentialServiceUtility(discards the pinned credential service so the next restore rebuilds it instead of reusing the first build's interactivity/providers/cached credentials).DependencyGraphSpec(env-derived hash-function flag).NuGetFeatureFlags,NuGetTestMode,PackageIdValidator,HttpSourceResourceProvider.Throttle(per-restore concurrency semaphore).SourceRepositoryDependencyProviderthrottle (NUGET_CONCURRENCY_LIMIT).PluginManagerdisposes the shared instance (kills credential/plugin child processes and their timers) underEndRestore.Completeness audit
Cross-checked against a whole-program, cross-assembly reachability analysis from the restore task entry points (13 assemblies, ~7,267 reachable methods). Every environment-variable-derived cached static reachable from
RestoreTaskis reset atStartRestore; the live plugin/credential processes are torn down atEndRestore. Items intentionally not reset, with rationale:UserAgent.UserAgentStringandX509TrustStoreare both re-initialized unconditionally inBuildTasksUtility.RestoreAsyncevery restore.RuntimeEnvironmentHelper.*(OS / process-name flags),PathUtility._isFileSystemCaseInsensitive, framework-mapping tables, resource managers, comparers, JSON serializers, assembly-version singletons,NuGetExtractionFileIO._unixPermissions(umask).GraphOperations.Cache,RuntimeGraph.Cache.X509ChainBuildPolicyFactory.Policy(NUGET_EXPERIMENTAL_CHAIN_BUILD_RETRY_POLICY).How the audit was performed (cross-assembly reachability analyzer)
The "every reachable static" claim above is not eyeballed — it comes from a purpose-built Roslyn analyzer: RestoreStateAnalyzer (README · full report.md).
A normal
DiagnosticAnalyzeronly sees one compilation and only metadata for referenced assemblies, so it cannot follow a call fromRestoreTask(inNuGet.Build.Tasks) into the body ofNuGet.Protocol,NuGet.Credentials, etc. To be accurate across assemblies, the tool instead:NuGet.Build.Tasksfor net10.0 (the CoreCLRdotnet restoreruntime) with/bl, and reads the exact csc command lines from the binlog — so source files, references,#defines (IS_CORECLR, …),nullableandlangversionmatch the shipping build exactly.CSharpCompilationper NuGet project, wiring inter-project references as RoslynCompilationReferences (symbols resolve into the source of every assembly) and running the same source generators. Result: 13 assemblies, 0 binding errors.IOperationtrees — following invocations / object creations / property & event accesses / static initializers, expanding virtual & interface dispatch to all source overrides transitively.Lazy/[ThreadStatic]/ …) and every lingering process-state sink (child processes, env edits/reads, timers, static event subscriptions, cwd/console/culture/registry/AppDomain mutations), each annotated with the set of restore tasks that reach it and a representative call path.The audit table above is the result of diffing that enumeration (filtered to
RestoreTask-reachable, environment-derived statics) against the set ofRegisterResetActioncall sites in this PR. The same report drove the e2e validation target list below.End-to-end validation
The combined behavior (this NuGet change + the matching MSBuild change that runs
RestoreTaskin the persistent process) was validated end-to-end on a coordinated VMR build using the two-PAT credential-leak scenario from dotnet/msbuild#13660: restore #1 with PAT-A against a private Azure Artifacts feed, revoke PAT-A, restore #2 with PAT-B in the same MSBuild Server process. Onmainrestore #2 fails withNU1301: 401(the revoked PAT leaks from the cached plugin); with these resets restore #2 succeeds with PAT-B.RestoreTaskwas confirmed (via binlog) to run in-process in the persistent server, so the leak-prone path is genuinely exercised.PR Checklist
dotnet restore/buildshould work correctly when its MSBuild tasks run repeatedly and in the same process Home#14958; reviewable commit-by-commit.NuGetProcessStateTests(registry runs-all / failure-isolation, and re-read afterReset(StartRestore));ProxyCacheTests.ResetCache,HttpSourceResourceProviderTests.ResetThrottle,DefaultCredentialServiceUtilityTests.ResetCredentialService. FullNuGet.Build.Tasks.Testsuite green; build clean on net472 + net10.0.NuGet.Common.NuGetProcessState(the registry); every per-type reset is internal/private.