PEP 723 parser + silent detection telemetry #1555
Merged
- inlineScriptMetadata: switch BLOCK_RE consumption to text.matchAll() so the global regex's lastIndex is never shared/mutated across calls.
- inlineScriptLazyDetector: fix save-event coalescing bug by re-enqueuing a fresh read when a save races with an in-flight open (the previous cache may be stale).
- inlineScriptLazyDetector: add a disposed flag and guard processOnce against post-disposal project registration.
- inlineScriptDetector (creator): use uri.toString() for URI equality to match InlineScriptLazyDetector and avoid Windows drive-letter / trailing-separator divergence.
- inlineScriptDetector (creator): replace showErrorMessage with showInformationMessage for the four "no scripts found" toasts (informational, not an error).
- tests: replace the open+save coalescing test with two tests covering open-dedup and save-re-read separately; add a dispose-during-in-flight-read test; switch detector tests to stub showInformationMessage.

Strip user-facing PEP 723 inline-script-metadata wiring while keeping the lazy detector wired up as the planned telemetry ingest point.
- Remove the python-envs.useInlineScriptMetadata setting from package.json + package.nls.json.
- Delete the bulk-scan InlineScriptDetector project creator and its unit test.
- Drop the PEP 723 dependency-extraction branch from pipUtils.getProjectInstallable and its unit test.
- Remove isInlineScriptMetadataEnabled + setting constants from common/inlineScriptMetadata; keep the parser.
- Remove the inlineScriptMetadata field from PythonProjectsImpl.
- Remove the InlineScriptStrings localization namespace.
- Slim InlineScriptLazyDetector to a no-arg observer with a TODO(pep723-telemetry) marker; rewrite its unit tests for the new shape.
- Re-enable the lazy detector in extension.ts with an updated comment describing the telemetry-observer role.

Wires the existing silent inline-script-metadata detector to emit two anonymized telemetry events:
- PEP723.DETECTED: fires once per (URI, session) the first time a valid `# /// script` block is observed. Properties: `trigger` (open|save), `hasRequiresPython` (bool). Measure: `dependencyCount` (int). This is the denominator for the `how many users actually see PEP 723 files` question.
- PEP723.EDITED: fires once per (URI, session) the first time a previously-detected URI receives a non-empty content change. Measure: `duration` (ms since detection). Together with DETECTED this distinguishes viewers from editors.

No URIs, paths, dependency names, or version strings are sent. Adds an `onDidChangeTextDocument` wrapper to `workspace.apis.ts` so the detector can subscribe through the same abstraction layer used for open/save. Extends the detector unit tests from 16 to 26 cases covering both events, per-URI dedup, empty-contentChanges no-ops, and disposal suppression.

…ge.json reformat
- src/extension.ts: update the inline comment beside the `InlineScriptLazyDetector` activation site to describe its actual behavior (emits anonymized telemetry) instead of the stale `feature is not shipped` wording from the slim-down commit.
- package.json: revert an unintended multi-line re-pretty-print of `python-envs.workspaceSearchPaths.default`. The branch now has zero drift on package.json against upstream/main.
edvilme (Contributor) approved these changes on Jun 2, 2026 and left a comment:
One comment we can implement later, but looks good to me :)
Why we're doing this
PEP 723 ("inline script metadata") lets a single-file Python script declare its own runtime requirements in a TOML block at the top of the file, written as comment lines between a `# /// script` marker and a closing `# ///` line.

Runners like `uv run --script`, `pipx run`, and `hatch run` already read this block to set up an environment automatically. Today, `ms-python.vscode-python-envs` does nothing with it.

Before we commit to any of that work, we want to answer a basic question: how many users of this extension actually have PEP 723 scripts in their workspaces, and how many edit them vs. just open them?
The follow-on stages involve UX choices that are still being worked through in design review: environment backend (venv vs uv vs cache-discovery), persistence model (sibling `.venv` vs content-addressed cache), the trigger surface for env creation, and so on. Sizing the population first lets us prioritize honestly instead of building speculatively.

This PR is the measurement-only first slice: a parser plus a silent telemetry observer. There is no UI, no setting, no project registration, and no environment creation. The extension behaves exactly as it does today; the only change a user could observe is two new entries in their telemetry stream.
What changed and how it works
src/common/inlineScriptMetadata.ts: the parser

Pure-function code that every later stage will build on. Exports:

- `readInlineScriptMetadata(text)`: returns `requires-python`, `dependencies`, the opaque `[tool]` table, and the block's character offsets, or `undefined` for no block, a malformed block, multiple blocks (which the spec says MUST be an error), or TOML errors.
- `readInlineScriptMetadataFromFile(uri)`: reads via `fs.open` + `fileHandle.read`. Skips non-`file:` URIs. Swallows I/O errors.
- `matchesPythonVersion(specifier, version)`: supports `==`, `!=`, `>=`, `<=`, `>`, `<`, `~=`, `===`; comma-separated clauses combined with AND; `==3.12.*` wildcards.

Handles the edges that come up in real files: leading UTF-8 BOM (Windows), CRLF and lone-CR line endings (normalized before regex), shebang lines, encoding declarations, the spec's content-extraction rule (`# foo` → `foo`, bare `#` → blank line; rejects `#foo`, `##foo`, `#\tfoo`), multiple `script` blocks (returns `undefined` + `traceWarn`), and unknown top-level block types (silently ignored, per spec). Test coverage in src/test/common/inlineScriptMetadata.unit.test.ts (~30 cases).

src/features/inlineScriptLazyDetector.ts: the silent observer

Subscribes to `onDidOpenTextDocument`, `onDidSaveTextDocument`, and `onDidChangeTextDocument`. For each `.py` file inside an open workspace folder, on open or save, it reads the first 8 KiB and runs the parser. When a valid block is detected, it emits anonymized telemetry, and that's it. No projects registered, no UI shown.

Behaviour worth flagging:

- `PEP723.DETECTED` fires at most once per URI per session; repeat opens/saves of the same file are silent.
- `PEP723.EDITED` only fires for URIs already counted as detected, and only on the first `contentChanges`-bearing event.
- Concurrent reads of the same URI are coalesced through an `inFlight` map.
- On activation, the observer replays every already-open `.py` document through the same handler. The extension's `onLanguage:python` activation event fires after VS Code has opened any restored editors, so the open events for those files are gone by the time we subscribe. The replay is deferred via `setImmediate` to avoid racing VS Code's own document registration; per-URI dedup keeps it idempotent if a live event happens to arrive too.
- A `disposed` flag guards async continuations so a read that completes after `dispose()` does not emit telemetry on a torn-down host.
- The `python-envs.useInlineScriptMetadata` setting reserved in plan.md becomes relevant in Stage 2+ when we start writing to disk.

src/common/telemetry/constants.ts: the two events

Together, the events answer two questions: how many users have PEP 723 files at all (`DETECTED` count), and how many of those users actually edit them rather than just opening them once (`EDITED`/`DETECTED` ratio). Full GDPR annotations land alongside the enum members in the same file.

What is not in the events: no URIs, no file paths, no file content, no workspace identifiers, no project names. The events are pure counters plus a small set of shape metadata.
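To make the once-per-URI gating concrete, here is a minimal sketch of how the two events could be deduplicated within a session. It is not the code in this PR: `SessionGate`, `TelemetryEvent`, and the injected `send`/`now` callbacks are hypothetical names chosen for illustration. Only the event names and the `trigger`, `hasRequiresPython`, `dependencyCount`, and `duration` fields come from the description above.

```typescript
// Illustrative sketch only: SessionGate, TelemetryEvent, and the injected
// send/now callbacks are hypothetical names, not the extension's real API.
type Trigger = 'open' | 'save';

interface TelemetryEvent {
    name: 'PEP723.DETECTED' | 'PEP723.EDITED';
    properties?: { trigger: Trigger; hasRequiresPython: boolean };
    measures: { dependencyCount?: number; duration?: number };
}

class SessionGate {
    // URI string -> time of first detection in ms.
    private readonly detectedAt = new Map<string, number>();
    // URIs that have already produced an EDITED event.
    private readonly edited = new Set<string>();
    private disposed = false;
    private readonly send: (event: TelemetryEvent) => void;
    private readonly now: () => number;

    constructor(send: (event: TelemetryEvent) => void, now: () => number = Date.now) {
        this.send = send;
        this.now = now;
    }

    // Call when the parser finds a valid block on open or save.
    onDetected(uri: string, trigger: Trigger, hasRequiresPython: boolean, dependencyCount: number): void {
        if (this.disposed || this.detectedAt.has(uri)) {
            return; // repeat opens/saves of the same URI stay silent
        }
        this.detectedAt.set(uri, this.now());
        this.send({
            name: 'PEP723.DETECTED',
            properties: { trigger, hasRequiresPython },
            measures: { dependencyCount },
        });
    }

    // Call on a document change; contentChangeCount mirrors contentChanges.length.
    onChanged(uri: string, contentChangeCount: number): void {
        const since = this.detectedAt.get(uri);
        if (this.disposed || since === undefined || contentChangeCount === 0 || this.edited.has(uri)) {
            return; // not detected yet, empty change, or already counted
        }
        this.edited.add(uri);
        this.send({ name: 'PEP723.EDITED', measures: { duration: this.now() - since } });
    }

    dispose(): void {
        this.disposed = true;
    }
}
```

Keying both collections on the URI string means neither event payload needs to carry the URI, which is consistent with the "pure counters" constraint above. Injecting `now` is an assumption made here so that `duration` can be tested deterministically.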
Supporting changes
- src/common/workspace.apis.ts: wrapper exports for `onDidOpenTextDocument`, `onDidSaveTextDocument`, `onDidChangeTextDocument`, and `getOpenTextDocuments`, matching the existing wrapper pattern for testability.
- src/extension.ts: constructs and activates the observer alongside the existing project creators; pushes it onto `context.subscriptions` for disposal.
- src/test/features/inlineScriptLazyDetector.unit.test.ts: covers open/save dispatch, activation replay, per-URI dedup, edit-event gating, in-flight coalescing, and disposal safety.
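
For readers unfamiliar with the spec's content-extraction rule mentioned in the parser section, the following is a rough sketch of the line-level behaviour described there (`# foo` becomes `foo`, a bare `#` becomes a blank line, anything else is rejected), together with the BOM and line-ending normalization. The function names `normalizeNewlines` and `extractBlockContent` are hypothetical and are not the parser's actual exports. Locating the block's start and end markers is deliberately left out.

```typescript
// Illustrative sketch only: these helper names are hypothetical and do not
// correspond to exports of src/common/inlineScriptMetadata.ts.

// Strip a leading UTF-8 BOM and convert CRLF / lone CR line endings to LF.
function normalizeNewlines(text: string): string {
    return text.replace(/^\uFEFF/, '').replace(/\r\n?/g, '\n');
}

// Turn the comment lines found between the block markers into TOML text.
// Returns undefined as soon as one line breaks the extraction rule.
function extractBlockContent(commentLines: readonly string[]): string | undefined {
    const out: string[] = [];
    for (const line of commentLines) {
        if (line === '#') {
            out.push(''); // bare "#" contributes a blank line
        } else if (line.startsWith('# ')) {
            out.push(line.slice(2)); // drop the "# " prefix, keep the rest verbatim
        } else {
            return undefined; // e.g. "#foo", "##foo", "#\tfoo"
        }
    }
    return out.join('\n');
}
```

The extracted string would then be handed to a TOML parser. Rejecting the whole block on the first bad line, rather than skipping that line, is an assumption made for this sketch.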