Migrate analyzer test from MCCL to ForkBasedTestDriver #2011

Open
Scusemua wants to merge 3 commits into meta-pytorch:main from Scusemua:export-D100079406
Scusemua (Contributor) commented Apr 9, 2026

Summary: Replace `CollectiveIntegrationTestMixin` with `ForkBasedTestDriver` in the analyzer test. Use `meta::comms::CudaStream`. Include inline `modifyGPUBuffer`/`validateGPUBuffer` helpers (duplicated from MCCL's `McclIntegrationTestUtil`).

Differential Revision: D100079406

Scusemua added 3 commits April 9, 2026 11:08
…ch#2006)

Summary:

Introduce a lightweight fork+exec test driver that re-execs the test binary as worker subprocesses with TCPStore-based coordination. This enables ncclx tests to inspect worker exit codes (e.g., watchdog crash tests) without depending on MCCL's heavier Thrift-based `CollectiveIntegrationTestMixin`.

Also adds unit tests (`ForkBasedTestDriverTest.cc`) covering:
- Basic multi-rank success with KV round-trip
- Exact exit code capture
- Signal-terminated worker reporting (128 + signal)
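
The exit-code cases listed above can be illustrated with a minimal POSIX sketch. This is not the `ForkBasedTestDriver` implementation: `exitCodeFromStatus` and `runWorker` are hypothetical names, the sketch only forks (the driver described here additionally re-execs the test binary), and TCPStore coordination is omitted. It assumes the driver reports a signal-terminated worker as `128 + signal`, as the summary states.

```cpp
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#include <cassert>
#include <cerrno>
#include <csignal>
#include <functional>

// Hypothetical helper (not the ForkBasedTestDriver API): collapse a waitpid()
// status into one shell-style exit code.
inline int exitCodeFromStatus(int status) {
  if (WIFEXITED(status)) {
    return WEXITSTATUS(status);  // worker called exit()/_exit(): exact code
  }
  if (WIFSIGNALED(status)) {
    return 128 + WTERMSIG(status);  // worker was killed by a signal
  }
  return -1;  // stopped/continued states are not expected with options == 0
}

// Hypothetical helper: run `body` in a forked worker and return the worker's
// exit code as seen by the parent. Returns -1 if fork() or waitpid() fails.
inline int runWorker(const std::function<int()>& body) {
  assert(body && "worker body must be callable");

  const pid_t pid = fork();
  if (pid < 0) {
    return -1;
  }
  if (pid == 0) {
    // Child: _exit() skips atexit handlers and stdio flushing inherited from
    // the parent, so nothing runs twice.
    _exit(body());
  }

  // Parent: wait for this specific worker, retrying if interrupted.
  int status = 0;
  pid_t waited;
  do {
    waited = waitpid(pid, &status, 0);
  } while (waited < 0 && errno == EINTR);

  return waited < 0 ? -1 : exitCodeFromStatus(status);
}
```

With these helpers, `runWorker([] { return 7; })` yields the worker's exact exit code, while a worker that calls `std::raise(SIGKILL)` is reported as `128 + SIGKILL`. Folding both cases into one integer lets a test assert on a single value regardless of how the worker ended.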

Differential Revision: D100079492
…ta-pytorch#2010)

Summary:

Replace `CollectiveIntegrationTestMixin` with `ForkBasedTestDriver` in watchdog tests. Use `meta::comms::CudaStream` from `comms/utils/CudaRAII.h` instead of MCCL's `mccl::cuda::CudaStream`.

Differential Revision: D100079493
Summary: Replace `CollectiveIntegrationTestMixin` with `ForkBasedTestDriver` in the analyzer test. Use `meta::comms::CudaStream`. Include inline `modifyGPUBuffer`/`validateGPUBuffer` helpers (duplicated from MCCL's `McclIntegrationTestUtil`).

Differential Revision: D100079406
meta-cla bot added the CLA Signed label Apr 9, 2026
meta-codesync bot (Contributor) commented Apr 9, 2026

@Scusemua has exported this pull request. If you are a Meta employee, you can view the originating Diff in D100079406.

Labels

CLA Signed, fb-exported, meta-exported