fix(recipes): default cross entropy fusion to native #4138
Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>
Light Code Review: Clean, well-scoped mechanical change. All cross_entropy_fusion_impl defaults switched from te to native. Verified no remaining te defaults. Guard test is solid (AST parsing). Completeness looks good. No bugs found. Suggested test cases: The change to scripts/performance/utils/overrides.py affects all perf workloads. No individual perf config files were modified. All perf baselines will need re-evaluation.
Light Code Review
Clean, well-scoped mechanical change. All cross_entropy_fusion_impl defaults switched from te to native across recipes, bridges, providers, perf overrides, and tests. Verified no remaining te defaults in src/ or scripts/.
Observations
No bugs, typos, or missing coverage found.
Suggested test cases
The change to scripts/performance/utils/overrides.py (_set_common_perf_overrides) affects all perf workloads that have the cross_entropy_fusion_impl attribute, since this is the shared override path. No individual perf config files under scripts/performance/configs/ were modified and none directly set cross_entropy_fusion_impl, so there are no config-specific test cases to enumerate. All perf baselines will need to be re-evaluated after this lands, since native CE fusion may have different throughput characteristics than TE CE fusion.
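The review notes that the shared override path only touches workloads that have the cross_entropy_fusion_impl attribute. A minimal sketch of that pattern follows. It is not the body of _set_common_perf_overrides, which is not shown here: the helper name set_ce_fusion_override and the SimpleNamespace stand-in configs are hypothetical.

```python
from types import SimpleNamespace


def set_ce_fusion_override(model_cfg, impl="native"):
    """Hypothetical helper: set the CE fusion impl only where the attribute exists.

    Returns True if the config was updated, False if it was left untouched.
    """
    if hasattr(model_cfg, "cross_entropy_fusion_impl"):
        model_cfg.cross_entropy_fusion_impl = impl
        return True
    return False


# Stand-in configs (assumption): real perf workloads use their own config classes.
with_attr = SimpleNamespace(
    cross_entropy_loss_fusion=True,
    cross_entropy_fusion_impl="te",
)
without_attr = SimpleNamespace(hidden_size=1024)

# Apply the shared override to every workload config.
changed = [set_ce_fusion_override(cfg) for cfg in (with_attr, without_attr)]
```

Guarding on hasattr keeps a shared override from adding the field to configs that never defined it, which is presumably why only workloads with the attribute are affected.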
Summary
Switch defaults from cross_entropy_fusion_impl="te" to "native" while keeping cross_entropy_loss_fusion=True.
Add a guard test against reintroducing cross_entropy_fusion_impl="te" as a default.
Context
NVIDIA/Megatron-LM#5115 disables the Transformer Engine implementation of cross-entropy loss fusion because of training-stability issues. Native cross-entropy fusion remains available, so Bridge defaults should avoid producing cross_entropy_loss_fusion=True with cross_entropy_fusion_impl="te".
Validation
uv run --active --no-sync pre-commit run --all-files
uv run --active --no-sync ruff check <changed files>
uv run --active --no-sync ruff format --check <changed files>
git diff --check
uv run --active --no-sync python -m pytest --confcutdir=tests/unit_tests/recipes tests/unit_tests/recipes/test_cross_entropy_defaults.py -v
Targeted recipe pytest suites were attempted locally, but full local collection is blocked by environment issues: project sync cannot install the pinned nvidia-resiliency-ext==0.6.0 wheel for this platform, and the active environment does not provide Transformer Engine for recipe module imports.
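The review describes the guard test as AST-based, which would also explain why it can run without Transformer Engine installed: the source is parsed, not imported. The following is a minimal sketch of that kind of check, not the contents of test_cross_entropy_defaults.py. The function name find_te_defaults and the SAMPLE source are hypothetical.

```python
import ast

NAME = "cross_entropy_fusion_impl"
FORBIDDEN_VALUE = "te"


def _is_target(node):
    """True if node is `NAME` or `<anything>.NAME` on the left side of an assignment."""
    return (isinstance(node, ast.Attribute) and node.attr == NAME) or (
        isinstance(node, ast.Name) and node.id == NAME
    )


def find_te_defaults(source):
    """Return sorted line numbers where cross_entropy_fusion_impl is set to "te"."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.keyword) and node.arg == NAME:
            # Keyword argument: Provider(cross_entropy_fusion_impl="te")
            value = node.value
        elif isinstance(node, ast.Assign) and any(_is_target(t) for t in node.targets):
            # Plain assignment: cfg.model.cross_entropy_fusion_impl = "te"
            value = node.value
        elif isinstance(node, ast.AnnAssign) and _is_target(node.target):
            # Annotated default: cross_entropy_fusion_impl: str = "te"
            value = node.value
        else:
            continue
        if isinstance(value, ast.Constant) and value.value == FORBIDDEN_VALUE:
            hits.append(value.lineno)
    return sorted(hits)


# Hypothetical source text; it is only parsed, never executed or imported.
SAMPLE = '''
cfg.model.cross_entropy_loss_fusion = True
cfg.model.cross_entropy_fusion_impl = "te"
provider = Provider(cross_entropy_fusion_impl="native")
other = Provider(
    cross_entropy_fusion_impl="te",
)
'''

offending_lines = find_te_defaults(SAMPLE)
```

In a real guard, the same function would presumably be applied to each recipe file's text, with the test asserting that the result is empty.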