LM Workload TF32 #900
Conversation
Dev -> main

Commits:
- Revert "…w pytorch" (this reverts commit 6f7d638)
- Fix slow ImageNet workloads on PyTorch
- add mixed precision training for lm workload
- Revert "add mixed precision training for lm workload"
MLCommons CLA bot: All contributors have signed the MLCommons CLA ✍️ ✅
This branch has the mixed-precision code changes for all of the workloads. We have an existing PR open from the lm_workload_base branch that only has the LM workload changes. Note that I did not do a final test with mixed precision for just the LM workload after increasing the number of evals to get better timing estimates. At this point we may just opt to run the LM workload with TF32 for the new release, since we don't have any more bandwidth to test changes.
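For reference, here is a minimal sketch of what opting into TF32 on A100s typically looks like in PyTorch; these are standard PyTorch flags, not code taken from this branch:

```python
import torch

# On Ampere GPUs (e.g. A100), TF32 runs float32 matmuls and convolutions
# on tensor cores with a reduced-precision mantissa; these flags are no-ops
# on hardware without TF32 support.
torch.backends.cuda.matmul.allow_tf32 = True  # matmuls (attention, MLPs, ...)
torch.backends.cudnn.allow_tf32 = True        # cuDNN convolutions

# Equivalent higher-level switch for matmuls (PyTorch >= 1.12):
torch.set_float32_matmul_precision("high")
```

The appeal of TF32 over full mixed precision is that the model and optimizer code stay untouched; the trade-off is a smaller speedup than float16/bfloat16 autocast.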
This is the PR for the LM workload with mixed-precision training on 4xA100.
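For context, a minimal sketch of the usual PyTorch mixed-precision pattern (float16 autocast plus gradient scaling); the model, optimizer, and tensor shapes below are hypothetical placeholders, not the workload's actual code:

```python
import torch

# Hypothetical placeholders; the real workload defines its own model,
# optimizer, and data pipeline.
model = torch.nn.Linear(1024, 1024).cuda()
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
scaler = torch.cuda.amp.GradScaler()

def train_step(batch, targets):
    optimizer.zero_grad(set_to_none=True)
    # Under autocast, matmul-heavy ops run in float16 while numerically
    # sensitive ops stay in float32.
    with torch.autocast(device_type="cuda", dtype=torch.float16):
        loss = torch.nn.functional.mse_loss(model(batch), targets)
    # Scale the loss to avoid float16 gradient underflow; the scaler
    # unscales gradients before the optimizer step.
    scaler.scale(loss).backward()
    scaler.step(optimizer)
    scaler.update()
    return loss.detach()

loss = train_step(torch.randn(32, 1024, device="cuda"),
                  torch.randn(32, 1024, device="cuda"))
```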