Fix example max_steps calculations #1629

Open
fallintoplace wants to merge 1 commit into google:main from fallintoplace:fix/max-steps-formulas

What changed

This updates the example launch scripts that were computing max_steps as batch_size * num_batches * num_train_epochs * train_fraction.

Those scripts now compute max_steps from num_batches * num_train_epochs * train_fraction, which matches the CLI's step-based semantics. Since warmup_steps and decay_steps are derived from max_steps in these scripts, they now stay aligned as well.
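As a rough illustration of the change, the sketch below computes both formulas side by side. The values are made up, and the awk-based rounding is an assumption about how a script might handle a fractional train_fraction; it is not copied from the updated scripts.

```shell
#!/bin/sh
# Illustrative values only; each example script sets its own.
batch_size=4
num_batches=100
num_train_epochs=2
train_fraction=0.8

# Old formula: multiplying by batch_size yields a sample count, not a step count.
# awk is used because POSIX shell arithmetic cannot handle the fractional train_fraction.
old_max_steps=$(awk -v bs="$batch_size" -v nb="$num_batches" \
  -v ne="$num_train_epochs" -v tf="$train_fraction" \
  'BEGIN { printf "%.0f", bs * nb * ne * tf }')

# New formula: one optimizer step per batch, so batch_size drops out.
new_max_steps=$(awk -v nb="$num_batches" \
  -v ne="$num_train_epochs" -v tf="$train_fraction" \
  'BEGIN { printf "%.0f", nb * ne * tf }')

echo "old max_steps (sample count):    $old_max_steps"
echo "new max_steps (optimizer steps): $new_max_steps"
```

With these example values the old formula is larger than the new one by exactly batch_size.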

Why

tunix.cli.grpo_main treats max_steps as optimizer/training steps and caps it against num_batches * num_train_epochs * train_fraction.
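A minimal sketch of that cap, assuming it is a plain minimum over whole steps. The names requested_max_steps, step_cap, and effective_max_steps are hypothetical, and the actual logic in tunix.cli.grpo_main may round or validate differently.

```shell
#!/bin/sh
# Illustrative values only.
requested_max_steps=640   # e.g. a value inflated by batch_size
num_batches=100
num_train_epochs=2
train_fraction=0.8

# Upper bound on optimizer steps implied by the dataset settings.
step_cap=$(awk -v nb="$num_batches" -v ne="$num_train_epochs" \
  -v tf="$train_fraction" 'BEGIN { printf "%.0f", nb * ne * tf }')

# Take the smaller of the requested value and the cap.
if [ "$requested_max_steps" -gt "$step_cap" ]; then
  effective_max_steps=$step_cap
else
  effective_max_steps=$requested_max_steps
fi

echo "step cap:            $step_cap"
echo "effective max_steps: $effective_max_steps"
```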

A handful of example scripts were multiplying by batch_size, which effectively turned the value into a sample count. That could overrun training by a factor of batch_size and skew the warmup/decay schedules relative to the actual number of optimizer steps.
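To make the schedule skew concrete, the sketch below assumes warmup is a fixed percentage of max_steps. The warmup_percent variable and its value are hypothetical; the example scripts may derive warmup_steps and decay_steps differently.

```shell
#!/bin/sh
# Illustrative values only.
batch_size=4
actual_steps=160       # num_batches * num_train_epochs * train_fraction
warmup_percent=10      # hypothetical: warmup as a percentage of max_steps

# max_steps as computed by the old formula (inflated by batch_size).
inflated_max_steps=$(( batch_size * actual_steps ))

# warmup_steps derived from each max_steps value.
warmup_old=$(( inflated_max_steps * warmup_percent / 100 ))
warmup_new=$(( actual_steps * warmup_percent / 100 ))

# Share of the real training run spent in warmup under each formula.
share_old=$(( 100 * warmup_old / actual_steps ))
share_new=$(( 100 * warmup_new / actual_steps ))

echo "old: warmup_steps=$warmup_old (${share_old}% of actual steps)"
echo "new: warmup_steps=$warmup_new (${share_new}% of actual steps)"
```

Under these assumptions, the intended 10% warmup grows to 40% of the steps that actually run when max_steps is inflated by a batch_size of 4.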

Impact

The affected examples now match the repo's documented and implemented max_steps behavior.

Validation

  • rg -n '\$batch_size \* \$num_batches \* \$num_train_epochs \* \$train_fraction' . -g '*.sh'
  • bash -n on the 8 updated scripts
  • git diff --check
