Add Cosine LR Scheduler for Fine-Tuning #273
Conversation
src/together/cli/api/finetune.py
Outdated
| "--num-cycles", | ||
| type=float, | ||
| default=0.5, | ||
| help="Number of cycles for cosine learning rate scheduler.", |
| help="Number of cycles for cosine learning rate scheduler.", | |
| help="Number of cycles for the cosine learning rate scheduler.", |
nit: Maybe also add what fractional values mean.
Updated to "Number or fraction of cycles".
src/together/resources/finetune.py
Outdated
    lr_scheduler_args=FinetuneLinearLRSchedulerArgs(min_lr_ratio=min_lr_ratio),
)
if lr_scheduler_type == "cosine":
    if num_cycles <= 0.0:
nit: maybe also add some meaningful upper bound?
Hard to know what a reasonable upper bound would be without knowing the number of steps, AFAIK. I think it makes sense to follow the HF implementation and let the cosine alias if the user inputs something unreasonably large for their job.
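For context, a minimal sketch of the HF-style cosine schedule being referenced, assuming the `num_cycles` and `min_lr_ratio` semantics discussed in this PR (function name and exact signature here are illustrative, not the SDK's):

```python
import math

def cosine_lr(step: int, total_steps: int, base_lr: float,
              num_cycles: float = 0.5, min_lr_ratio: float = 0.0) -> float:
    """Illustrative HF-style cosine schedule (no warmup).

    num_cycles=0.5 gives one half-cosine: a monotonic decay from
    base_lr toward base_lr * min_lr_ratio. Larger values oscillate,
    and unreasonably large values simply alias rather than error out.
    """
    progress = step / max(1, total_steps)
    cosine = 0.5 * (1.0 + math.cos(math.pi * 2.0 * num_cycles * progress))
    return base_lr * (min_lr_ratio + (1.0 - min_lr_ratio) * cosine)
```

With the default `num_cycles=0.5`, the LR starts at `base_lr` and reaches `base_lr * min_lr_ratio` at the final step, which is why fractional values are meaningful.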
src/together/resources/finetune.py
Outdated
    learning_rate: float | None = 0.00001,
    lr_scheduler_type: Literal["linear", "cosine"] = "linear",
    min_lr_ratio: float = 0.0,
    num_cycles: float = 0.5,
Let's call it scheduler_num_cycles for clarity?
The backend expects the field in the request to be num_cycles, nested inside the scheduler args. Would changing just the CLI arg be OK?
Changed the CLI arg to scheduler_num_cycles. Worth noting that FinetuneCosineLRSchedulerArgs still has the field as num_cycles.
Have you read the Contributing Guidelines?
Yes
Issue # ENG-23270
Adds support for a cosine LR scheduler in the fine-tuning API and simplifies future support for additional schedulers (https://github.com/togethercomputer/together-finetune/pull/1349).