Labels: type/feature (Feature request)
Description
🚀 Feature
Add GRPO Support
Motivation
With the release of DeepSeek's R1 model, GRPO (Group Relative Policy Optimization) has been shown to be a powerful way to instill reasoning capabilities in models when either labeled data or a verifier is available. This request is to add support for training a model with GRPO, perhaps with a focus on building reasoning abilities.