This code is built on the Med-R1 repo. It supports the GKD, GRPO, and KEPO algorithms on state-of-the-art models such as Qwen3-VL. The dataset mainly focuses on multimodal medical data.
Corleno/KEPO
About
KEPO: Knowledge-Enhanced Preference Optimization for Reinforcement Learning with Reasoning