Code for "Evolutionary System Prompt Learning for Reinforcement Learning in LLMs" (arxiv.org/abs/2602.14697)
Updated Feb 26, 2026 - Python
Sudoku4LLM is a Sudoku dataset generator for training and evaluating reasoning in Large Language Models (LLMs). It offers customizable puzzles, difficulty levels, and 11 serialization formats to support structured data reasoning and Chain of Thought (CoT) experiments.
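To illustrate what serializing one puzzle into several text formats can look like, here is a minimal sketch. It is not the Sudoku4LLM API: the function names, the 4x4 grid, and the three formats shown are assumptions chosen for illustration, and the project's actual 11 formats may differ.

```python
import json

# Hypothetical 4x4 puzzle for illustration only (not taken from Sudoku4LLM).
# 0 marks an empty cell.
PUZZLE = [
    [1, 0, 0, 4],
    [0, 4, 1, 0],
    [4, 0, 0, 1],
    [0, 1, 4, 0],
]


def to_flat_string(grid):
    """Serialize row by row into a single string, using '.' for empty cells."""
    return "".join(str(v) if v else "." for row in grid for v in row)


def to_json(grid):
    """Serialize as a JSON array of arrays, keeping 0 for empty cells."""
    return json.dumps(grid)


def to_rows(grid):
    """Serialize as one line per row, cells separated by spaces, '_' for empty."""
    return "\n".join(
        " ".join(str(v) if v else "_" for v in row) for row in grid
    )


if __name__ == "__main__":
    # Print the same puzzle in each format so they can be compared in a prompt.
    for serialize in (to_flat_string, to_json, to_rows):
        print(f"--- {serialize.__name__} ---")
        print(serialize(PUZZLE))
```

Presenting the same puzzle in several encodings makes it possible to check whether a model's reasoning depends on the surface format rather than on the puzzle itself.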