SPO

Code for "Semiparametric Preference Optimization: Your Language Model is Secretly a Single-Index Model"
Nathan Kallus
https://arxiv.org/abs/2512.21917

Synthetic preference optimization experiment
See README.md in the synthetic directory.

Aligning Qwen3 on UltraFeedback
See README.md in the ultrafeedback directory.