Selfplay league runner with flexible configuration and algo presets #58
Conversation
Nice! This is really cool!

This makes sense! Lots of projects could benefit from this :)

I will look further into this and evaluate.

Looked further into this. Do we have a sense of how "fast" the training is? So if we run

To answer this question I need to have a GPU 😀 And based on the schedule of other experiments, I will be able to run it tomorrow or the day after tomorrow.

Oh, BTW, I found another detail I have to flesh out first: right now evaluation only works against other agents; PvE games are not supported. It's quite easy to cover, so it shouldn't take long.

What is PvE?

Sorry :) It stands for "player vs. environment" (like a built-in bot), as opposed to PvP, "player vs. player".

Oh, this makes sense. It would be useful to cover! That said, hopefully we can also train really strong agents without the help of human-engineered bots :)

This is very much WIP.
For the league runner to work, two entrypoints must be provided: `train` and `evaluate`. Both take as arguments the path to a saved agent and the path to a saved opponent.

League configuration (in a YAML file) gives the ability to control:
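As a purely illustrative sketch (the actual entrypoint names, signatures, and config fields in this PR may differ), the two entrypoints and a minimal league config could look roughly like this:

```python
# Hypothetical sketch only -- names and fields here are placeholders, not the PR's real interface.

def train(agent_path: str, opponent_path: str) -> None:
    """Load both checkpoints, train the agent against the opponent, save the agent back."""
    ...

def evaluate(agent_path: str, opponent_path: str) -> float:
    """Play evaluation games between the two checkpoints and return the agent's win rate."""
    ...

# A league config (YAML) could then point the runner at these entrypoints, e.g.:
#
#   train_entrypoint: my_project.selfplay:train        # hypothetical module paths
#   evaluate_entrypoint: my_project.selfplay:evaluate
#   num_learners: 4
#   eval_games: 100
```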
The runner keeps track of win rates in a payoff table and of MMR by running Bayesian updates with TrueSkill. The win-rate and MMR information can be used to decide which opponent to play next.
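A minimal sketch of that bookkeeping, assuming the `trueskill` package (the runner's actual data structures and opponent-selection policy may differ):

```python
# Illustrative only: payoff table + TrueSkill MMR tracking for a self-play league.
from collections import defaultdict
import trueskill

ratings = defaultdict(trueskill.Rating)   # MMR (TrueSkill rating) per agent id
payoff = defaultdict(lambda: [0, 0])      # [wins, games] per (agent, opponent) pair

def record_result(agent: str, opponent: str, agent_won: bool) -> None:
    wins, games = payoff[(agent, opponent)]
    payoff[(agent, opponent)] = [wins + int(agent_won), games + 1]
    # rate_1vs1 expects (winner, loser) and returns their updated ratings.
    if agent_won:
        ratings[agent], ratings[opponent] = trueskill.rate_1vs1(ratings[agent], ratings[opponent])
    else:
        ratings[opponent], ratings[agent] = trueskill.rate_1vs1(ratings[opponent], ratings[agent])

def winrate(agent: str, opponent: str) -> float:
    wins, games = payoff[(agent, opponent)]
    return wins / games if games else 0.5

def pick_opponent(agent: str, pool: list[str]) -> str:
    # One possible policy: play the opponent the agent currently beats least often.
    return min(pool, key=lambda opp: winrate(agent, opp))
```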
As an example, 2 presets are implemented:
The league supports resuming from a checkpoint.
There are still a lot of issues, including
It's actually pretty hard to iterate on the league runner with the MicroRTS env, as it takes a long time to get any training done. I'm mostly iterating on SlimeVolley and some other PettingZoo envs. I'm also thinking about having the league runner as a separate package that could be used as a library and/or CLI tool (from an implementation perspective, it's completely independent of the details of training or the env that is used). WDYT @vwxyzjn?
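If it does become a standalone package, library-style usage might look roughly like this (the package name, class, and methods below are made up for illustration):

```python
# Hypothetical API -- nothing here is an actual published package.
from league_runner import League  # hypothetical package

league = League.from_config("league.yaml")  # the YAML config described above
league.run()
# league.resume("checkpoints/league_state.pkl")  # or resume from a saved league state
```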