
Random Selfplay#35

Closed
vwxyzjn wants to merge 4 commits into master from league-selfplay

Conversation


@vwxyzjn vwxyzjn commented Jan 12, 2022

This PR prototypes fictitious selfplay.

Known issue: the following variables still need to be changed so that data from agent2 is excluded from the training batch

        b_obs = obs.reshape((-1,) + envs.observation_space.shape)
        b_logprobs = logprobs.reshape(-1)
        b_actions = actions.reshape((-1,) + action_space_shape)
        b_advantages = advantages.reshape(-1)
        b_returns = returns.reshape(-1)
        b_values = values.reshape(-1)
        b_invalid_action_masks = invalid_action_masks.reshape((-1,) + invalid_action_shape)
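One way to address the known issue above is a boolean mask over the flattened rollout tensors. This is only a hedged sketch, not the PR's implementation: it assumes the learning agent controls the first half of the vectorized environments and the frozen opponent (agent2) controls the second half, so the learner's transitions can be selected after the same `reshape(-1)` flattening shown above. All names and shapes here are illustrative.

```python
import numpy as np

# Assumption (illustrative): learner occupies envs [0, num_envs // 2),
# agent2 occupies the rest.
num_steps = 4
num_envs = 8
learner_envs = num_envs // 2

# Toy per-step values shaped (num_steps, num_envs), standing in for the
# rollout tensors (obs, logprobs, values, ...) in the snippet above.
values = np.arange(num_steps * num_envs, dtype=np.float32).reshape(num_steps, num_envs)

# Mark the learner's environments and broadcast the mask over time steps.
env_is_learner = np.zeros(num_envs, dtype=bool)
env_is_learner[:learner_envs] = True
step_mask = np.broadcast_to(env_is_learner, (num_steps, num_envs))

# Flatten exactly like b_values = values.reshape(-1), then keep only the
# learner's entries; the same b_mask would apply to every batch tensor.
b_values = values.reshape(-1)
b_mask = step_mask.reshape(-1)
b_values_learner = b_values[b_mask]
```

Applying one shared `b_mask` to every flattened tensor keeps the batch tensors aligned with each other while dropping agent2's data.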


vwxyzjn commented Jan 12, 2022

Running an experiment here: https://wandb.ai/costa-huang/gym-microrts/runs/x9y055w6

# randomly load an opponent: fictitious self-play
list_of_agents = os.listdir(f"models/{experiment_name}")
list_of_agents.remove('agent.pt')
chosen_agent2pt = random.choice(list_of_agents)
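The snippet above draws an opponent uniformly from the checkpoint pool. A self-contained sketch of that idea, with illustrative file names and a temporary directory standing in for `models/{experiment_name}`:

```python
import os
import random
import tempfile

# Illustrative checkpoint pool: past versions of the agent saved as .pt
# files, plus the live checkpoint agent.pt which must not face itself.
with tempfile.TemporaryDirectory() as model_dir:
    for name in ["agent.pt", "agent-100000.pt", "agent-200000.pt"]:
        open(os.path.join(model_dir, name), "w").close()

    pool = os.listdir(model_dir)
    pool.remove("agent.pt")  # exclude the live checkpoint from the pool
    chosen_agent2pt = random.choice(pool)  # uniform random past opponent
```

Because the draw is uniform, every saved checkpoint is equally likely regardless of strength, which is exactly the point raised in the review comment below about pruning weak agents.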
Contributor

Should we prune agents whose chance of winning falls below a threshold? With the current implementation, it seems that the number of saved agents only grows over time, right? 🤔

Contributor

Okay, I have a much larger question here. I'll formulate it properly and ask on Discord, so we can discuss 😀

Collaborator Author

Haha yeah this first implementation is very crude. OpenAI Five does it by sampling opponents probabilistically according to their trueskill.

[image: excerpt on OpenAI Five's opponent-sampling scheme]

@vwxyzjn vwxyzjn changed the title from "Fictitious selfplay" to "Random Selfplay" on Jan 19, 2022

vwxyzjn commented Jan 19, 2022

Per discussion with @kachayev, it turns out the implementation in this repo is definitely not fictitious selfplay, which also has a supervised learning component. Instead, this PR implements what I call "Random Selfplay", where the agent plays against a randomly chosen past version of itself.


vwxyzjn commented Feb 5, 2022

Closed in favor of #57

@vwxyzjn vwxyzjn closed this Feb 5, 2022