Adding a sample_action method for ActorCritic

Hello! I've been learning how to code RL from your repo. I replaced the duplicated lines of code in
def train
def update_policy

with a call to the agent's method self.sample_action(). Since then, the agent seems to solve the Cart-Pole problem about 2x slower (measured in number of episodes), and this happens every time. I have no idea what is going on with torch, and I haven't found anything on the Internet.
Could you please help me?

https://github.com/lemikhovalex/pytorch-rl
5_tr - Proximal Policy Optimization (PPO) [CartPole]-Copy1.ipynb
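
A minimal sketch of the kind of refactor described above, not the notebook's actual code. The `ActorCritic` layout, the layer sizes, the optional `action` argument, and the returned tuple are all assumptions made for illustration. One thing that may be worth checking with a shared `sample_action` method is whether `update_policy` re-evaluates the actions stored during the rollout or accidentally draws fresh ones; the sketch below lets the caller pass the stored actions back in.

```python
import torch
import torch.nn as nn
from torch.distributions import Categorical


class ActorCritic(nn.Module):
    """Hypothetical stand-in for the repo's ActorCritic (sizes are assumptions)."""

    def __init__(self, state_dim=4, hidden_dim=32, n_actions=2):
        super().__init__()
        self.actor = nn.Sequential(
            nn.Linear(state_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, n_actions),
        )
        self.critic = nn.Sequential(
            nn.Linear(state_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, 1),
        )

    def forward(self, state):
        # Returns action logits and the state-value estimate.
        return self.actor(state), self.critic(state)

    def sample_action(self, state, action=None):
        # Build the policy distribution once, so train() and update_policy()
        # share the same code path.
        logits, value = self(state)
        dist = Categorical(logits=logits)
        if action is None:
            # Rollout: draw a new action from the current policy.
            action = dist.sample()
        # Update: if `action` was given, only its log-probability is recomputed.
        return action, dist.log_prob(action), value


torch.manual_seed(0)
agent = ActorCritic()
states = torch.randn(5, 4)  # a fake batch of 5 CartPole-sized observations

# Rollout phase (as in train): sample actions without tracking gradients.
with torch.no_grad():
    actions, old_log_probs, _ = agent.sample_action(states)

# Update phase (as in update_policy): re-evaluate the stored actions.
same_actions, new_log_probs, values = agent.sample_action(states, action=actions)
```

With this shape, the update phase compares `new_log_probs` against `old_log_probs` for the same actions, and only the update-phase tensors carry gradients.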


Adding a sample_action method for ActorCritic #4
