
Description
Hello, thank you for making this repo.
I think that when calculating the returns you should take the done flags into consideration, so that returns are not propagated across episode boundaries:
def calculate_returns(self, rewards, dones, normalize=True):
    returns = []
    R = 0
    # Iterate backwards so each step accumulates the discounted future reward;
    # reset R whenever done is set so returns do not leak across episodes.
    for r, d in zip(reversed(rewards), reversed(dones)):
        if d:
            R = 0
        R = r + R * self.gamma
        returns.insert(0, R)
    returns = torch.tensor(returns).to(device)
    if normalize:
        returns = (returns - returns.mean()) / returns.std()
    return returns
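
As a quick sanity check (hypothetical numbers, assuming the method lives on an agent object and normalize is turned off so the raw values are readable):

agent.gamma = 0.99
returns = agent.calculate_returns([1, 1, 1, 1], [0, 1, 0, 0], normalize=False)
# -> [1.99, 1.0, 1.99, 1.0]; without the "if d: R = 0" reset this would be
# [3.9404, 2.9701, 1.99, 1.0], i.e. the second episode's reward would leak
# into the first episode's returns.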
Also, could you please briefly describe how Generalized Advantage Estimation (GAE) is used when calculating the advantages?
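
In case it is useful, my current understanding of GAE: instead of using the full return minus a baseline, it builds the advantage from an exponentially weighted sum of one-step TD errors, delta_t = r_t + gamma * V(s_{t+1}) - V(s_t), combined recursively as A_t = delta_t + gamma * lambda * A_{t+1}. The lambda parameter trades bias against variance: lambda = 0 gives the one-step TD advantage (low variance, more bias), while lambda = 1 recovers the Monte Carlo return minus V(s_t) (unbiased, high variance). Below is a minimal sketch of what I mean, with done flags cutting the recursion at episode boundaries (the function name, signature, and the extra bootstrap value V(s_T) at the end of values are my assumptions, not this repo's API):

def calculate_advantages(rewards, values, dones, gamma=0.99, lam=0.95):
    # values must hold one extra entry, V(s_T), to bootstrap the final step.
    advantages = []
    A = 0
    for t in reversed(range(len(rewards))):
        mask = 1.0 - float(dones[t])  # zero out bootstrapping at episode ends
        delta = rewards[t] + gamma * values[t + 1] * mask - values[t]  # TD error
        A = delta + gamma * lam * mask * A  # exponentially weighted sum of deltas
        advantages.insert(0, A)
    return advantages

Could you confirm whether this matches what the implementation here does?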