Conversation
Force-pushed from 8930ae0 to ad5f40e
    gradient dqda element-wise between ``[-dqda_clipping, dqda_clipping]``.
    Does not perform clipping if ``dqda_clipping == 0``.
    action_l2 (float): weight of squared action l2-norm on actor loss.
    use_batch_ensemble (bool): whether to use BatchEnsemble FC and Conv2D
Ideally, these batch-ensemble-related parameters should be transparent to ddpg_algorithm. In the ideal case, ddpg_algorithm would not reference any batch-ensemble parameters at all.
That's a good point. Currently ddpg needs use_batch_ensemble to do some post-processing when forwarding the critic networks during training. Let me think about whether there is an alternative way to work around this.
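For context, the kind of post-processing discussed above could look like the sketch below: with BatchEnsemble layers, a batch of B inputs is effectively replicated across E ensemble members, so the critic returns E * B Q-values that must be regrouped and reduced before computing the TD target. This is a hypothetical, dependency-free illustration; the function name, argument layout, and reduction modes are assumptions, not ALF's actual API.

```python
def reduce_ensemble_q(q_values, num_ensembles, mode="min"):
    """Reduce per-member Q-values to one value per batch element.

    Hypothetical sketch. Assumes ``q_values`` is a flat list of length
    ``num_ensembles * batch_size``, grouped so that member ``e``'s
    predictions occupy ``q_values[e * B : (e + 1) * B]``.
    """
    batch_size = len(q_values) // num_ensembles
    # Split the flat list into one row per ensemble member.
    members = [q_values[e * batch_size:(e + 1) * batch_size]
               for e in range(num_ensembles)]
    if mode == "min":
        # Pessimistic target (in the spirit of clipped double-Q).
        return [min(col) for col in zip(*members)]
    elif mode == "mean":
        # Average over ensemble members.
        return [sum(col) / num_ensembles for col in zip(*members)]
    raise ValueError("unknown mode: %s" % mode)
```

With this kind of reduction isolated in a helper, the batch-ensemble detail could in principle live inside the critic network wrapper rather than in ddpg_algorithm itself, which is the direction the comment above suggests.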
    pred_step.output)
    return pred_step

    if self.need_full_rollout_state():
We want the algorithm to use the same ensemble_id during an entire episode. This means it should store the ensemble_id in its state and use that same ensemble_id when calling actor_network.
Oh yes, good point. I think that is why I had to tweak ddpg_algorithm_test to pass the toy unit test. Updated.
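The fix discussed in this thread can be sketched as follows: sample an ensemble_id once at episode start, carry it in the algorithm state, and reuse it on every subsequent step so the same ensemble member acts for the whole episode. This is a minimal, hypothetical illustration; the function signature, the dict-based state, and the actor callback are assumptions and not ALF's actual interfaces.

```python
import random

def predict_step(observation, state, num_ensembles, actor_fn):
    """Act with a per-episode ensemble member.

    Hypothetical sketch. ``state`` is None at episode start; afterwards it
    carries the sampled ensemble_id so the same member is used until the
    episode ends. ``actor_fn(observation, ensemble_id)`` stands in for
    calling actor_network with the chosen ensemble member.
    """
    if state is None:
        # Episode start: pick one ensemble member and remember it in state.
        state = {"ensemble_id": random.randrange(num_ensembles)}
    action = actor_fn(observation, state["ensemble_id"])
    return action, state
```

The key point matching the review comment: because ensemble_id lives in the returned state, rollout code that threads state from step to step automatically keeps the same ensemble member for the entire episode, and need_full_rollout_state() must account for this extra state field.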
Update actor_network, critic_network, and ddpg_algorithm to work with batch_ensemble layers.