In 'lets_start.py' you evaluate the model every 'evaluation_period' epochs. However, you also run the 'step' operation during evaluation (line 44 in lets_start.py), which updates the model weights instead of only using the learned weights to evaluate. Why is this done? Please clarify if I am missing something.
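To make the concern concrete, here is a minimal sketch of the difference between a training step and a pure evaluation. It uses a hypothetical one-parameter toy model in plain Python, not the code from 'lets_start.py'; the names 'train_step' and 'evaluate' are assumptions made for illustration only.

```python
# Toy illustration only: a hypothetical one-parameter linear model,
# not the repository's code.

def loss_and_grad(w, data):
    """Mean squared error of y = w * x and its gradient with respect to w."""
    n = len(data)
    loss = sum((w * x - y) ** 2 for x, y in data) / n
    grad = sum(2 * (w * x - y) * x for x, y in data) / n
    return loss, grad


def train_step(w, data, lr=0.1):
    """Training: compute the loss AND apply a gradient update."""
    loss, grad = loss_and_grad(w, data)
    return w - lr * grad, loss


def evaluate(w, data):
    """Evaluation: compute the loss only; the weight is left untouched."""
    loss, _ = loss_and_grad(w, data)
    return loss


train_data = [(1.0, 2.0), (2.0, 4.0)]
test_data = [(3.0, 6.0)]

w = 0.0
for epoch in range(5):
    w, _ = train_step(w, train_data)

w_before = w
test_loss = evaluate(w, test_data)
assert w == w_before  # evaluating did not change the weight

# Running the update step on the test data, by contrast, changes the weight.
w_leaky, _ = train_step(w, test_data)
assert w_leaky != w_before
```

If the evaluation in the repository is meant to behave like 'evaluate' here, I would expect only the loss (and metrics) to be fetched at that point, without the update operation.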
In the file 'model.py' at line 15, you create the discount factor array (self.df), but 'loss_func' (line 89) uses it only for loss type 'naive' and not for loss type 'oi'. This differs from the paper, where equation 9 indicates that the discount factor is used. Is there a reason for not using it in the code for the 'oi' loss type? Please clarify.