Add adam train #159
base: master
Conversation
bukka
left a comment
Nice addition in general. Thanks.
bukka
left a comment
I added some comments to use correct format for fann_type. But thinking about it, maybe it would be just better to always keep it as float (which is what is done for other algs) - it won't really make much difference and will be consistent with other algs.
@wu1045718093 I actually meant that, after reconsidering it, it might be better to define adam_beta1/beta2/epsilon as float; then you don't need to use the special format that I suggested before. The problem mainly affects the fixed-point fann if you use fann_type (the float defaults that you use then don't work). Also, parameters for other algorithms such as RPROP are defined as float, so it would be more consistent.
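(A minimal sketch of the suggestion above, assuming the usual FANN struct layout: the RPROP fields shown exist in the current library, while the Adam fields and their defaults come from this PR's discussion. Illustrative only, not the exact code from the PR.)

```c
struct fann
{
    /* ... existing members ... */

    /* RPROP parameters are already stored as plain float. */
    float rprop_increase_factor;
    float rprop_decrease_factor;

    /* Keeping the Adam hyperparameters as float (rather than fann_type)
     * avoids format problems in the fixed-point build and stays
     * consistent with the other training algorithms. */
    float adam_beta1;   /* default 0.9f   */
    float adam_beta2;   /* default 0.999f */
    float adam_epsilon; /* default 1e-8f  */
};
```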
Previously, when copying a network mid-training with the Adam optimizer, the moment vectors (adam_m and adam_v) were set to NULL while adam_timestep was preserved. This caused an inconsistency where the copied network had a non-zero timestep but zeroed moments, potentially leading to suboptimal training behavior when continuing training on the copied network.

This fix ensures Adam optimizer state is fully preserved during network copy operations, consistent with how other optimizers (Quickprop, RProp) handle their training state. The moment vectors are now deep-copied along with the timestep, allowing seamless continuation of training.

Changes:
- Add deep copy of adam_m (first moment vector) if it exists
- Add deep copy of adam_v (second moment vector) if it exists
- Maintain backward compatibility for networks without training state
- Follow the same pattern as other optimizer state variables
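(A hedged sketch of the deep-copy pattern described above; identifiers such as adam_m, adam_v and total_connections follow the commit message and common FANN naming, and may not match the PR exactly.)

```c
#include <stdlib.h>
#include <string.h>
#include "floatfann.h"

/* Hypothetical helper: duplicate one Adam moment vector during fann_copy().
 * Returns 0 on success, -1 on allocation failure. */
static int copy_moment_vector(fann_type **dest, const fann_type *src,
                              unsigned int total_connections)
{
    if (src == NULL)
    {
        /* Network was copied before Adam training started: nothing to copy,
         * so the copy keeps a NULL vector, same as the original. */
        *dest = NULL;
        return 0;
    }
    *dest = (fann_type *) malloc(total_connections * sizeof(fann_type));
    if (*dest == NULL)
        return -1;
    memcpy(*dest, src, total_connections * sizeof(fann_type));
    return 0;
}
```

Inside fann_copy() this would be called for both adam_m and adam_v after the scalar members (including adam_timestep) have been copied, destroying the partial copy on allocation failure, mirroring how the Quickprop/RProp state arrays are handled.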
…work files

Add fann_scanf_optional macro to support optional parameter loading with default values. This maintains backward compatibility when loading network files saved by older versions that don't have Adam optimizer parameters (adam_beta1, adam_beta2, adam_epsilon).

Changes:
- Add fann_scanf_optional macro that falls back to a default value if the parameter is not found in the file
- Make Adam parameters optional during loading (adam_beta1=0.9, adam_beta2=0.999, adam_epsilon=1e-8)
- Preserve file position on failed parameter read using ftell/fseek

This ensures that networks saved before Adam support was added can still be loaded successfully by newer versions of the library.
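(A hedged sketch of what fann_scanf_optional could look like, modeled on the existing fann_scanf macro in fann_io.c, which reads from the FILE *conf in scope; the exact definition in the PR may differ.)

```c
#include <stdio.h>

#define fann_scanf_optional(type, name, val, default_value)               \
    {                                                                      \
        long fann_scanf_pos = ftell(conf);                                 \
        if(fscanf(conf, name "=" type "\n", val) != 1)                     \
        {                                                                  \
            /* Parameter not present (file from an older version):         \
               restore the read position and fall back to the default. */  \
            fseek(conf, fann_scanf_pos, SEEK_SET);                         \
            *(val) = (default_value);                                      \
        }                                                                  \
    }
```

Usage when loading a network would then look roughly like:

```c
fann_scanf_optional("%f", "adam_beta1", &ann->adam_beta1, 0.9f);
fann_scanf_optional("%f", "adam_beta2", &ann->adam_beta2, 0.999f);
fann_scanf_optional("%f", "adam_epsilon", &ann->adam_epsilon, 1e-8f);
```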
- Add adam_beta1, adam_beta2, and adam_epsilon to fann_print_parameters()
- Update adam_epsilon output format to %.8f for higher precision
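(An illustrative fragment of the lines this adds inside fann_print_parameters(), following the style of the existing parameter output; the exact labels and field widths may differ from the PR.)

```c
printf("Adam beta1                           :%8.3f\n", ann->adam_beta1);
printf("Adam beta2                           :%8.3f\n", ann->adam_beta2);
printf("Adam epsilon                         :%.8f\n", ann->adam_epsilon);
```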
Add robot_adam executable and robot_adam.net to examples/.gitignore
@bukka I have removed the changes regarding the type modification and verified that fann_test runs correctly.
It looks good. I will properly check it out once more on Monday and, if I don't find anything major, I will merge it.
No description provided.