declearn.optimizer.modules.AdaGradModule
Bases: OptiModule
Adaptive Gradient Algorithm (AdaGrad) module.
This module implements the following algorithm:
```
Init(eps):
    state = 0
Step(grads):
    state += (grads ** 2)
    grads /= (sqrt(state) + eps)
```
In other words, gradients (i.e. indirectly the learning rate) are scaled down by the square-root of the sum of the past squared gradients. See reference [1].
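For illustration, here is a minimal NumPy sketch of that update rule. It is not declearn's actual implementation (which operates on the package's gradient containers rather than raw arrays); the function name and values are purely illustrative.

```python
import numpy as np

def adagrad_scale(grads, state, eps=1e-7):
    """Scale raw gradients following the AdaGrad rule sketched above."""
    state = state + grads ** 2               # accumulate squared gradients
    scaled = grads / (np.sqrt(state) + eps)  # divide by root of the accumulated sum
    return scaled, state

# The effective step size shrinks as squared gradients accumulate.
state = np.zeros(3)
for grads in (np.array([0.5, -1.0, 2.0]), np.array([0.4, -0.8, 1.5])):
    scaled, state = adagrad_scale(grads, state)
    print(scaled)
```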
References
[1] Duchi et al., 2011. Adaptive Subgradient Methods for Online Learning and Stochastic Optimization. JMLR, vol. 12. https://jmlr.org/papers/v12/duchi11a.html
Source code in declearn/optimizer/modules/_adaptive.py
__init__(eps=1e-07)
Instantiate the Adagrad gradients-adaptation module.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `eps` | `float` | Numerical-stability improvement term, added to the (divisor) adaptive scaling term. | `1e-07` |
Source code in declearn/optimizer/modules/_adaptive.py
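As a usage sketch: assuming declearn's `Optimizer` accepts plug-in modules through its `modules` argument (as its documentation describes), the AdaGrad module could be wired in as follows; the learning rate and `eps` values here are arbitrary illustrative choices.

```python
from declearn.optimizer import Optimizer
from declearn.optimizer.modules import AdaGradModule

# Plug the AdaGrad gradients-adaptation module into a declearn Optimizer.
optim = Optimizer(lrate=0.01, modules=[AdaGradModule(eps=1e-7)])
```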