
declearn.optimizer.modules.AdaGradModule

Bases: OptiModule

Adaptive Gradient Algorithm (AdaGrad) module.

This module implements the following algorithm:

Init(eps):
    state = 0
Step(grads):
    state += (grads ** 2)
    grads /= (sqrt(state) + eps)

In other words, gradients (i.e. indirectly the learning rate) are scaled down by the square-root of the sum of the past squared gradients. See reference [1].
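To make the scaling concrete, here is a minimal NumPy sketch of the same update rule, written outside of declearn's Vector abstraction (the array values are illustrative only):

import numpy as np

def adagrad_step(grads, state, eps=1e-7):
    """One AdaGrad scaling step: return (scaled gradients, updated state)."""
    state = state + grads ** 2                  # accumulate squared gradients
    return grads / (np.sqrt(state) + eps), state

grads = np.array([0.5, -2.0, 0.1])
scaled, state = adagrad_step(grads, 0.0)    # first step: roughly sign(grads)
scaled, state = adagrad_step(grads, state)  # repeated steps shrink the effective step size

Note that the scaling is element-wise: coordinates with a history of large gradients receive smaller effective learning rates than rarely-updated ones.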

References

[1] Duchi et al., 2011. Adaptive Subgradient Methods for Online Learning and Stochastic Optimization. https://jmlr.org/papers/v12/duchi11a.html

Source code in declearn/optimizer/modules/_adaptive.py
class AdaGradModule(OptiModule):
    """Adaptative Gradient Algorithm (AdaGrad) module.

    This module implements the following algorithm:

        Init(eps):
            state = 0
        Step(grads):
            state += (grads ** 2)
            grads /= (sqrt(state) + eps)

    In other words, gradients (i.e. indirectly the learning rate)
    are scaled down by the square-root of the sum of the past
    squared gradients. See reference [1].

    References
    ----------
    [1] Duchi et al., 2011.
        Adaptive Subgradient Methods for Online Learning
        and Stochastic Optimization.
        https://jmlr.org/papers/v12/duchi11a.html
    """

    name: ClassVar[str] = "adagrad"

    def __init__(
        self,
        eps: float = 1e-7,
    ) -> None:
        """Instantiate the Adagrad gradients-adaptation module.

        Parameters
        ----------
        eps: float, default=1e-7
            Numerical-stability improvement term, added
            to the (divisor) adaptive scaling term.
        """
        self.eps = eps
        self.state = 0.0  # type: Union[Vector, float]

    def get_config(
        self,
    ) -> Dict[str, Any]:
        return {"eps": self.eps}

    def run(
        self,
        gradients: Vector,
    ) -> Vector:
        self.state = self.state + (gradients**2)
        scaling = (self.state**0.5) + self.eps
        return gradients / scaling

    def get_state(
        self,
    ) -> Dict[str, Any]:
        return {"state": self.state}

    def set_state(
        self,
        state: Dict[str, Any],
    ) -> None:
        if "state" not in state:
            raise KeyError("Missing required state variable 'state'.")
        self.state = state["state"]

__init__(eps=1e-07)

Instantiate the Adagrad gradients-adaptation module.

Parameters:

    eps (float, default=1e-07):
        Numerical-stability improvement term, added to the (divisor) adaptive scaling term.
Source code in declearn/optimizer/modules/_adaptive.py
def __init__(
    self,
    eps: float = 1e-7,
) -> None:
    """Instantiate the Adagrad gradients-adaptation module.

    Parameters
    ----------
    eps: float, default=1e-7
        Numerical-stability improvement term, added
        to the (divisor) adaptive scaling term.
    """
    self.eps = eps
    self.state = 0.0  # type: Union[Vector, float]
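In practice this module is rarely called directly: it is meant to be plugged into a declearn Optimizer, which runs its modules over gradients before the learning-rate step. The sketch below assumes Optimizer accepts a lrate value and a modules list, with modules given either as instances or by their registered name ("adagrad", as declared above); treat that signature as an assumption and refer to the Optimizer documentation.

from declearn.optimizer import Optimizer
from declearn.optimizer.modules import AdaGradModule

# Assumed signature: a base learning rate plus a pipeline of gradient-adaptation modules.
optim = Optimizer(lrate=0.01, modules=[AdaGradModule(eps=1e-7)])

# Modules may also be specified by their registered name (name = "adagrad").
optim_by_name = Optimizer(lrate=0.01, modules=["adagrad"])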