In this post, I would like to share my implementations of several well-known first-order optimization methods. These methods are already implemented very well in many packages, but I hope my versions can help you understand the ideas behind them.
Suppose we have \(N\) data examples and a parameter vector \(\mathbf{w} \in \mathbb{R}^D\).
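To make this setting concrete, here is a minimal sketch of what a first-order method works with: a loss averaged over the \(N\) examples and its gradient with respect to \(\mathbf{w}\). The least-squares loss, the helper name `loss_and_grad`, the synthetic data, and the step size are all assumptions chosen for illustration, not the setup used in the rest of this post.

```python
import numpy as np

def loss_and_grad(w, X, y):
    """Mean squared error over N examples and its gradient w.r.t. w.

    Hypothetical helper for illustration only.
    """
    N = X.shape[0]
    residual = X @ w - y                  # shape (N,)
    loss = 0.5 * np.mean(residual ** 2)   # scalar objective
    grad = X.T @ residual / N             # shape (D,), first-order information only
    return loss, grad

rng = np.random.default_rng(0)
N, D = 100, 3                             # assumed sizes
X = rng.normal(size=(N, D))
w_true = np.array([1.0, -2.0, 0.5])
y = X @ w_true                            # noiseless synthetic targets

w = np.zeros(D)
lr = 0.1                                  # assumed step size
initial_loss, _ = loss_and_grad(w, X, y)
for _ in range(500):
    _, grad = loss_and_grad(w, X, y)
    w = w - lr * grad                     # plain gradient descent update
final_loss, _ = loss_and_grad(w, X, y)
```

Everything the update needs is the gradient at the current \(\mathbf{w}\), and that is the property the methods in this post share.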
For convenience, I first write a class named optimizer.
class optimizer:
    def __init__(self):
        pass

    def set_param(self, parameters):
        self.