Description • Train a network • Results • References
SR2 is an optimizer that trains deep neural networks with nonsmooth, nonconvex regularizers to recover a sparse and efficient substructure.
The optimizer minimizes the sum of a finite-sum loss function and a nonsmooth regularizer
with an adaptive proximal quadratic-regularization scheme.
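As a sketch of the problem setting (notation assumed here for illustration: $f_i$ are the per-sample losses, $h$ the nonsmooth regularizer, and $\lambda$ the weight set by the --lam option below):

$$
\min_{x \in \mathbb{R}^n} \; \frac{1}{N} \sum_{i=1}^{N} f_i(x) + \lambda h(x)
$$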
Supported regularizers are

- $||x||_0$, the number of nonzero entries $x_i$
- $||x||_p = (\sum_i |x_i|^p)^{\frac{1}{p}}$ for $p = \frac{1}{2}, \frac{2}{3}, 1$
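For the $\ell_0$ and $\ell_1$ cases, the proximal operators at the core of such a method have well-known closed forms (hard and soft thresholding; closed forms also exist for $p = \frac{1}{2}, \frac{2}{3}$ but are more involved). The helpers below are an illustrative sketch, not the repo's code:

```python
import torch

def prox_l1(x, t):
    # Soft-thresholding: proximal operator of t * ||x||_1.
    return torch.sign(x) * torch.clamp(x.abs() - t, min=0.0)

def prox_l0(x, t):
    # Hard-thresholding: proximal operator of t * ||x||_0;
    # zeroes every entry with |x_i| <= sqrt(2 * t).
    return torch.where(x.abs() > (2.0 * t) ** 0.5, x, torch.zeros_like(x))
```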
Dependencies:

- NumPy
- PyTorch
- [PyHessian](https://github.com/amirgholami/PyHessian)
You can start training the network by running a command similar to

```
python main.py --reg=l0 --precond=andrei --beta=0.95 --lam=0.001
```
The following table summarizes the options:

| SR2 option | Description | Possible values |
|---|---|---|
| --lam | Regularization weight $\lambda$ | |
| --reg | Regularization function | l0, l1, l12, l23 |
| --beta | Momentum factor | |
| --precond | Preconditioner used to accelerate training | none, adam, andrei* |
| --max_epoch | Number of training epochs | |
| --wd | Weight decay | |
| --seed | Random seed | |
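To make the adaptive proximal quadratic-regularization idea concrete, here is a minimal sketch of one step of a method in this family with an $\ell_1$ regularizer. The function names, the fixed $\sigma$, and the simplified update are illustrative assumptions, not the repo's implementation:

```python
import torch

def soft_threshold(x, t):
    # Proximal operator of t * ||x||_1 (soft-thresholding).
    return torch.sign(x) * torch.clamp(x.abs() - t, min=0.0)

def prox_quadratic_reg_step(x, grad, sigma, lam):
    # Minimizing the local model  grad' s + (sigma/2) ||s||^2 + lam ||x + s||_1
    # over the step s yields the closed form
    #   x+ = prox_{(lam/sigma) ||.||_1}(x - grad / sigma).
    return soft_threshold(x - grad / sigma, lam / sigma)
```

Roughly speaking, SR2 adapts the quadratic-regularization parameter $\sigma$ between iterations according to how well this model predicts the actual decrease; see the paper for the exact rules and for the roles of the momentum and preconditioning options.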
If you use SR2 in your work, please cite:

```bibtex
@misc{lakhmiri2022sr2,
  doi       = {10.48550/arXiv.2206.06531},
  url       = {https://arxiv.org/abs/2206.06531},
  author    = {Lakhmiri, Dounia and Orban, Dominique and Lodi, Andrea},
  title     = {A Stochastic Proximal Method for Nonsmooth Regularized Finite Sum Optimization},
  keywords  = {Machine Learning (stat.ML), Machine Learning (cs.LG), Optimization and Control (math.OC)},
  publisher = {arXiv},
  year      = {2022},
  copyright = {Creative Commons Attribution Share Alike 4.0 International}
}
```