Z. Allen-Zhu, Katyusha: The First Direct Acceleration of Stochastic Gradient Methods, Proceedings of the 49th Annual ACM SIGACT Symposium on Theory of Computing (STOC), pp.1200-1205, 2017.

Z. Allen-Zhu and E. Hazan, Variance Reduction for Faster Non-Convex Optimization, Proceedings of The 33rd International Conference on Machine Learning, vol.48, pp.699-707, 2016.

A. Asuncion and D. Newman, UCI machine learning repository, 2007.

C. Chang and C. Lin, LIBSVM: A library for support vector machines, ACM Transactions on Intelligent Systems and Technology (TIST), vol.2, p.27, 2011.

A. Defazio, F. Bach, and S. Lacoste-Julien, SAGA: A Fast Incremental Gradient Method With Support for Non-Strongly Convex Composite Objectives, Advances in Neural Information Processing Systems 27, pp.1646-1654, 2014.
URL : https://hal.archives-ouvertes.fr/hal-01016843

N. Gazagnadou, R. M. Gower, and J. Salmon, Optimal mini-batch and step sizes for SAGA, Proceedings of the 36th International Conference on Machine Learning, 2019.
URL : https://hal.archives-ouvertes.fr/hal-02005431

R. M. Gower, N. Loizou, X. Qian, A. Sailanbayev, E. Shulgin et al., SGD: General Analysis and Improved Rates, Proceedings of the 36th International Conference on Machine Learning, 2019.
URL : https://hal.archives-ouvertes.fr/hal-02365318

R. M. Gower, P. Richtárik, and F. Bach, Stochastic Quasi-Gradient methods: Variance Reduction via Jacobian Sketching, 2018.

R. Harikandeh, M. O. Ahmed, A. Virani, M. Schmidt, J. Konečný et al., Stop Wasting My Gradients: Practical SVRG, Advances in Neural Information Processing Systems 28, pp.2251-2259, 2015.

S. Horváth and P. Richtárik, Nonconvex Variance Reduced Optimization with Arbitrary Sampling, Proceedings of the 36th International Conference on Machine Learning, 2019.

R. Johnson and T. Zhang, Accelerating stochastic gradient descent using predictive variance reduction, Advances in Neural Information Processing Systems, pp.315-323, 2013.

J. Konečný and P. Richtárik, Semi-stochastic gradient descent methods, Frontiers in Applied Mathematics and Statistics, vol.3, p.9, 2017.

J. Konečný, J. Liu, P. Richtárik, and M. Takáč, Mini-Batch Semi-Stochastic Gradient Descent in the Proximal Setting, IEEE Journal of Selected Topics in Signal Processing, vol.10, no.2, pp.242-255, 2016.

D. Kovalev, S. Horváth, and P. Richtárik, Don't Jump Through Hoops and Remove Those Loops: SVRG and Katyusha are Better Without the Outer Loop, 2019.

T. Murata and T. Suzuki, Doubly Accelerated Stochastic Variance Reduced Dual Averaging Method for Regularized Empirical Risk Minimization, Proceedings of the 31st International Conference on Neural Information Processing Systems. NIPS'17, pp.608-617, 2017.

Y. Nesterov, Introductory lectures on convex optimization: A basic course, vol.87, 2013.

Y. Nesterov and J. Vial, Confidence level solutions for stochastic programming, Automatica, vol.44, pp.1559-1568, 2008.

L. M. Nguyen, J. Liu, K. Scheinberg, and M. Takáč, SARAH: A Novel Method for Machine Learning Problems Using Stochastic Recursive Gradient, Proceedings of the 34th International Conference on Machine Learning, vol.70, pp.2613-2621, 2017.

A. Nitanda, Stochastic Proximal Gradient Descent with Acceleration Techniques, Advances in Neural Information Processing Systems 27, pp.1574-1582, 2014.

F. Pedregosa, G. Varoquaux, A. Gramfort, V. Michel, B. Thirion et al., Scikit-learn: Machine Learning in Python, Journal of Machine Learning Research, vol.12, pp.2825-2830, 2011.
URL : https://hal.archives-ouvertes.fr/hal-00650905

S. J. Reddi, A. Hefny, S. Sra, B. Póczos, and A. J. Smola, Stochastic Variance Reduction for Nonconvex Optimization, Proceedings of The 33rd International Conference on Machine Learning, vol.48, pp.314-323, 2016.

H. Robbins and D. Siegmund, A convergence theorem for non negative almost supermartingales and some applications, pp.111-135, 1985.

N. L. Roux, M. Schmidt, and F. R. Bach, A Stochastic Gradient Method with an Exponential Convergence Rate for Finite Training Sets, Advances in Neural Information Processing Systems 25, pp.2663-2671, 2012.
URL : https://hal.archives-ouvertes.fr/hal-00674995

S. Shalev-Shwartz and T. Zhang, Stochastic dual coordinate ascent methods for regularized loss minimization, Journal of Machine Learning Research, vol.14, pp.567-599, 2013.