Optimization in High Dimensions via Accelerated, Parallel and Proximal Coordinate Descent

Olivier Fercoq; Peter Richtárik

Book chapter, Year: 2016


Olivier Fercoq

Role: Author
PersonId: 178780
IdHAL: olivier-fercoq
ORCID: 0000-0002-3393-9757
IdRef: 232918333

Signal, Statistique et Apprentissage

Département Traitement du Signal et des Images

Peter Richtárik

Role: Author

School of Mathematics - University of Edinburgh

Abstract

We propose a new randomized coordinate descent method for minimizing the sum of convex functions, each of which depends on only a small number of coordinates. Our method (APPROX) is simultaneously Accelerated, Parallel and PROXimal; this is the first time such a method has been proposed. In the special case when the number of processors equals the number of coordinates, the method converges at the rate $2\bar{\omega}\bar{L} R^2/(k+1)^2$, where $k$ is the iteration counter, $\bar{\omega}$ is a data-weighted average degree of separability of the loss function, $\bar{L}$ is the average of the Lipschitz constants associated with the coordinates and the individual functions in the sum, and $R$ is the distance of the initial point from the minimizer. We show that the method can be implemented without performing full-dimensional vector operations, which are the major bottleneck of accelerated coordinate descent and render it impractical. The fact that the method depends on the average degree of separability, and not on the maximum degree, can be attributed to the use of new safe large stepsizes, which lead to an improved expected separable overapproximation (ESO). These stepsizes are of independent interest and can be used in all existing parallel randomized coordinate descent algorithms based on the concept of ESO. In special cases, our method recovers several classical and recent algorithms, such as simple and accelerated proximal gradient descent, as well as serial, parallel and distributed versions of randomized block coordinate descent. Due to this flexibility, APPROX has been used successfully by the authors in a graduate class setting as a modern introduction to deterministic and randomized proximal gradient methods. Our bounds match or improve on the best known bounds for each of the methods APPROX specializes to. Our method has applications in a number of areas, including machine learning, submodular optimization, and linear and semidefinite programming.
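
To make the simplest special case mentioned in the abstract concrete, here is a minimal sketch of serial (one coordinate per iteration), non-accelerated randomized proximal coordinate descent. It is not the APPROX method itself: acceleration, parallel sampling and the ESO-based stepsizes are not specified in the abstract and are left out. The test problem (a lasso objective $\frac{1}{2}\|Ax-b\|^2 + \lambda\|x\|_1$), the function names and the parameters are assumptions chosen purely for illustration.

```python
import random


def soft_threshold(u, t):
    """Proximal operator of t * |.| evaluated at u."""
    if u > t:
        return u - t
    if u < -t:
        return u + t
    return 0.0


def lasso_objective(A, b, x, lam):
    """Value of 0.5 * ||A x - b||^2 + lam * ||x||_1 (illustrative objective)."""
    residual = [sum(a * v for a, v in zip(row, x)) - b_j for row, b_j in zip(A, b)]
    return 0.5 * sum(r * r for r in residual) + lam * sum(abs(v) for v in x)


def prox_coordinate_descent(A, b, lam, iters, seed=0):
    """Serial randomized proximal coordinate descent (hypothetical helper).

    A is a list of rows, b a list of targets. Each iteration touches a
    single coordinate and updates the residual A x - b incrementally, so
    no operation on the full vector x is needed.
    """
    rng = random.Random(seed)
    m, n = len(A), len(A[0])
    # Coordinate-wise Lipschitz constants: squared norms of the columns of A.
    lipschitz = [sum(A[j][i] ** 2 for j in range(m)) for i in range(n)]
    x = [0.0] * n
    residual = [-b_j for b_j in b]  # equals A x - b at x = 0
    for _ in range(iters):
        i = rng.randrange(n)  # uniform random coordinate
        if lipschitz[i] == 0.0:
            continue  # the smooth part does not depend on coordinate i
        grad_i = sum(A[j][i] * residual[j] for j in range(m))
        new_xi = soft_threshold(x[i] - grad_i / lipschitz[i], lam / lipschitz[i])
        delta = new_xi - x[i]
        if delta != 0.0:
            for j in range(m):
                residual[j] += A[j][i] * delta
            x[i] = new_xi
    return x
```

Each step minimizes a one-dimensional quadratic upper bound of the smooth part plus the $\ell_1$ term, so the objective cannot increase from one iteration to the next. Comparing `lasso_objective` at the returned point and at the zero vector is a quick sanity check.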

Domains

Optimization and Control [math.OC]


https://telecom-paris.hal.science/hal-02287359

Submitted on: Friday, September 13, 2019, 16:53:59

Last modified on: Thursday, March 14, 2024, 03:12:55

Dates and versions

hal-02287359, version 1 (13-09-2019)

Identifiers

HAL Id: hal-02287359, version 1

Cite

Olivier Fercoq, Peter Richtárik. Optimization in High Dimensions via Accelerated, Parallel and Proximal Coordinate Descent. SIAM Review, SIAM, 2016. ⟨hal-02287359⟩


Collections

INSTITUT-TELECOM CNRS PARISTECH TDS-MACS UNIV-PARIS-SACLAY LTCI IDS S2A

