SEGA: Variance Reduction via Gradient Sketching

Filip Hanzely, Konstantin Mishchenko, Peter Richtarik

Research output: Contribution to conference › Poster › peer-review

Abstract

We propose SEGA (SkEtched GrAdient), a randomized first-order optimization method which progressively, throughout its iterations, builds a variance-reduced estimate of the gradient from random linear measurements (sketches) of the gradient. In each iteration, SEGA updates the current estimate of the gradient through a sketch-and-project operation using the information provided by the latest sketch, and this is subsequently used to compute an unbiased estimate of the true gradient through a random relaxation procedure. This unbiased estimate is then used to perform a gradient step. Unlike standard subspace descent methods, such as coordinate descent, SEGA can be used for optimization problems with a non-separable proximal term. We provide a general convergence analysis and prove linear convergence for strongly convex objectives. In the special case of coordinate sketches, SEGA can be enhanced with various techniques such as importance sampling, minibatching and acceleration, and its rate is up to a small constant factor identical to the best-known rate of coordinate descent.
Original language: English
Number of pages: 12
Publication status: Accepted/In press - 5 Sep 2018
Event: Thirty-second Conference on Neural Information Processing Systems - Montreal, Canada
Duration: 3 Dec 2018 - 8 Dec 2018
https://nips.cc/

Conference

Conference: Thirty-second Conference on Neural Information Processing Systems
Abbreviated title: NIPS 2018
Country: Canada
City: Montreal
Period: 3/12/18 - 8/12/18
Internet address: https://nips.cc/

