Dual Space Preconditioning for Gradient Descent

Christopher Maddison, Daniel Paulin, Arnaud Doucet, Yee Whye The

Research output: Contribution to journalArticlepeer-review

Abstract

The conditions of relative smoothness and relative strong convexity were recently introduced for the analysis of Bregman gradient methods for convex optimization. We introduce a generalized left-preconditioning method for gradient descent and show that its convergence on an essentially smooth convex objective function can be guaranteed via an application of relative smoothness in the dual space. Our relative smoothness assumption is between the designed preconditioner and the convex conjugate of the objective, and it generalizes the typical Lipschitz gradient assumption. Under dual relative strong convexity, we obtain linear convergence with a generalized condition number that is invariant under horizontal translations, distinguishing it from Bregman gradient methods. Thus, in principle our method is capable of improving the conditioning of gradient descent on problems with a non-Lipschitz gradient or nonstrongly convex structure. We demonstrate our method on $p$-norm regression and exponential penalty function minimization.
Original languageEnglish
Pages (from-to)991-1016
Number of pages26
JournalSiam journal on optimization
Volume31
Issue number1
DOIs
Publication statusPublished - 30 Mar 2021

Fingerprint

Dive into the research topics of 'Dual Space Preconditioning for Gradient Descent'. Together they form a unique fingerprint.

Cite this