Exploiting data geometry in (matrix-valued) optimization
Many applications involve non-Euclidean data, and exploiting this geometric structure in optimization can deliver algorithms that are computationally superior to standard nonlinear programming approaches. This observation has led to growing interest in Riemannian methods in the optimization and machine learning communities. In this talk, we consider the problem of optimizing Riemannian “difference of convex” functions. Several classical optimization problems involve objective functions of this form, including matrix-valued tasks such as barycenter problems, determinantal point processes, and operator scaling. The latter is of central importance in many areas of Mathematics and Computer Science through connections to a wide range of classical tools, including maximum likelihood estimators in Gaussian models, Tyler’s M-estimator of scatter matrices, and Brascamp-Lieb constants, among others. We discuss a class of convex-concave procedures for optimizing Riemannian “difference of convex” functions, in which the geometric structure of the problem gives rise to an efficiently solvable fixed-point iteration. We present a detailed convergence analysis for the proposed algorithms and provide numerical evidence for their performance. This talk is based on joint work with Suvrit Sra.
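As a rough illustration of the kind of update involved (a sketch of the standard convex-concave procedure; the notation $g$, $h$, $\operatorname{grad}$, $\operatorname{Exp}$ is ours and not taken from the talk): for an objective $f = g - h$ with $g, h$ geodesically convex on a Riemannian manifold $\mathcal{M}$, one linearizes the concave part $-h$ at the current iterate along the manifold and solves the resulting geodesically convex subproblem,
\[
  x_{k+1} \in \operatorname*{argmin}_{x \in \mathcal{M}} \; g(x) - \big\langle \operatorname{grad} h(x_k), \operatorname{Exp}_{x_k}^{-1}(x) \big\rangle_{x_k},
\]
which in the Euclidean case reduces to the familiar update $x_{k+1} \in \operatorname{argmin}_{x} \, g(x) - \langle \nabla h(x_k), x \rangle$. For structured matrix problems of the type mentioned above, such subproblems may admit closed-form solutions, which is what leads to fixed-point iterations of the kind discussed in the talk.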