Exploiting geometric structure in (matrix-valued) optimization
Many applications involve non-Euclidean data and exploiting such geometric structure in optimization can deliver algorithms that are computationally superior to standard nonlinear programming approaches. This observation has resulted in an increasing interest in Riemannian methods in the optimization and machine learning communities. In this talk, we consider the problem of optimizing Riemannian “difference of convex” functions. Several classical optimization problems involve objective functions of this form, including matrix-valued tasks, such as barycenter problems, the computation of HPD matrix square roots, as well as the computation of Brascamp-Lieb constants. The latter is of central importance in many areas of Mathematics and Computer Science through connections to maximum likelihood estimators in Gaussian models, Tyler’s M-estimator of scatter matrices and Operator Scaling. We discuss a class of CCCP-style algorithms for solving such Riemannian “difference of convex” functions, where the geometric structure of the problem gives rise to an efficiently solvable fixed-point iteration. We present a detailed convergence analysis for the proposed algorithms and provide numerical evidence for their performance. This talk is based on joint work with Suvrit Sra.