Exploiting geometric structure in matrix-valued optimization
Matrix-valued optimization tasks arise in many machine learning applications. Often, exploiting non-Euclidean structure in such problems can give rise to algorithms that are computationally superior to standard nonlinear programming approaches. In this talk, we consider the problem of optimizing a function on a (Riemannian) manifold subject to convex constraints. Several classical problems can be phrased as constrained optimization on matrix manifolds. This includes barycenter problems, as well as the computation of Brascamp-Lieb constants. The latter is of central importance in many areas of mathematics and computer science through connections to maximum likelihood estimators in Gaussian models, Tyler’s M-estimator of scatter matrices and operator scaling. We introduce Riemannian Frank-Wolfe methods, a class of first-order methods for solving constrained optimization problems on manifolds and present a global, non-asymptotic convergence analysis. We further discuss a class of CCCP-style algorithms for Riemannian “difference of convex” functions and explore connections to constrained optimization. We complement our discussion with applications to the two problems described above. Based on joint work with Suvrit Sra.