geodesic convexity, S-Divergence, matrix square root, nonconvex optimization
This paper is triggered by the preprint [P. Jain, C. Jin, S.M. Kakade, and P. Netrapalli. Computing matrix squareroot via non convex local search. Preprint, arXiv:1507.05854, 2015.], which analyzes gradient-descent for computing the square root of a positive definite matrix. Contrary to claims of Jain et al., the author’s experiments reveal that Newton-like methods compute matrix square roots rapidly and reliably, even for highly ill-conditioned matrices and without requiring com-mutativity. The author observes that gradient-descent converges very slowly primarily due to tiny step-sizes and ill-conditioning. The paper derives an alternative first-order method based on geodesic convexity; this method admits a transparent convergence analysis (< 1 page), attains linear rate, and displays reliable convergence even for rank deficient problems. Though superior to gradient-descent, ultimately this method is also outperformed by a well-known scaled Newton method. Nevertheless, the primary value of the paper is conceptual: it shows that for deriving gradient based methods for the matrix square root, the manifold geometric view of positive definite matrices can be much more advantageous than the Euclidean view.
"On the Matrix Square Root via Geometric Optimization",
Electronic Journal of Linear Algebra,
Volume 31, pp. 433-443.