State-of-the-Art SVD for Big Data

Andreas Stathopoulos, College of William and Mary SCAIM Seminar
February 25, 2019 12:30 pm ESB 4133

The singular value decomposition (SVD) is one of the core computations in today's scientific applications and data-analysis tools. The main goal is to compute a compact representation of a high-dimensional operator, matrix, or data set that best captures the most important features of the original. The SVD is therefore widely used in scientific computing and machine learning, including low-rank factorizations, graph learning, unsupervised learning, and the compression and analysis of images and text.
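As a concrete illustration of "compact representation," the sketch below (my own minimal example, not from the talk) builds the best rank-k approximation of a matrix by truncating its SVD; by the Eckart–Young theorem, the spectral-norm error of that approximation equals the (k+1)-th singular value.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((100, 50))

# Full (thin) SVD: A = U @ diag(s) @ Vt, singular values in descending order
U, s, Vt = np.linalg.svd(A, full_matrices=False)

# Rank-k truncation keeps only the k largest singular triplets
k = 10
A_k = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]

# Eckart-Young: A_k is the best rank-k approximation in the spectral norm,
# and the approximation error equals the next singular value s[k]
err = np.linalg.norm(A - A_k, 2)
```

Storing `A_k` in factored form takes k(m + n + 1) numbers instead of mn, which is the sense in which the SVD compresses the data.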

The popularity of the SVD has resulted in an increased diversity of methods and implementations that exploit specific features of the input data (e.g., dense or sparse matrix, data distributed among computing devices, data available through queries or batch access, spectral decay) and certain constraints on the computed solution (e.g., how many singular values and singular vectors are computed, the targeted part of the spectrum, the required accuracy). Choosing the proper method and customizing its settings can significantly reduce the cost.

In this talk, we'll overview the most relevant methods in terms of computational cost and accuracy (direct, iterative, and online methods), including the most recent advances in randomized and online SVD solvers. We identify which parameters have the biggest impact on the computational cost and the quality of the solution, and offer some intuition for tuning them. Finally, we discuss the current state of the software on widely used platforms (MATLAB, Python's numpy/scipy, and R), as well as high-performance solvers with support for multicore, GPU, and distributed memory.
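To give a flavor of the randomized solvers mentioned above, here is a minimal numpy sketch of the basic randomized SVD (range finder plus small dense SVD, in the style of Halko, Martinsson, and Tropp); the function name and the `oversample`/`n_iter` parameters are my own illustrative choices, and these two parameters are exactly the kind of tuning knobs the talk refers to.

```python
import numpy as np

def randomized_svd(A, k, oversample=10, n_iter=2, seed=0):
    """Sketch of a basic randomized truncated SVD of A (top-k triplets)."""
    rng = np.random.default_rng(seed)
    m, n = A.shape
    # Sample the range of A with a Gaussian test matrix, with a little
    # oversampling to improve the captured subspace
    Omega = rng.standard_normal((n, k + oversample))
    Y = A @ Omega
    # Power iterations sharpen the subspace when the spectrum decays slowly
    for _ in range(n_iter):
        Y = A @ (A.T @ Y)
    Q, _ = np.linalg.qr(Y)
    # Project A onto the captured subspace and take a small dense SVD
    B = Q.T @ A
    Ub, s, Vt = np.linalg.svd(B, full_matrices=False)
    U = Q @ Ub
    return U[:, :k], s[:k], Vt[:k, :]
```

The cost is dominated by a few passes over A, which is why such methods scale well to the big-data and distributed settings discussed in the talk.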