Spectral methods are a simple yet effective approach for extracting information from high-dimensional data. In the context of inference from a generalized linear model (GLM), they are often used to obtain an initial estimate that can also serve as a “warm start” for other algorithms. Specifically, in a GLM the goal is to estimate a d-dimensional signal x from an n-dimensional observation of the form f(Ax, w), where A is a design matrix and w is a noise vector. Here, the spectral estimator is the principal eigenvector of a data-dependent matrix, whose spectrum exhibits a phase transition.
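As a rough illustration of the spectral estimator described above, the following NumPy sketch builds a data-dependent matrix of the form D = (1/n) Aᵀ diag(T(y)) A and takes its principal eigenvector. Everything specific here is an assumption made for the example and is not taken from the talk: the noiseless phase-retrieval observation y = (Ax)², the trimming preprocessing T(y) = min(y, 5), the problem sizes, and the function name `spectral_estimator`.

```python
import numpy as np


def spectral_estimator(A, y, preprocess):
    """Principal eigenvector of D = (1/n) * A^T diag(T(y)) A.

    A          : (n, d) design matrix
    y          : (n,) observations
    preprocess : function T applied entrywise to y (an illustrative choice)
    """
    n = A.shape[0]
    t = preprocess(y)
    # Scale each row a_i of A by T(y_i), then form the d x d matrix D.
    D = (A * t[:, None]).T @ A / n
    # eigh returns eigenvalues in ascending order with orthonormal eigenvectors.
    eigvals, eigvecs = np.linalg.eigh(D)
    return eigvecs[:, -1], eigvals


# Toy instance (assumed setup): i.i.d. Gaussian design, unit-norm signal,
# noiseless phase-retrieval observations y_i = <a_i, x>^2.
rng = np.random.default_rng(0)
d, n = 50, 2000
x = rng.standard_normal(d)
x /= np.linalg.norm(x)
A = rng.standard_normal((n, d))
y = (A @ x) ** 2

# Trimming T(y) = min(y, 5) is a hypothetical preprocessing for this sketch.
x_hat, eigvals = spectral_estimator(A, y, lambda v: np.minimum(v, 5.0))

# The sign of x is not identifiable from y, so compare up to sign.
overlap = abs(x_hat @ x)
print(f"top two eigenvalues: {eigvals[-1]:.3f}, {eigvals[-2]:.3f}")
print(f"overlap |<x_hat, x>|: {overlap:.3f}")
```

With n much larger than d, as in this toy instance, the top eigenvalue separates from the rest of the spectrum and the overlap is far from zero. Decreasing n/d in the script gives an informal look at the phase transition mentioned above.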
I will start by discussing the emergence of this phase transition for an i.i.d. Gaussian design A, and then show how combining spectral methods with Approximate Message Passing (AMP) algorithms solves a problem related to the initialization of AMP. I will then focus on GLMs with a correlated Gaussian design, which are widely adopted in high-dimensional regression. To characterize spectral estimators in this challenging setup, I will propose a novel approach based on AMP: this allows us to systematically characterize key spectral properties, such as the location of outlier eigenvalues and the overlap between top eigenvectors and unknown informative components. I will conclude by showing the generality of this technique via an application to matrix denoising with doubly heteroscedastic noise.
Refreshments will be served preceding the talk, beginning at 2:45.