Skip to main page content
U.S. flag

An official website of the United States government

Dot gov

The .gov means it’s official.
Federal government websites often end in .gov or .mil. Before sharing sensitive information, make sure you’re on a federal government site.

Https

The site is secure.
The https:// ensures that you are connecting to the official website and that any information you provide is encrypted and transmitted securely.

Access keys NCBI Homepage MyNCBI Homepage Main Content Main Navigation
. 2020 Jun;48(3):1452-1474.
doi: 10.1214/19-aos1854. Epub 2020 Jul 17.

ENTRYWISE EIGENVECTOR ANALYSIS OF RANDOM MATRICES WITH LOW EXPECTED RANK

Affiliations

ENTRYWISE EIGENVECTOR ANALYSIS OF RANDOM MATRICES WITH LOW EXPECTED RANK

Emmanuel Abbe et al. Ann Stat. 2020 Jun.

Abstract

Recovering low-rank structures via eigenvector perturbation analysis is a common problem in statistical machine learning, such as in factor analysis, community detection, ranking, matrix completion, among others. While a large variety of bounds are available for average errors between empirical and population statistics of eigenvectors, few results are tight for entrywise analyses, which are critical for a number of problems such as community detection. This paper investigates entrywise behaviors of eigenvectors for a large class of random matrices whose expectations are low-rank, which helps settle the conjecture in Abbe et al. (2014b) that the spectral algorithm achieves exact recovery in the stochastic block model without any trimming or cleaning steps. The key is a first-order approximation of eigenvectors under the norm: u k A u k * λ k * , where {u k } and { u k * } are eigenvectors of a random matrix A and its expectation E A , respectively. The fact that the approximation is both tight and linear in A facilitates sharp comparisons between u k and u k * . In particular, it allows for comparing the signs of u k and u k * even if u k - u k * is large. The results are further extended to perturbations of eigenspaces, yielding new -type bounds for synchronization ( 2 -spiked Wigner model) and noisy matrix completion.

Keywords: 62H12; Primary 62H25; community detection; eigenvector perturbation; low-rank structures; matrix completion; random matrices; secondary 60B20; spectral analysis; synchronization.

PubMed Disclaimer

Figures

Fig 1:
Fig 1:
The second eigenvector and its first-order approximation in SBM. Left: The histogram of coordinates of nu2 computed from a single realization of adjacency matrix A, where n is 5000, a is 4.5 and b is 0.25. Exact recovery is expected as coordinates form two well-separated clusters. Right: boxplots showing three different distance/errors (up to sign) over 100 realizations: (i) nu2u2*, (ii) nAu2*/λ2*u2*, (iii) nu2Au2*/λ2*Au2*/λ2* is a good approximation of u2 under norm even though u2u2* may be large.
Fig 2:
Fig 2:
Typical choices of φ for Gaussian noise and Bernoulli noise.
Fig 3:
Fig 3:
Error decay in power iterations. The larger and smaller squares represent balls centered at u¯, with radii u0u¯ and u1u¯, respectively.
Fig 4:
Fig 4:
Eigen-gap Δ*
Fig 5:
Fig 5:
Phase transition of 2-synchronization: the x-axis is the dimension n, and the y-axis is σ. Lighter pixels refer to higher proportions of runs that x^ recovers x. The red curve shows the theoretical boundary σ=n2logn.
Fig 6:
Fig 6:
Vanilla spectral method for SBM. Left: phase transition of exact recovery. The x-axis is b, the y-axis is a, and lighter pixels represent higher chances of success. Two red curves ab=±2 represent theoretical boundaries for phase transtion, matched by numerical results. Right: mean misclassification rates on the logarithmic scale with b = 2. The x-axis is a, varying from 2 to 8, and the y-axis is logEr(z^,z)/logn. No marker: theoretical curve; circles: n = 5000; crosses: n = 500; squares: n = 100.
Fig 7:
Fig 7:
Rmat and Rvec in matrix completion from noisy entries. The x-axis is n, varying from 500 to 5000 by 500, and the y-axis is the ratio. Crosses and circles stand for Rmat and Rvec, respectively.

References

    1. Abbe E (2017). Community detection and stochastic block models: recent developments. arXiv preprint arXiv:1703.10146.
    1. Abbe E, Bandeira AS, Bracher A and Singer A (2014a). Decoding binary node labels from censored edge measurements: Phase transition and efficient recovery. IEEE Transactions on Network Science and Engineering 1 10–22.
    1. Abbe E, Bandeira AS and Hall G (2014b). Exact recovery in the stochastic block model. arXiv preprint arXiv:1405.3267.
    1. Abbe E, Bandeira AS and Hall G (2016). Exact recovery in the stochastic block model. IEEE Transactions on Information Theory 62 471–487.
    1. Abbe E, Fan J, Wang K and Zhong Y (2018). Supplement to “entrywise eigenvector analysis of random matrices with low expected rank”. Submitted to the Annals of Statistics. - PMC - PubMed

LinkOut - more resources