Ann Stat. 2015;43(4):1498-1534.
doi: 10.1214/14-AOS1307.

QUADRO: A SUPERVISED DIMENSION REDUCTION METHOD VIA RAYLEIGH QUOTIENT OPTIMIZATION

Jianqing Fan et al. Ann Stat. 2015.

Abstract

We propose a novel Rayleigh-quotient-based sparse quadratic dimension reduction method, named QUADRO (Quadratic Dimension Reduction via Rayleigh Optimization), for analyzing high-dimensional data. In the linear setting, Rayleigh quotient optimization coincides with classification; under nonlinear settings, the two problems are very different. In this paper, we clarify this difference and show that Rayleigh quotient optimization may be of independent scientific interest. One major challenge of Rayleigh quotient optimization is that the variance of quadratic statistics involves all fourth cross-moments of the predictors, which are infeasible to compute in high-dimensional applications and may accumulate too many stochastic errors. This issue is resolved by considering a family of elliptical models. Moreover, for heavy-tailed distributions, robust estimates of mean vectors and covariance matrices are employed to guarantee uniform convergence in estimating non-polynomially many parameters, even though only fourth moments are assumed. Methodologically, QUADRO is based on elliptical models, which allow us to formulate Rayleigh quotient maximization as a convex optimization problem. Computationally, we propose an efficient linearized augmented Lagrangian method to solve the constrained optimization problem. Theoretically, we provide explicit rates of convergence in terms of the Rayleigh quotient under both Gaussian and general elliptical models. Thorough numerical results on both synthetic and real datasets are also provided to support our theoretical results.

Keywords: Classification; Rayleigh quotient; dimension reduction; oracle inequality; quadratic discriminant analysis.
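
To make the central quantity of the abstract concrete, the following is a minimal sketch of an empirical Rayleigh quotient for a quadratic statistic. It is not the QUADRO estimator: the abstract does not give formulas, so the form Q(x) = x'Ωx − 2δ'x, the between-class over within-class variance definition, the class weight `pi`, and the function names `quadratic_stat` and `rayleigh_quotient` are all assumptions made for illustration. The sketch uses plain sample moments and none of the sparse, robust, or convex-optimization machinery the abstract describes.

```python
import numpy as np


def quadratic_stat(X, Omega, delta):
    """Evaluate Q(x) = x' Omega x - 2 delta' x for every row x of X.

    The quadratic form is an assumed parameterization, not taken from the abstract.
    """
    X = np.asarray(X, dtype=float)
    quad = np.einsum("ij,jk,ik->i", X, Omega, X)  # row-wise x' Omega x
    return quad - 2.0 * X @ delta


def rayleigh_quotient(f0, f1, pi=0.5):
    """Squared difference of class means over the weighted within-class variance.

    f0, f1 : values of the statistic in class 0 and class 1.
    pi     : weight given to class 0 (hypothetical parameter).
    """
    f0 = np.asarray(f0, dtype=float)
    f1 = np.asarray(f1, dtype=float)
    between = (f0.mean() - f1.mean()) ** 2
    within = pi * f0.var(ddof=1) + (1.0 - pi) * f1.var(ddof=1)
    return between / within


if __name__ == "__main__":
    # Two Gaussian classes with equal means but different scales:
    # no linear statistic separates them, a quadratic one does.
    rng = np.random.default_rng(0)
    X0 = rng.normal(scale=1.0, size=(500, 2))
    X1 = rng.normal(scale=3.0, size=(500, 2))
    Omega, delta = np.eye(2), np.zeros(2)
    rq = rayleigh_quotient(quadratic_stat(X0, Omega, delta),
                           quadratic_stat(X1, Omega, delta))
    print(f"empirical Rayleigh quotient: {rq:.3f}")
```

The sample variances in `rayleigh_quotient` implicitly depend on fourth moments of the predictors, which is the difficulty the abstract says the elliptical-model assumption is meant to avoid in high dimensions.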

Figures

Fig. 1
An example in ℝ². Green and purple represent class 1 and class 2, respectively. The ellipses are contours of the two class distributions. The probability densities of the projections onto X1 and X2 are also displayed. The dotted lines mark the optimal classification thresholds for each feature used alone.
Fig. 2
Function H(x) = Φ̄(1/x).
Fig. 3
Distributions of minimum classification error based on 100 replications for four different normal models. The tuning parameters for QUADRO, SLR and L-SLR are chosen to minimize the classification errors of 4000 testing samples. See Fan et al. (2014) for detailed numerical tables.
Fig. 4
Distributions of minimum classification error based on 100 replications across different elliptical distribution models. The tuning parameters for QUADRO, SLR and L-SLR are chosen to minimize the classification errors. See Fan et al. (2014) for detailed numerical tables.
Fig. 5
Overall KEGG enrichment chart, using (a) QUADRO; (b) SLR.
