QUADRO: A SUPERVISED DIMENSION REDUCTION METHOD VIA RAYLEIGH QUOTIENT OPTIMIZATION

Jianqing Fan¹, Zheng Tracy Ke², Han Liu¹, Lucy Xia¹

PMID: 26778864
PMCID: PMC4712455
DOI: 10.1214/14-AOS1307

Jianqing Fan et al. Ann Stat. 2015;43(4):1498-1534. doi: 10.1214/14-AOS1307.

Affiliations

¹ Princeton University.
² University of Chicago.

Abstract

We propose a novel Rayleigh quotient based sparse quadratic dimension reduction method, named QUADRO (Quadratic Dimension Reduction via Rayleigh Optimization), for analyzing high-dimensional data. Unlike in the linear setting, where Rayleigh quotient optimization coincides with classification, these two problems are very different under nonlinear settings. In this paper, we clarify this difference and show that Rayleigh quotient optimization may be of independent scientific interest. One major challenge of Rayleigh quotient optimization is that the variance of quadratic statistics involves all fourth cross-moments of predictors, which are infeasible to compute for high-dimensional applications and may accumulate too many stochastic errors. This issue is resolved by considering a family of elliptical models. Moreover, for heavy-tailed distributions, robust estimates of mean vectors and covariance matrices are employed to guarantee uniform convergence in estimating non-polynomially many parameters, even though only the fourth moments are assumed. Methodologically, QUADRO is based on elliptical models, which allow us to formulate the Rayleigh quotient maximization as a convex optimization problem. Computationally, we propose an efficient linearized augmented Lagrangian method to solve the constrained optimization problem. Theoretically, we provide explicit rates of convergence in terms of Rayleigh quotient under both Gaussian and general elliptical models. Thorough numerical results on both synthetic and real datasets are also provided to back up our theoretical results.

Keywords: Classification; Rayleigh quotient; dimension reduction; oracle inequality; quadratic discriminant analysis.
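
To make the abstract's central quantity concrete, the following is a minimal sketch of an empirical Rayleigh quotient for a quadratic statistic. It assumes the Rayleigh quotient of a function f is the between-class variance of f(X) divided by its within-class variance, and that the quadratic statistic has the form f(x) = xᵀΩx − 2δᵀx; neither formula is stated in this record, so both are assumptions. The names `rayleigh_quotient`, `quadratic_statistic`, and `demo_equal_means` are hypothetical, and the sketch does not implement QUADRO's sparse convex estimation or its robust moment estimates.

```python
import numpy as np


def rayleigh_quotient(f_values, labels):
    """Empirical between-class variance of f divided by its within-class variance.

    Assumed definition: var(E[f(X) | Y]) / var(f(X) - E[f(X) | Y]).
    """
    f_values = np.asarray(f_values, dtype=float)
    classes, inverse = np.unique(np.asarray(labels), return_inverse=True)
    # Mean of f within each class, then broadcast back to one value per sample.
    class_means = np.array(
        [f_values[inverse == k].mean() for k in range(len(classes))]
    )
    conditional_mean = class_means[inverse]
    between = conditional_mean.var()
    within = (f_values - conditional_mean).var()
    return between / within


def quadratic_statistic(X, omega, delta):
    """Evaluate the assumed form f(x) = x' omega x - 2 delta' x for each row of X."""
    X = np.asarray(X, dtype=float)
    return np.einsum("ij,jk,ik->i", X, omega, X) - 2.0 * X @ delta


def demo_equal_means(n=2000, seed=0):
    """Two classes with equal means but different scales (illustrative data only)."""
    rng = np.random.default_rng(seed)
    X = np.vstack([rng.normal(0.0, 1.0, (n, 2)), rng.normal(0.0, 2.0, (n, 2))])
    y = np.repeat([0, 1], n)
    linear_rq = rayleigh_quotient(X[:, 0], y)
    quadratic_rq = rayleigh_quotient(
        quadratic_statistic(X, np.eye(2), np.zeros(2)), y
    )
    return linear_rq, quadratic_rq


if __name__ == "__main__":
    linear_rq, quadratic_rq = demo_equal_means()
    print(f"linear Rq: {linear_rq:.4f}, quadratic Rq: {quadratic_rq:.4f}")
```

In this illustrative setup the two classes differ only in spread, so a linear projection separates them poorly while a quadratic statistic yields a clearly larger quotient, which is the kind of situation a quadratic method targets.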

Figures

**Fig. 1**
An example in ℝ². Green and purple represent class 1 and class 2, respectively. The ellipses are contours of the distributions. The probability densities after projection onto X₁ and X₂ are also displayed. The dotted lines correspond to the optimal classification thresholds when each feature is used alone.

**Fig. 2**
Function $H(x) = \bar{\Phi}(1/\sqrt{x})$.

**Fig. 3**
Distributions of minimum classification error based on 100 replications for four different normal models. The tuning parameters for QUADRO, SLR and L-SLR are chosen to minimize the classification errors of 4000 testing samples. See Fan et al. (2014) for detailed numerical tables.

**Fig. 4**
Distributions of minimum classification error based on 100 replications across different elliptical distribution models. The tuning parameters for QUADRO, SLR and L-SLR are chosen to minimize the classification errors. See Fan et al. (2014) for detailed numerical tables.

**Fig. 5**
Overall KEGG enrichment chart, using (a) QUADRO; (b) SLR.
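
The function plotted in Fig. 2 can be evaluated numerically with a short sketch. It assumes that $\bar{\Phi}$ denotes the standard normal upper-tail probability, $\bar{\Phi}(z) = 1 - \Phi(z)$; this record does not define the symbol, so that reading is an assumption. The helper name `normal_upper_tail` is hypothetical.

```python
import math


def normal_upper_tail(z):
    """Standard normal upper-tail probability, 1 - Phi(z) (assumed meaning of Phi-bar)."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def H(x):
    """H(x) = Phi-bar(1 / sqrt(x)) for x > 0, as in the Fig. 2 caption."""
    if x <= 0:
        raise ValueError("H is only defined here for x > 0")
    return normal_upper_tail(1.0 / math.sqrt(x))


if __name__ == "__main__":
    for x in (0.25, 1.0, 4.0, 100.0):
        print(f"H({x}) = {H(x):.6f}")
```

Under that assumption, H is increasing in x and stays below 1/2, since 1/√x is positive for every x > 0.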

Cited by

LARGE COVARIANCE ESTIMATION THROUGH ELLIPTICAL FACTOR MODELS.
Fan J, Liu H, Wang W. Ann Stat. 2018 Aug;46(4):1383-1414. doi: 10.1214/17-AOS1588. Epub 2018 Jun 27. PMID: 30214095. Free PMC article.
Predictive overfitting in immunological applications: Pitfalls and solutions.
Gygi JP, Kleinstein SH, Guan L. Hum Vaccin Immunother. 2023 Aug 1;19(2):2251830. doi: 10.1080/21645515.2023.2251830. PMID: 37697867. Free PMC article. Review.
Environmental factors influencing biological rhythms in newborns: From neonatal intensive care units to home.
Bueno C, Menna-Barreto L. Sleep Sci. 2016 Oct-Dec;9(4):295-300. doi: 10.1016/j.slsci.2016.10.004. Epub 2017 Jan 7. PMID: 28154744. Free PMC article.
Exploring patterns enriched in a dataset with contrastive principal component analysis.
Abid A, Zhang MJ, Bagaria VK, Zou J. Nat Commun. 2018 May 30;9(1):2134. doi: 10.1038/s41467-018-04608-8. PMID: 29849030. Free PMC article.
Multitask Quantile Regression under the Transnormal Model.
Fan J, Xue L, Zou H. J Am Stat Assoc. 2016;111(516):1726-1735. doi: 10.1080/01621459.2015.1113973. Epub 2017 Jan 5. PMID: 29097827. Free PMC article.
