J Multivar Anal. 2019 Jan:169:278-299.
doi: 10.1016/j.jmva.2018.09.011. Epub 2018 Oct 3.

Sparse quadratic classification rules via linear dimension reduction

Irina Gaynanova et al. J Multivar Anal. 2019 Jan.

Abstract

We consider the problem of high-dimensional classification between two groups with unequal covariance matrices. Rather than estimating the full quadratic discriminant rule, we propose to perform simultaneous variable selection and linear dimension reduction on the original data, and then apply quadratic discriminant analysis on the reduced space. In contrast to quadratic discriminant analysis, the proposed framework does not require the estimation of precision matrices; it scales linearly with the number of measurements, making it especially attractive for use on high-dimensional datasets. We support the methodology with theoretical guarantees on variable selection consistency and with empirical comparisons with competing approaches. We apply the method to gene expression data of breast cancer patients, and confirm the crucial importance of the ESR1 gene in differentiating estrogen receptor status.

Keywords: Convex optimization; Discriminant analysis; High-dimensional statistics; Variable selection.
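
The two-stage idea in the abstract (reduce the data linearly, then run quadratic discriminant analysis on the reduced space) can be illustrated with a minimal NumPy sketch. This is not the authors' estimator: the paper obtains sparse directions without estimating precision matrices, whereas the sketch below swaps in plain ridge-regularized linear solves, which is only sensible for small p. The choice of two directions (one per group covariance, loosely following the Figure 1 caption), the function names `fit_projected_qda` and `predict_projected_qda`, and the `ridge` parameter are all assumptions made for illustration.

```python
import numpy as np


def fit_projected_qda(X1, X2, ridge=1e-3):
    """Illustrative stand-in, NOT the paper's sparse estimator.

    Builds a 2-column projection matrix V (one direction per group
    covariance) and fits ordinary QDA on the projected data.
    """
    n1, n2 = len(X1), len(X2)
    p = X1.shape[1]
    delta = X1.mean(axis=0) - X2.mean(axis=0)

    # Assumption: small p, so regularized sample covariances can be solved directly.
    S1 = np.cov(X1, rowvar=False) + ridge * np.eye(p)
    S2 = np.cov(X2, rowvar=False) + ridge * np.eye(p)
    V = np.column_stack([np.linalg.solve(S1, delta),
                         np.linalg.solve(S2, delta)])

    # QDA parameters (mean, covariance, log prior) in the reduced 2-D space.
    params = []
    for X, n in ((X1, n1), (X2, n2)):
        Z = X @ V
        C = np.cov(Z, rowvar=False) + ridge * np.eye(2)
        params.append((Z.mean(axis=0), C, np.log(n / (n1 + n2))))
    return V, params


def predict_projected_qda(model, X):
    """Return 0 for the first group and 1 for the second group."""
    V, params = model
    Z = X @ V
    scores = []
    for mean, C, log_prior in params:
        D = Z - mean
        maha = np.einsum("ij,ij->i", D @ np.linalg.inv(C), D)
        _, logdet = np.linalg.slogdet(C)
        scores.append(-0.5 * maha - 0.5 * logdet + log_prior)
    return np.argmax(np.column_stack(scores), axis=1)
```

In this sketch only 2-by-2 covariance matrices are inverted at prediction time, which is the point of applying the quadratic rule after the reduction; the full-dimensional solves used to build `V` are the part the paper's method is designed to avoid.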


Figures

Figure 1:
Two-group classification problem with p = 2 and unequal covariance matrices. Left: Projection using Fisher’s discriminant vector. Middle: Projection using the covariance structure from the 1st group (circles). Right: Projection using the covariance structure from the 2nd group (triangles).
Figure 2:
Misclassification error rates over 100 replications. The horizontal lines show the median errors of the proposed method, DAP (discriminant analysis via projections). SLDA: Sparse linear discriminant analysis; SLOG: Sparse logistic regression with interactions; SQDA_LH: Sparse QDA of Le and Hastie [30]; SQDA_LS: Sparse QDA of Li and Shao [31]; SQDA_RF: Sparse QDA via ridge fusion; RDA: Regularized discriminant analysis.
Figure 3:
Number of selected variables over 100 replications. The horizontal lines indicate the median model sizes of the proposed method, DAP (discriminant analysis via projections). RDA, SQDA_RF and SQDA_LH use all p variables and are not shown. SLDA: Sparse linear discriminant analysis; SLOG: Sparse logistic regression with interactions; SQDA_LH: Sparse QDA of Le and Hastie [30]; SQDA_LS: Sparse QDA of Li and Shao [31]; SQDA_RF: Sparse QDA via ridge fusion; RDA: Regularized discriminant analysis.
Figure 4:
Left: Misclassification error rates over 100 splits. Right: Number of variables used in the corresponding classification rules. DAP consistently selects the smallest model. SQDA_LS, SQDA_LH and RDA always use all p = 1000 variables and are not shown. DAP: Discriminant analysis via projections (the proposed method); SQDA_LS: Sparse QDA of Li and Shao [31]; SQDA_LH: Sparse QDA of Le and Hastie [30]; SLDA: Sparse linear discriminant analysis; RDA: Regularized discriminant analysis.


References

    1. Bach FR, Consistency of the group Lasso and multiple kernel learning, J. Mach. Learn. Res. 9 (2008) 1179–1225.
    2. Barber RF, Drton M, Exact block-wise optimization in group lasso and sparse group lasso for linear regression, arXiv.org (2010).
    3. Boyd SP, Vandenberghe L, Convex Optimization, Cambridge Univ. Press, Cambridge, 2004.
    4. Breheny P, Huang J, Group descent algorithms for nonconvex penalized linear and logistic regression models with grouped predictors, Statistics and Computing 25 (2015) 173–187.
    5. Cai TT, Liu W, A direct estimation approach to sparse linear discriminant analysis, J. Amer. Statist. Assoc. 106 (2011) 1566–1577.
