A Selective Overview of Variable Selection in High Dimensional Feature Space

Jianqing Fan et al. Stat Sin. 2010 Jan;20(1):101-148.

Abstract

High dimensional statistical problems arise from diverse fields of scientific research and technological development. Variable selection plays a pivotal role in contemporary statistical learning and scientific discoveries. The traditional idea of best subset selection methods, which can be regarded as a specific form of penalized likelihood, is computationally too expensive for many modern statistical applications. Other forms of penalized likelihood methods have been successfully developed over the last decade to cope with high dimensionality. They have been widely applied for simultaneously selecting important variables and estimating their effects in high dimensional statistical inference. In this article, we present a brief account of recent developments in theory, methods, and implementations for high dimensional variable selection. Questions about the limits of dimensionality such methods can handle, the role of penalty functions, and their statistical properties rapidly drive the advances of the field. The properties of non-concave penalized likelihood and its roles in high dimensional statistical modeling are emphasized. We also review some recent advances in ultra-high dimensional variable selection, with emphasis on independence screening and two-scale methods.
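To make the penalized likelihood idea concrete for a Gaussian linear model, the following sketch fits an L1-penalized least squares (lasso) estimate by coordinate descent. It is a minimal illustration rather than the paper's implementation; the toy data, the tuning parameter lam, and the helper soft_threshold are assumptions introduced here.

import numpy as np

def soft_threshold(z, t):
    # Soft-thresholding operator: the closed-form minimizer of 0.5*(z - b)^2 + t*|b| over b.
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def lasso_cd(X, y, lam, n_iter=200):
    # Coordinate descent for (1/(2n)) * ||y - X b||^2 + lam * ||b||_1.
    n, p = X.shape
    b = np.zeros(p)
    col_sq = (X ** 2).mean(axis=0)      # per-coordinate curvature ||X_j||^2 / n
    r = y - X @ b
    for _ in range(n_iter):
        for j in range(p):
            r += X[:, j] * b[j]                         # remove j-th contribution
            rho = (X[:, j] @ r) / n
            b[j] = soft_threshold(rho, lam) / col_sq[j]
            r -= X[:, j] * b[j]                         # add back updated contribution
    return b

# Toy high dimensional example: n = 50 observations, p = 200 features,
# only the first 3 coefficients are nonzero (sparse truth).
rng = np.random.default_rng(0)
n, p = 50, 200
X = rng.standard_normal((n, p))
beta = np.zeros(p)
beta[:3] = [3.0, -2.0, 1.5]
y = X @ beta + 0.5 * rng.standard_normal(n)
b_hat = lasso_cd(X, y, lam=0.2)
print("selected variables:", np.flatnonzero(np.abs(b_hat) > 1e-8))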


Figures

Figure 1
Distributions (left panel) of the maximum absolute sample correlation coefficient max_{2≤j≤p} |corr(Z_1, Z_j)|, and distributions (right panel) of the maximum absolute multiple correlation coefficient of Z_1 with 5 other variables, max_{|S|=5} |corr(Z_1, Z_S^T β̂_S)|, where β̂_S is the regression coefficient vector of Z_1 regressed on Z_S, a subset of variables indexed by S and excluding Z_1, computed by the stepwise addition algorithm (the actual values are larger than what are presented here), when n = 50, p = 1000 (solid curve) and p = 10000 (dashed), based on 1000 simulations.
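The spurious-correlation phenomenon summarized in the left panel can be reproduced by a short Monte Carlo sketch, assuming the Z_j are independent standard normal variables; the number of replications is reduced from 1000 for speed, and the stepwise-addition quantity in the right panel is not computed here.

import numpy as np

# Monte Carlo sketch of the left-panel quantity in Figure 1: the maximum
# absolute sample correlation between Z_1 and the remaining p - 1 variables,
# assuming all Z_j are independent standard normal, so any large correlation
# is purely spurious.
rng = np.random.default_rng(1)
n = 50
n_sim = 200          # the caption uses 1000 replications; reduced here for speed

def max_abs_corr(p):
    out = np.empty(n_sim)
    for s in range(n_sim):
        Z = rng.standard_normal((n, p))
        Zc = (Z - Z.mean(axis=0)) / Z.std(axis=0)   # standardize each column
        corr = Zc[:, 1:].T @ Zc[:, 0] / n           # corr(Z_1, Z_j), j >= 2
        out[s] = np.abs(corr).max()
    return out

for p in (1000, 10000):
    vals = max_abs_corr(p)
    print(f"p = {p}: median max |corr(Z_1, Z_j)| = {np.median(vals):.3f}")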
Figure 2
Some commonly used penalty functions (left panel) and their derivatives (right panel). They correspond to the risk functions shown in the right panel of Figure 3. More precisely, λ = 2 for the hard-thresholding penalty, λ = 1.04 for the L1-penalty, λ = 1.02 for SCAD with a = 3.7, and λ = 1.49 for MCP with a = 2.
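For reference, a sketch of standard textbook forms of these four penalties (and the SCAD and MCP derivatives) as functions of t = |θ| ≥ 0 follows; the exact parameterizations are assumptions chosen to match common usage, with a and λ taken from the caption where given.

import numpy as np

def hard_penalty(t, lam):
    # Hard-thresholding penalty: lam^2 - (lam - t)_+^2.
    return lam**2 - np.maximum(lam - t, 0.0)**2

def l1_penalty(t, lam):
    return lam * t

def scad_penalty(t, lam, a=3.7):
    # SCAD: linear up to lam, quadratic up to a*lam, constant afterwards.
    small = t <= lam
    mid = (t > lam) & (t <= a * lam)
    return np.where(small, lam * t,
           np.where(mid, (2 * a * lam * t - t**2 - lam**2) / (2 * (a - 1)),
                    (a + 1) * lam**2 / 2))

def mcp_penalty(t, lam, a=2.0):
    # MCP: lam*t - t^2/(2a) up to a*lam, constant afterwards.
    return np.where(t <= a * lam, lam * t - t**2 / (2 * a), a * lam**2 / 2)

def scad_deriv(t, lam, a=3.7):
    # Derivative shown in the right panel: lam on [0, lam], then linearly decaying to 0 at a*lam.
    return lam * (t <= lam) + np.maximum(a * lam - t, 0.0) / (a - 1) * (t > lam)

def mcp_deriv(t, lam, a=2.0):
    return np.maximum(lam - t / a, 0.0)

print(scad_penalty(np.array([0.5, 2.0, 5.0]), lam=1.02))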
Figure 3
The risk functions for penalized least squares under the Gaussian model for the hard-thresholding penalty, L1-penalty, SCAD (a = 3.7), and MCP (a = 2). The left panel corresponds to λ = 1 and the right panel to λ = 2 for the hard-thresholding estimator, and the remaining parameters are chosen so that the risks coincide at θ = 3.
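Risk curves of this kind can be approximated by Monte Carlo: draw z ~ N(θ, 1), apply the corresponding thresholding rule, and average the squared error. The sketch below does this only for the hard-thresholding and L1 (soft-thresholding) rules, whose closed forms are standard; using λ = 2 and λ = 1.04 as in the right panel is an assumption carried over from the Figure 2 caption.

import numpy as np

# Monte Carlo sketch of the risk R(theta) = E[(theta_hat(z) - theta)^2] for
# z ~ N(theta, 1), where theta_hat is the closed-form penalized least squares
# solution for the hard-thresholding and L1 penalties.
rng = np.random.default_rng(2)

def hard_threshold(z, lam):
    return z * (np.abs(z) > lam)

def soft_threshold(z, lam):
    return np.sign(z) * np.maximum(np.abs(z) - lam, 0.0)

def mc_risk(rule, theta, lam, n_mc=100_000):
    z = theta + rng.standard_normal(n_mc)
    return np.mean((rule(z, lam) - theta) ** 2)

for theta in (0.0, 1.0, 3.0):
    print(theta,
          round(mc_risk(hard_threshold, theta, lam=2.0), 3),
          round(mc_risk(soft_threshold, theta, lam=1.04), 3))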
Figure 4
The local linear (dashed) and local quadratic (dotted) approximations to the SCAD function (solid) with λ = 2 and a = 3.7 at a given point |θ| = 4.
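A small sketch of the local linear (LLA) and local quadratic (LQA) approximations at |θ0| = 4, with λ = 2 and a = 3.7, is given below; the SCAD formulas are restated in a common textbook form and should be read as an illustration rather than the paper's exact expressions.

import numpy as np

# LLA and LQA approximations to the SCAD penalty around a current value theta0,
# as depicted in Figure 4 (lam = 2, a = 3.7, |theta0| = 4).
lam, a, theta0 = 2.0, 3.7, 4.0

def scad(t):
    t = np.abs(t)
    return np.where(t <= lam, lam * t,
           np.where(t <= a * lam, (2 * a * lam * t - t**2 - lam**2) / (2 * (a - 1)),
                    (a + 1) * lam**2 / 2))

def scad_deriv(t):
    t = np.abs(t)
    return lam * (t <= lam) + np.maximum(a * lam - t, 0.0) / (a - 1) * (t > lam)

def lla(theta):
    # Local linear: p(|theta|) ≈ p(|theta0|) + p'(|theta0|) (|theta| - |theta0|)
    return scad(theta0) + scad_deriv(theta0) * (np.abs(theta) - theta0)

def lqa(theta):
    # Local quadratic: p(|theta|) ≈ p(|theta0|) + 0.5 * p'(|theta0|)/|theta0| * (theta^2 - theta0^2)
    return scad(theta0) + 0.5 * scad_deriv(theta0) / theta0 * (theta**2 - theta0**2)

theta = np.linspace(-8, 8, 5)
print(np.round(scad(theta), 3), np.round(lla(theta), 3), np.round(lqa(theta), 3))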
Figure 5
Illustration of the ultra-high dimensional variable selection scheme. A large-scale screening step is first used to remove unimportant variables, and then a moderate-scale selection step is applied to further select important variables. At both steps, one can choose a favorite method.
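A hedged sketch of such a two-scale procedure is given below: marginal-correlation screening in the spirit of sure independence screening, followed by a lasso fit (via scikit-learn, one possible second-stage choice) on the surviving variables. The simulated data, the screening size d, and the penalty level alpha are illustrative assumptions.

import numpy as np
from sklearn.linear_model import Lasso  # the second-stage method is a free choice

# Two-scale sketch of the scheme in Figure 5: (1) large-scale screening by
# absolute marginal correlation, keeping the top d variables; (2) a
# moderate-scale penalized regression (here, lasso) on the survivors.
rng = np.random.default_rng(3)
n, p = 100, 5000
X = rng.standard_normal((n, p))
beta = np.zeros(p)
beta[:3] = [4.0, -3.0, 3.0]               # sparse truth
y = X @ beta + rng.standard_normal(n)

# Step 1: screening by absolute marginal correlation with the response.
Xc = (X - X.mean(axis=0)) / X.std(axis=0)
yc = (y - y.mean()) / y.std()
score = np.abs(Xc.T @ yc) / n
d = int(n / np.log(n))                    # a common choice of screening size
keep = np.argsort(score)[-d:]

# Step 2: variable selection on the screened set.
fit = Lasso(alpha=0.1).fit(X[:, keep], y)
selected = keep[np.flatnonzero(fit.coef_)]
print("screened down to", d, "variables; selected:", np.sort(selected))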
