Philos Trans A Math Phys Eng Sci. 2009 Nov 13;367(1906):4237-53.
doi: 10.1098/rsta.2009.0159.

Statistical challenges of high-dimensional data

Iain M Johnstone et al. Philos Trans A Math Phys Eng Sci.

Abstract

Modern applications of statistical theory and methods can involve extremely large datasets, often with huge numbers of measurements on each of a comparatively small number of experimental units. New methodology and accompanying theory have emerged in response: the goal of this Theme Issue is to illustrate a number of these recent developments. This overview article introduces the difficulties that arise with high-dimensional data in the context of the very familiar linear statistical model: we give a taste of what can nevertheless be achieved when the parameter vector of interest is sparse, that is, contains many zero elements. We describe other ways of identifying low-dimensional subspaces of the data space that contain all useful information. The topic of classification is then reviewed, along with the problem of identifying, from within a very large set, the variables that help to classify observations. Brief mention is made of the visualization of high-dimensional data, and ways to handle computational problems in Bayesian analysis are described. At appropriate points, reference is made to the other papers in the issue.
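
As a rough illustration of the sparse linear model mentioned in the abstract, the sketch below simulates a problem with more variables than observations and fits it in two ways. It is not taken from the article: the l1-penalized (lasso) estimator, the ISTA solver, the function names, the penalty value `lam`, and the simulated dimensions are all assumptions chosen for the example. The minimum-norm least-squares fit reproduces the data exactly yet spreads weight over every coefficient, whereas the penalized fit sets many coefficients exactly to zero.

```python
import numpy as np


def soft_threshold(z, t):
    """Shrink each entry of z toward zero by t, zeroing entries with |z| <= t."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)


def lasso_objective(X, y, beta, lam):
    """(1 / 2n) * ||y - X beta||^2 + lam * ||beta||_1."""
    n = X.shape[0]
    resid = y - X @ beta
    return resid @ resid / (2 * n) + lam * np.abs(beta).sum()


def lasso_ista(X, y, lam, n_iter=2000):
    """Approximately minimize the lasso objective by iterative soft thresholding."""
    n, p = X.shape
    # Lipschitz constant of the smooth part: largest eigenvalue of X^T X / n.
    lipschitz = np.linalg.norm(X, 2) ** 2 / n
    beta = np.zeros(p)
    for _ in range(n_iter):
        grad = X.T @ (X @ beta - y) / n
        beta = soft_threshold(beta - grad / lipschitz, lam / lipschitz)
    return beta


# Simulated data (hypothetical sizes): n = 50 units, p = 200 variables,
# of which only the first k = 5 have non-zero coefficients.
rng = np.random.default_rng(0)
n, p, k = 50, 200, 5
X = rng.standard_normal((n, p))
beta_true = np.zeros(p)
beta_true[:k] = 3.0
y = X @ beta_true + 0.1 * rng.standard_normal(n)

# With p > n, least squares has infinitely many exact solutions;
# lstsq returns the minimum-norm one, which is dense.
beta_ls = np.linalg.lstsq(X, y, rcond=None)[0]

# The l1 penalty yields a fit with many coefficients exactly zero.
lam = 0.1
beta_hat = lasso_ista(X, y, lam)

print("non-zero coefficients, least squares:", np.count_nonzero(beta_ls))
print("non-zero coefficients, lasso:", np.count_nonzero(beta_hat))
```

The penalty level here is fixed by hand; in practice it would be chosen by a data-driven rule such as cross-validation.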


