Dimension reduction for high-dimensional data
- PMID: 20652514
- DOI: 10.1007/978-1-60761-580-4_14
Abstract
With the advance of modern technologies, high-dimensional data have become prevalent in computational biology. The number of variables p is very large, and in many applications p exceeds the number of observational units n. Such high dimensionality and the unconventional small-n-large-p setting pose new challenges for statistical analysis methods. Dimension reduction, which aims to reduce the predictor dimension prior to any modeling effort, offers a potentially useful avenue for tackling such high-dimensional regression. In this chapter, we review a number of commonly used dimension reduction approaches, including principal component analysis, partial least squares, and sliced inverse regression. For each method, we review its background and its applications in computational biology, discuss both its advantages and limitations, and offer enough operational detail for implementation. A numerical example analyzing a microarray survival data set is given to illustrate applications of the reviewed reduction methods.
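As a rough illustration of the small-n-large-p setting described above, the following sketch reduces p = 500 predictors measured on n = 20 units to k = 3 principal component scores. It is not taken from the chapter: the function name `pca_reduce`, the simulated Gaussian data, and the choice of k are assumptions made for the example, and the SVD-based route is just one common way to compute principal components when p exceeds n.

```python
import numpy as np

def pca_reduce(X, k):
    """Project the rows of X (n units x p variables) onto the top-k principal components.

    Hypothetical helper for illustration only.
    """
    Xc = X - X.mean(axis=0)  # center each variable
    # Thin SVD of the centered data; with n < p this avoids forming
    # the p x p sample covariance matrix explicitly.
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = U[:, :k] * s[:k]                    # n x k component scores
    explained = (s[:k] ** 2) / np.sum(s ** 2)    # share of total variance per component
    return scores, Vt[:k], explained

# Simulated data (an assumption): n = 20 units, p = 500 variables.
rng = np.random.default_rng(0)
n, p, k = 20, 500, 3
X = rng.normal(size=(n, p))

scores, components, explained = pca_reduce(X, k)
print(scores.shape)      # reduced predictor matrix, n x k
print(components.shape)  # loading vectors, k x p
```

The reduced matrix `scores` could then be passed to a conventional regression model in place of the original p predictors. Note that principal components are chosen without reference to the response, which is one reason supervised alternatives such as partial least squares and sliced inverse regression are also considered.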