Tumor classification by partial least squares using microarray gene expression data
- PMID: 11836210
- DOI: 10.1093/bioinformatics/18.1.39
Abstract
Motivation: One important application of gene expression microarray data is the classification of samples into categories, such as tumor type. Microarrays allow simultaneous monitoring of thousands of gene expressions per sample. This ability to measure gene expression en masse has resulted in data in which the number of variables p (genes) far exceeds the number of samples N. Standard statistical methodologies for classification and prediction work poorly, or not at all, when N < p. Modification of existing statistical methodologies or development of new methodologies is needed for the analysis of microarray data.
Results: We propose a novel analysis procedure for classifying (predicting) human tumor samples based on microarray gene expressions. This procedure involves dimension reduction using Partial Least Squares (PLS) and classification using Logistic Discrimination (LD) and Quadratic Discriminant Analysis (QDA). We compare PLS to the well-known dimension reduction method of Principal Components Analysis (PCA). Under many circumstances PLS proves superior; we illustrate a condition under which PCA predicts particularly poorly relative to PLS. The proposed methods were applied to five different microarray data sets involving various human tumor samples: (1) normal versus ovarian tumor; (2) Acute Myeloid Leukemia (AML) versus Acute Lymphoblastic Leukemia (ALL); (3) Diffuse Large B-cell Lymphoma (DLBCLL) versus B-cell Chronic Lymphocytic Leukemia (BCLL); (4) normal versus colon tumor; and (5) Non-Small-Cell-Lung-Carcinoma (NSCLC) versus renal samples. The stability of the classification results and methods was further assessed by re-randomization studies.
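The contrast between PLS and PCA as dimension reduction steps can be sketched on a toy example. The abstract does not state which condition makes PCA fail, so the scenario below is an assumption: a few genes carry a small class signal while many genes share a large pattern unrelated to the class. The data are hand-built and noise-free, all function names are hypothetical, only the first component of each method is computed, and a simple difference of class means stands in for the LD and QDA classifiers. It is a minimal sketch, not the authors' implementation.

```python
import math

def center_columns(X):
    """Subtract each column (gene) mean so components use centered data."""
    n = len(X)
    means = [sum(row[j] for row in X) / n for j in range(len(X[0]))]
    return [[row[j] - means[j] for j in range(len(row))] for row in X]

def normalize(v):
    """Scale a vector to unit Euclidean length."""
    norm = math.sqrt(sum(a * a for a in v))
    return [a / norm for a in v]

def scores(X, w):
    """Project each sample (row of X) onto the weight vector w."""
    return [sum(x * a for x, a in zip(row, w)) for row in X]

def xt_times(X, u):
    """Compute X^T u without forming the transpose explicitly."""
    return [sum(X[i][j] * u[i] for i in range(len(X))) for j in range(len(X[0]))]

def first_pls_weights(X, y):
    """First PLS direction: proportional to X^T (y - mean(y)); uses the labels."""
    y_mean = sum(y) / len(y)
    return normalize(xt_times(X, [yi - y_mean for yi in y]))

def first_pca_weights(X, iterations=100):
    """First principal direction by power iteration on X^T X; ignores the labels."""
    v = normalize([1.0] * len(X[0]))
    for _ in range(iterations):
        v = normalize(xt_times(X, scores(X, v)))
    return v

def class_gap(t, y):
    """Absolute difference between the two class means of a 1-D component."""
    m1 = sum(ti for ti, yi in zip(t, y) if yi == 1) / y.count(1)
    m0 = sum(ti for ti, yi in zip(t, y) if yi == 0) / y.count(0)
    return abs(m1 - m0)

# Hypothetical toy data: N = 6 samples, p = 8 "genes", so N < p.
# Genes 0-1 carry a small class signal; genes 2-7 share a large nuisance
# pattern that is unrelated to the class label.
y = [0, 0, 0, 1, 1, 1]
signal = [-1.0, -1.0, -1.0, 1.0, 1.0, 1.0]
nuisance = [4.0, 0.0, -4.0, 4.0, 0.0, -4.0]
X = center_columns([[s, s] + [z] * 6 for s, z in zip(signal, nuisance)])

pls_gap = class_gap(scores(X, first_pls_weights(X, y)), y)
pca_gap = class_gap(scores(X, first_pca_weights(X)), y)
print(f"PLS class gap: {pls_gap:.3f}, PCA class gap: {pca_gap:.3f}")
```

In this constructed case the first PLS component separates the two classes, because its weights come from the covariance between genes and the label, while the first principal component follows the high-variance nuisance pattern and shows essentially no class separation. On real data either component would then feed a classifier such as LD or QDA, which this sketch omits.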