Multi-class tumor classification by discriminant partial least squares using microarray gene expression data and assessment of classification models
- PMID: 15261154
- DOI: 10.1016/j.compbiolchem.2004.05.002
Abstract
High-throughput DNA microarrays provide an effective way to monitor the expression levels of thousands of genes in a sample simultaneously. One promising application of this technology is the molecular diagnosis of cancer, e.g. distinguishing normal tissue from tumor or classifying tumors into different types or subtypes. One problem arising from the use of microarray data is how to analyze high-dimensional gene expression data, which typically have thousands of variables (genes) and far fewer observations (samples). There is a need to develop reliable classification methods that make full use of microarray data, and to evaluate accurately the predictive ability and reliability of the models derived from them. In this paper, discriminant partial least squares was used to classify different types of human tumors using four microarray datasets, and it showed good prediction performance. Four different cross-validation procedures (leave-one-out versus leave-half-out; incomplete versus full) were used to evaluate the classification model. Our results indicate that discriminant partial least squares using leave-half-out cross-validation provides a more realistic estimate of the predictive ability of a classification model, which some of the cross-validation procedures may overestimate. The information obtained from the different cross-validation procedures can also be used to evaluate the reliability of the classification model.
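The four cross-validation procedures named in the abstract can be illustrated with a small sketch. Everything below is an assumption made for illustration, not the authors' implementation: the abstract does not define "incomplete" and "full", so the sketch takes "incomplete" to mean that genes are selected once using all samples before splitting, and "full" to mean that gene selection is repeated inside each training set. A nearest-centroid classifier stands in for discriminant partial least squares to keep the example dependency-free, the data are synthetic noise, and all function names and parameters (`loo_splits`, `lho_splits`, `select_genes`, `predict`, `cv_accuracy`, `compare_procedures`) are hypothetical.

```python
import random


def loo_splits(n):
    """Leave-one-out: each sample serves as the test set exactly once."""
    for i in range(n):
        yield [j for j in range(n) if j != i], [i]


def lho_splits(n, repeats, rng):
    """Leave-half-out: repeated random partitions into two halves."""
    idx = list(range(n))
    for _ in range(repeats):
        rng.shuffle(idx)
        half = n // 2
        yield sorted(idx[:half]), sorted(idx[half:])


def class_means(X, y, rows, genes):
    """Per-class mean expression over `rows`, restricted to `genes`."""
    groups = {}
    for i in rows:
        groups.setdefault(y[i], []).append(i)
    return {c: [sum(X[i][g] for i in members) / len(members) for g in genes]
            for c, members in groups.items()}


def select_genes(X, y, rows, k):
    """Keep the k genes whose class means differ most, using only `rows`."""
    all_genes = range(len(X[0]))
    means = class_means(X, y, rows, all_genes)
    spread = [max(m[g] for m in means.values()) - min(m[g] for m in means.values())
              for g in all_genes]
    return sorted(all_genes, key=lambda g: spread[g], reverse=True)[:k]


def predict(X, y, train, test, genes):
    """Nearest-centroid stand-in for the discriminant PLS classifier."""
    centroids = class_means(X, y, train, genes)
    predictions = []
    for t in test:
        dist = {c: sum((X[t][g] - m) ** 2 for g, m in zip(genes, cent))
                for c, cent in centroids.items()}
        predictions.append(min(dist, key=dist.get))
    return predictions


def cv_accuracy(X, y, splits, k, full):
    """Pooled test accuracy; `full` repeats gene selection in every split."""
    fixed = None if full else select_genes(X, y, list(range(len(X))), k)
    hits = total = 0
    for train, test in splits:
        genes = select_genes(X, y, train, k) if full else fixed
        preds = predict(X, y, train, test, genes)
        hits += sum(p == y[t] for p, t in zip(preds, test))
        total += len(test)
    return hits / total


def compare_procedures(seed=0, n=24, n_genes=200, k=10, repeats=20):
    """Run all four procedures on synthetic noise with arbitrary labels."""
    rng = random.Random(seed)
    X = [[rng.gauss(0, 1) for _ in range(n_genes)] for _ in range(n)]
    y = [i % 3 for i in range(n)]  # three made-up classes, no real signal
    loo = list(loo_splits(n))
    lho = list(lho_splits(n, repeats, rng))
    return {
        "LOO incomplete": cv_accuracy(X, y, loo, k, full=False),
        "LOO full": cv_accuracy(X, y, loo, k, full=True),
        "LHO incomplete": cv_accuracy(X, y, lho, k, full=False),
        "LHO full": cv_accuracy(X, y, lho, k, full=True),
    }


if __name__ == "__main__":
    for name, acc in compare_procedures().items():
        print(f"{name}: {acc:.2f}")
```

Because the synthetic labels carry no signal, an honest estimate should sit near chance (about one in three here). Under the assumed definitions, the incomplete variants let the test samples influence gene selection, so one would typically expect them to report higher accuracy than the full variants. This is the kind of overestimation the abstract warns about.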
Similar articles
- Multi-class cancer classification by total principal component regression (TPCR) using microarray gene expression data. Nucleic Acids Res. 2005 Jan 7;33(1):56-65. doi: 10.1093/nar/gki144. Print 2005. PMID: 15640445 Free PMC article.
- PCA disjoint models for multiclass cancer analysis using gene expression data. Bioinformatics. 2003 Mar 22;19(5):571-8. doi: 10.1093/bioinformatics/btg051. PMID: 12651714
- Multi-class cancer classification via partial least squares with gene expression profiles. Bioinformatics. 2002 Sep;18(9):1216-26. doi: 10.1093/bioinformatics/18.9.1216. PMID: 12217913
- Chips help diagnosis of childhood cancers. Trends Cell Biol. 2001 Aug;11(8):323. doi: 10.1016/s0962-8924(01)02078-5. PMID: 11489636 No abstract available.
- The properties of high-dimensional data spaces: implications for exploring gene and protein expression data. Nat Rev Cancer. 2008 Jan;8(1):37-49. doi: 10.1038/nrc2294. PMID: 18097463 Free PMC article. Review.
Cited by
- A novel method incorporating gene ontology information for unsupervised clustering and feature selection. PLoS One. 2008;3(12):e3860. doi: 10.1371/journal.pone.0003860. Epub 2008 Dec 4. PMID: 19052637 Free PMC article.
- Epigenome-Wide Tobacco-Related Methylation Signature Identification and Their Multilevel Regulatory Network Inference for Lung Adenocarcinoma. Biomed Res Int. 2020 Apr 24;2020:2471915. doi: 10.1155/2020/2471915. eCollection 2020. PMID: 32420331 Free PMC article.
- Sparse PLS discriminant analysis: biologically relevant feature selection and graphical displays for multiclass problems. BMC Bioinformatics. 2011 Jun 22;12:253. doi: 10.1186/1471-2105-12-253. PMID: 21693065 Free PMC article.
- A comparative study of discriminating human heart failure etiology using gene expression profiles. BMC Bioinformatics. 2005 Aug 24;6:205. doi: 10.1186/1471-2105-6-205. PMID: 16120216 Free PMC article.
- Multi-class cancer classification by total principal component regression (TPCR) using microarray gene expression data. Nucleic Acids Res. 2005 Jan 7;33(1):56-65. doi: 10.1093/nar/gki144. Print 2005. PMID: 15640445 Free PMC article.