Machine learning models for lung cancer classification using array comparative genomic hybridization
- PMID: 12463776
- PMCID: PMC2244172
Abstract
Array CGH is a recently introduced technology that measures copy number changes in hundreds of genes in a single experiment. The primary goal of this study was to develop machine learning models that classify non-small cell lung cancers according to histopathological type and to compare several machine learning methods on this learning task. DNA from the tumors of 37 patients (21 squamous carcinomas and 16 adenocarcinomas) was extracted and hybridized onto a 452 BAC clone array. The following algorithms were used: k-nearest neighbors (KNN), decision tree induction, support vector machines, and feed-forward neural networks. Performance was measured via leave-one-out classification accuracy. The best multi-gene model found had a leave-one-out accuracy of 89.2%. Decision trees performed worse than the other methods on this learning task and dataset. We conclude that gene copy numbers as measured by array CGH are, collectively, an excellent indicator of histological subtype. Several interesting research directions are discussed.
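To make the evaluation protocol concrete, the following is a minimal sketch of leave-one-out accuracy with a k-nearest-neighbor classifier. It is not the authors' implementation: the function names, the choice of k = 3, the Euclidean distance, and the randomly generated log-ratio values are all assumptions made for illustration. Only the cohort shape (21 squamous carcinomas, 16 adenocarcinomas, 452 clones) comes from the abstract.

```python
import math
import random


def knn_predict(train_x, train_y, query, k=3):
    """Majority vote among the k nearest training samples (Euclidean distance)."""
    nearest = sorted(zip(train_x, train_y), key=lambda pair: math.dist(pair[0], query))
    votes = [label for _, label in nearest[:k]]
    # Sorting the candidate labels makes tie-breaking deterministic.
    return max(sorted(set(votes)), key=votes.count)


def leave_one_out_accuracy(xs, ys, k=3):
    """Hold out each sample once, fit on the rest, and return the fraction predicted correctly."""
    correct = 0
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        if knn_predict(train_x, train_y, xs[i], k) == ys[i]:
            correct += 1
    return correct / len(xs)


def make_synthetic_cohort(n_squamous=21, n_adeno=16, n_clones=452, seed=0):
    """Build placeholder data with the cohort shape from the abstract.

    The values are random numbers, NOT real array CGH measurements. As an
    arbitrary assumption, the first 20 clones are shifted for one class so
    that the two groups differ at all.
    """
    rng = random.Random(seed)
    xs, ys = [], []
    for label, count, shift in (("squamous", n_squamous, 0.5), ("adeno", n_adeno, 0.0)):
        for _ in range(count):
            profile = [rng.gauss(shift if j < 20 else 0.0, 0.3) for j in range(n_clones)]
            xs.append(profile)
            ys.append(label)
    return xs, ys


xs, ys = make_synthetic_cohort()
accuracy = leave_one_out_accuracy(xs, ys, k=3)
print(f"Leave-one-out accuracy on synthetic data: {accuracy:.3f}")
```

With 37 samples, each held-out prediction changes the accuracy by 1/37, so the reported 89.2% is consistent with 33 of 37 tumors classified correctly (33/37 ≈ 0.892). The accuracy printed by this sketch reflects only the synthetic data and says nothing about the study's results.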