BMC Bioinformatics. 2014 Feb 20;15:49.
doi: 10.1186/1471-2105-15-49.

Gene selection for cancer identification: a decision tree model empowered by particle swarm optimization algorithm

Kun-Huang Chen et al. BMC Bioinformatics. 2014.

Abstract

Background: In microarray data analysis, selecting a small number of informative genes from the thousands that may contribute to the occurrence of cancer is an important problem. Many researchers have used computational intelligence methods to analyze gene expression data.

Results: To select genes efficiently from thousands of candidates that can contribute to identifying cancers, this study develops a novel method that combines particle swarm optimization with a decision tree as the classifier. The study also compares the proposed method with other well-known benchmark classification methods (support vector machine, self-organizing map, back propagation neural network, C4.5 decision tree, Naive Bayes, CART decision tree, and artificial immune recognition system) in experiments on 11 gene expression cancer datasets.

Conclusion: Based on statistical analysis, our proposed method outperforms other popular classifiers on all test datasets and is comparable to SVM on certain specific datasets. Further, housekeeping genes with various expression patterns and tissue-specific genes are identified. These genes provide high discriminative power for cancer classification.
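
The general idea in the abstract, a particle swarm searching over gene subsets while a tree classifier scores each subset, can be illustrated with a small sketch. This is not the authors' PSODT implementation: the abstract does not give the tree type, the fitness function, or the PSO parameters, so everything below is an assumption made for illustration. In particular, a one-split decision stump stands in for the decision tree, fitness is training accuracy minus a small per-gene penalty, and the names `stump_accuracy`, `fitness`, `binary_pso`, and `make_toy_data` are hypothetical.

```python
import math
import random

def stump_accuracy(X, y, genes):
    """Best training accuracy of a one-split stump over the chosen genes (stand-in for a tree)."""
    if not genes:
        return 0.0
    n, best = len(y), 0.0
    for g in genes:
        for row in X:
            t = row[g]  # candidate threshold taken from an observed value
            hits = sum((r[g] > t) == bool(label) for r, label in zip(X, y))
            best = max(best, hits / n, (n - hits) / n)  # also try the flipped rule
    return best

def fitness(X, y, mask, penalty=0.01):
    """Assumed fitness: accuracy minus a small cost per selected gene."""
    genes = [i for i, bit in enumerate(mask) if bit]
    return stump_accuracy(X, y, genes) - penalty * len(genes)

def binary_pso(X, y, n_particles=10, n_iter=20, w=0.7, c1=1.5, c2=1.5, seed=0):
    """Binary PSO: each particle is a 0/1 mask over genes; velocities set bit probabilities."""
    rng = random.Random(seed)
    d = len(X[0])
    pos = [[rng.randint(0, 1) for _ in range(d)] for _ in range(n_particles)]
    vel = [[0.0] * d for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_fit = [fitness(X, y, p) for p in pos]
    g = max(range(n_particles), key=lambda i: pbest_fit[i])
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]
    for _ in range(n_iter):
        for i in range(n_particles):
            for j in range(d):
                v = (w * vel[i][j]
                     + c1 * rng.random() * (pbest[i][j] - pos[i][j])
                     + c2 * rng.random() * (gbest[j] - pos[i][j]))
                vel[i][j] = max(-4.0, min(4.0, v))  # clamp to keep probabilities away from 0/1
                prob = 1.0 / (1.0 + math.exp(-vel[i][j]))  # sigmoid transfer function
                pos[i][j] = 1 if rng.random() < prob else 0
            f = fitness(X, y, pos[i])
            if f > pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i][:], f
                if f > gbest_fit:
                    gbest, gbest_fit = pos[i][:], f
    return gbest, gbest_fit

def make_toy_data(n_samples=40, n_genes=8, seed=1):
    """Synthetic expression matrix in which only gene 2 carries class signal."""
    rng = random.Random(seed)
    X, y = [], []
    for k in range(n_samples):
        label = k % 2
        row = [rng.uniform(-1.0, 1.0) for _ in range(n_genes)]
        row[2] += 5.0 * label  # shift the informative gene for class 1
        X.append(row)
        y.append(label)
    return X, y

X, y = make_toy_data()
mask, score = binary_pso(X, y)
print([i for i, bit in enumerate(mask) if bit], round(score, 3))
```

The per-gene penalty is one simple way to favor small subsets, which matches the stated goal of selecting a small number of informative genes. A realistic experiment would replace the stump with a full decision tree and estimate accuracy on held-out data rather than on the training set.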


Figures

Figure 1. The proposed PSODT for gene selection.
Figure 2. An illustration of partial decision tree.
Figure 3. 95% confidence interval of the mean for classification accuracy.

