BMC Bioinformatics. 2006 Feb 27;7:95.
doi: 10.1186/1471-2105-7-95.

Gene selection algorithms for microarray data based on least squares support vector machine

E Ke Tang et al. BMC Bioinformatics.

Abstract

Background: In discriminant analysis of microarray data, a small number of samples is typically described by a large number of genes. Conducting the discriminant analysis with all the genes is not only difficult but also unnecessary, so gene selection is usually performed to identify the important genes.

Results: A gene selection method searches for an optimal or near-optimal subset of genes with respect to a given evaluation criterion. In this paper, we propose a new evaluation criterion, named the leave-one-out calculation (LOOC) measure. A gene selection method, named the leave-one-out calculation sequential forward selection (LOOCSFS) algorithm, is then presented by combining the LOOC measure with the sequential forward selection scheme. Further, a novel gene selection algorithm, the gradient-based leave-one-out gene selection (GLGS) algorithm, is also proposed. Both gene selection algorithms originate from an efficient and exact calculation of the leave-one-out cross-validation error of the least squares support vector machine (LS-SVM). The proposed approaches are applied to two microarray datasets and compared with other well-known gene selection methods, using code available from the second author.
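The abstract does not spell out the LOOC measure, so the following is only a minimal sketch of the general idea: an exact leave-one-out calculation for an LS-SVM from a single fit, combined with greedy forward selection. It assumes the standard closed-form identity for LS-SVM leave-one-out residuals (residual i equals alpha_i divided by the i-th diagonal entry of the inverse system matrix), a linear kernel, and the sum of squared leave-one-out residuals as a stand-in criterion. The function names, the `gamma` regularisation parameter, and the criterion are illustrative assumptions, not the authors' definitions.

```python
import numpy as np

def lssvm_fit(K, y, gamma):
    """Solve the LS-SVM linear system [[K + I/gamma, 1], [1^T, 0]] [alpha; b] = [y; 0]."""
    n = len(y)
    C = np.zeros((n + 1, n + 1))
    C[:n, :n] = K + np.eye(n) / gamma
    C[:n, n] = 1.0
    C[n, :n] = 1.0
    C_inv = np.linalg.inv(C)
    sol = C_inv @ np.append(y, 0.0)
    return sol[:n], sol[n], C_inv

def loo_residuals(K, y, gamma):
    """Exact leave-one-out residuals y_i - f^(-i)(x_i) from one fit (no retraining)."""
    alpha, _, C_inv = lssvm_fit(K, y, gamma)
    return alpha / np.diag(C_inv)[:len(y)]

def forward_select(X, y, n_genes, gamma=1.0):
    """Greedy forward selection; the criterion (sum of squared LOO residuals) is an assumption."""
    selected = []
    remaining = list(range(X.shape[1]))
    for _ in range(n_genes):
        def score(j):
            Xs = X[:, selected + [j]]          # candidate gene subset
            return np.sum(loo_residuals(Xs @ Xs.T, y, gamma) ** 2)  # linear kernel
        best = min(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected
```

With samples in the rows of `X` and labels in {-1, +1}, a call such as `forward_select(X, y, n_genes=10)` would return the indices of the chosen columns in the order they were added. Each candidate evaluation costs one matrix inversion of size (n+1), which stays cheap when the number of samples n is small, as is typical for microarray data.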

Conclusion: The proposed gene selection approaches can provide gene subsets that lead to more accurate classification results, while their computational complexity is comparable to that of existing methods. The GLGS algorithm also scales better to datasets with a very large number of genes.

Figures

Figure 1
The external B.632+ error for the Hepatocellular Carcinoma dataset, shown vs the number of selected genes.
Figure 2
The external B.632+ error for the Glioma dataset, shown vs the number of selected genes.
Figure 3
The computational time of seven gene selection algorithms on the Hepatocellular Carcinoma dataset, shown vs the number of selected genes.
Figure 4
The computational time of seven gene selection algorithms on the Hepatocellular Carcinoma dataset, shown vs the size of the gene set.
Figure 5
The LOOCSFS gene selection algorithm.
Figure 6
The GLGS gene selection algorithm.
