Biomed Res Int. 2013;2013:387673.
doi: 10.1155/2013/387673. Epub 2013 Nov 10.

A comparative analysis of biomarker selection techniques

Nicoletta Dessì et al. Biomed Res Int. 2013.

Abstract

Feature selection has become an essential step in biomarker discovery from high-dimensional genomics data. It is recognized that different feature selection techniques may result in different sets of biomarkers, that is, different groups of genes highly correlated with a given pathological condition, but few direct comparisons exist that quantify these differences in a systematic way. In this paper, we propose a general methodology for comparing the outcomes of different selection techniques in the context of biomarker discovery. The comparison is carried out along two dimensions: (i) measuring the similarity/dissimilarity of the selected gene sets; (ii) evaluating the implications of these differences in terms of both the predictive performance and the stability of the selected gene sets. As a case study, we considered three benchmarks derived from DNA microarray experiments and conducted a comparative analysis of eight selection methods, representative of different classes of feature selection techniques. Our results show that the proposed approach can provide useful insight into the pattern of agreement among biomarker discovery techniques.
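
The two comparison dimensions can be illustrated with a small sketch. The abstract does not name the similarity or stability measures used, so the Jaccard index is assumed here purely for illustration, and the function names (`jaccard`, `pairwise_similarity`, `stability`), method names, and gene identifiers are all hypothetical rather than taken from the paper.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard index |A ∩ B| / |A ∪ B| of two gene collections (assumed measure)."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0  # convention: two empty sets are treated as identical
    return len(a & b) / len(a | b)


def pairwise_similarity(selections):
    """Similarity between the gene sets chosen by each pair of methods.

    `selections` maps a method name to the genes that method selected.
    """
    names = sorted(selections)
    return {
        (m1, m2): jaccard(selections[m1], selections[m2])
        for m1, m2 in combinations(names, 2)
    }


def stability(subsets):
    """Mean pairwise Jaccard index of the gene sets one method selects
    on different resamplings of the same data (assumed stability notion)."""
    pairs = list(combinations(subsets, 2))
    if not pairs:
        raise ValueError("need at least two gene sets to measure stability")
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)


# Hypothetical gene sets selected by two methods on the same dataset.
selections = {
    "method_a": ["g1", "g2", "g3", "g4"],
    "method_b": ["g3", "g4", "g5", "g6"],
}
similarity = pairwise_similarity(selections)

# Hypothetical gene sets selected by one method on three resampled datasets.
resampled = [{"g1", "g2"}, {"g1", "g2"}, {"g1", "g3"}]
method_stability = stability(resampled)
```

In this toy input the two methods share two of six distinct genes, so their similarity is 1/3, and the resampled sets give a stability of 5/9. Predictive performance (reported as AUC in the figures) would be evaluated separately with a classifier trained on each selected gene set.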

Figures

Figure 1. Similarity evaluation.
Figure 2. Joint evaluation of stability and predictive performance.
Figure 3. Colon dataset: stability versus number of genes.
Figure 4. Leukemia dataset: stability versus number of genes.
Figure 5. Prostate dataset: stability versus number of genes.
Figure 6. Colon dataset: AUC versus number of genes.
Figure 7. Leukemia dataset: AUC versus number of genes.
Figure 8. Prostate dataset: AUC versus number of genes.
