BMC Bioinformatics. 2006 Feb 1;7:50.
doi: 10.1186/1471-2105-7-50.

Assessing stability of gene selection in microarray data analysis

Xing Qiu et al. BMC Bioinformatics. 2006.

Abstract

Background: The number of genes declared differentially expressed is a random variable and its variability can be assessed by resampling techniques. Another important stability indicator is the frequency with which a given gene is selected across subsamples. We have conducted studies to assess stability and some other properties of several gene selection procedures with biological and simulated data.

Results: Using resampling techniques, we found that some genes are selected much less frequently (across subsamples) than other genes with the same adjusted p-values. The extent to which this type of instability manifests itself can be assessed by a method introduced in this paper. The effect of correlation between gene expression levels on the performance of multiple testing procedures is studied by computer simulations.

Conclusion: Resampling provides a tool for reducing the set of initially selected genes to those with a sufficiently high selection frequency. Using resampling techniques, it is also possible to assess the variability of different performance indicators. Stability properties of several multiple testing procedures are described at length in the present paper.
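The resampling idea summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`bonferroni_select`, `selection_frequency`), the choice to delete `d` samples from each group, the simulated data, and the 0.9 frequency cutoff are all assumptions made for illustration. To stay within the Python standard library, the t-test is replaced by a Welch-type statistic with a normal approximation to the p-value, which is cruder than a proper t-test for small samples.

```python
import random
from statistics import NormalDist, mean, variance


def bonferroni_select(group_a, group_b, alpha=0.05):
    """Return the set of gene indices declared differentially expressed.

    Each group is a list of samples; each sample is a list of per-gene values.
    ASSUMPTION: a Welch-type statistic with a normal approximation stands in
    for the t-test; p-values are then Bonferroni-adjusted.
    """
    n_genes = len(group_a[0])
    selected = set()
    for g in range(n_genes):
        a = [sample[g] for sample in group_a]
        b = [sample[g] for sample in group_b]
        se = (variance(a) / len(a) + variance(b) / len(b)) ** 0.5
        if se == 0:
            continue  # no variation: the statistic is undefined, skip the gene
        z = (mean(a) - mean(b)) / se
        p = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
        if min(1.0, p * n_genes) <= alpha:  # Bonferroni-adjusted p-value
            selected.add(g)
    return selected


def selection_frequency(group_a, group_b, d=1, n_subsamples=200, seed=0,
                        select=bonferroni_select):
    """Delete-d jackknife subsampling of a gene selection procedure.

    ASSUMPTION: d randomly chosen samples are left out of each group.
    Returns (per-gene selection frequencies, number selected per subsample).
    """
    rng = random.Random(seed)
    counts = [0] * len(group_a[0])
    sizes = []
    for _ in range(n_subsamples):
        sub_a = rng.sample(group_a, len(group_a) - d)
        sub_b = rng.sample(group_b, len(group_b) - d)
        chosen = select(sub_a, sub_b)
        sizes.append(len(chosen))
        for g in chosen:
            counts[g] += 1
    return [c / n_subsamples for c in counts], sizes


if __name__ == "__main__":
    # Hypothetical simulated data: 20 genes, 12 samples per group,
    # with only gene 0 shifted between the groups.
    rng = random.Random(1)
    group_a = [[rng.gauss(0, 1) for _ in range(20)] for _ in range(12)]
    group_b = [[rng.gauss(5 if g == 0 else 0, 1) for g in range(20)]
               for _ in range(12)]

    initial = bonferroni_select(group_a, group_b)
    freqs, sizes = selection_frequency(group_a, group_b, d=1)

    # Keep only initially selected genes with a high selection frequency.
    stable = {g for g in initial if freqs[g] >= 0.9}
    print("initially selected:", sorted(initial))
    print("stable subset:", sorted(stable))
    print("selected per subsample: min", min(sizes), "max", max(sizes))
```

The list `sizes` gives the empirical distribution of the number of selected genes across subsamples, and `freqs` gives the per-gene selection frequency that can be plotted against adjusted p-values. Any other selection rule can be passed in through the `select` argument.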

Figures

Figure 1
Histograms of the frequency of occurrence in the set of selected genes obtained by delete-7-jackknife subsampling from the SJCRH data.
Figure 2
Frequency of occurrence in the set of selected genes versus adjusted p-values for the t-test with Bonferroni adjustment. Left panel: delete-1-jackknife subsampling, right panel: delete-7-jackknife subsampling.
Figure 3
Frequency of occurrence in the set of selected genes versus adjusted p-values for the t- and Cramér-von Mises test with Bonferroni adjustment.
Figure 4
Frequency of occurrence in the set of selected genes versus adjusted p-values for the t- and Cramér-von Mises test with Westfall-Young algorithm.
Figure 5
Frequency of occurrence in the set of selected genes versus adjusted p-values for the t-test with Bonferroni adjustment and Westfall-Young algorithm.
Figure 6
Frequency of occurrence in the set of selected genes versus adjusted p-values for the Cramér-von Mises test with Bonferroni adjustment and Westfall-Young algorithm.
Figure 7
Histograms of the number of selected genes across 200 subsamples for different methods applied to the SJCRH data.
