Comparative Study
Stat Appl Genet Mol Biol. 2011 Sep 15;10(1):42.
doi: 10.2202/1544-6115.1701.

Fully moderated T-statistic for small sample size gene expression arrays

Lianbo Yu et al. Stat Appl Genet Mol Biol. 2011.

Abstract

Gene expression microarray experiments with few replicates lead to great variability in estimates of gene variances. Several Bayesian methods have been developed to reduce this variability and to increase power. Thus far, moderated t methods have assumed a constant coefficient of variation (CV) for the gene variances. We provide evidence against this assumption and extend the method by allowing the CV to vary with gene expression. Our CV-varying method, which we refer to as the fully moderated t-statistic (FMT), was compared to three other methods (the ordinary t and two moderated t predecessors). A simulation study and a familiar spike-in data set were used to assess the performance of the testing methods. The results showed that our CV-varying method had higher power than the other three methods, identified a greater number of true positives in the spike-in data, fit simulated data well under varying assumptions, and, in a real data set, better identified more highly expressed genes that were consistent with functional pathways associated with the experiments.
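
The shrinkage idea behind moderated t methods can be sketched as follows. This is a minimal illustration, assuming the standard empirical-Bayes form in which each gene's pooled sample variance is combined with a prior variance, weighted by prior degrees of freedom. The function name and arguments are hypothetical, and the hyperparameters are taken as given inputs; how the paper estimates them as functions of expression level is not described in this abstract.

```python
from statistics import mean, variance


def moderated_t(x, y, d0, s0_sq):
    """Two-sample moderated t-statistic for a single gene (illustrative sketch).

    x, y  : lists of log-expression replicates for the two groups
    d0    : prior degrees of freedom (assumed given; may differ per gene)
    s0_sq : prior variance (assumed given; may differ per gene)
    Returns the statistic and its total degrees of freedom.
    """
    n1, n2 = len(x), len(y)
    d = n1 + n2 - 2  # residual degrees of freedom
    # Pooled sample variance across the two groups.
    s_sq = ((n1 - 1) * variance(x) + (n2 - 1) * variance(y)) / d
    # Shrink the sample variance toward the prior variance.
    s_post = (d0 * s0_sq + d * s_sq) / (d0 + d)
    t = (mean(x) - mean(y)) / (s_post * (1 / n1 + 1 / n2)) ** 0.5
    return t, d0 + d


# With d0 = 0 the statistic reduces to the ordinary pooled t-test.
print(moderated_t([1.0, 2.0, 3.0], [2.0, 3.0, 4.0], d0=0.0, s0_sq=0.5))
print(moderated_t([1.0, 2.0, 3.0], [2.0, 3.0, 4.0], d0=4.0, s0_sq=0.5))
```

Passing a different `d0` and `s0_sq` for each gene corresponds to letting the hyperparameters vary with expression level, while passing the same values for every gene corresponds to constant hyperparameters.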

Figures

Figure 1: Relationship between hyperparameters and gene expression for Simulation Model 1. A) Prior variance ($s_{0g}^2$). B) Prior degrees of freedom ($d_{0g}$).
Figure 2: Power plots of four testing methods under 2 different simulation models. Power averaged over 100 simulated datasets was calculated separately for the simulated data by using 4 different testing methods: t-test (purple), SMT (blue), IBMT (red), and FMT (black). A) FMT simulation model (Model 1). B) SMT simulation model (Model 2).
Figure 3: Prior degrees of freedom estimated by different moderated testing methods under the FMT simulation model (Model 1). Prior degrees of freedom were estimated using SMT (blue), IBMT (red), and FMT (green) and averaged over 100 simulated datasets.
Figure 4: Observed true positives detected at different expression levels. Observed true positives averaged over 100 simulated datasets were plotted under 4 different testing methods: t-test (purple), SMT (blue), IBMT (red), and FMT (black) at three expression levels (low (1), medium (2), and high (3)) when PFER equals 5 under Model 1. (Numbers in parentheses in the legend indicate the number of true positives.)
Figure 5: Observed false positives detected at different expression levels. Observed false positives averaged over 100 simulated datasets were counted by using 4 different testing methods: t-test (purple), SMT (blue), IBMT (red), and FMT (black) at three expression levels (low (1), medium (2), and high (3)) when PFER equals 5 under Model 1. (Numbers in parentheses in the legend indicate the number of true negatives.)
Figure 6: Comparison of false positives among top 300 ranked genes. Genes were ranked by p-values, and the corresponding false positive numbers averaged over 100 simulated datasets were obtained separately for four different testing methods: t-test (purple), SMT (blue), IBMT (red), and FMT (black) when PFER equals 5 under Model 1.
Figure 7: Summary plots of LOESS smoothing curves for estimating prior degrees of freedom. A) Mean of LOESS curves over 100 simulations for the window sizes of m = 10, 40, 200, 600, and 2000 genes. Black curve represents the true model. B) Standard deviation of LOESS curves over 100 simulations for the window sizes of m = 10, 40, 200, 600, and 2000 genes.
Figure 8: Spike-in data prior degrees of freedom estimates. SMT (blue), IBMT (red), and FMT (black) methods were used to estimate prior degrees of freedom over average log expressions.
Figure 9: Comparison of false positives among top ranked genes. Gene ranks based on p-values were obtained separately for 4 different testing methods: t-test (purple), SMT (blue), IBMT (red), and FMT (black). False-positive counts were determined from among the top ranked genes.
Figure 10: Estimated prior degrees of freedom and variance of log variances against log intensity in the real data example. A) Estimated prior degrees of freedom by FMT (green), IBMT (red), and SMT (blue). B) Estimated variance of log variances by FMT. The moving-average estimate (green; window size of 40 genes) was obtained by LOESS local regression with span = 0.95.
Figure 11: Estimated prior variance against log intensity in the real data example. Prior variance estimates by FMT (green) and IBMT (red) were obtained through fitting a LOESS local regression on the adjusted log-variance $e_g$ with span = 0.75. Differences between FMT and IBMT are mainly due to differences in estimated prior degrees of freedom. Prior variance estimate for SMT (blue) is constant.
Figure 12: Venn diagram of the 4 significant gene lists from 4 different testing methods: t-test, SMT, IBMT, and FMT.
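
The captions for Figures 4 to 6 fix PFER at 5. PFER, the per-family error rate, is the expected number of false positives among all tests. One common way to control it, assuming valid p-values, is a Bonferroni-type cutoff of PFER divided by the number of tests. The sketch below illustrates that rule only; the function name is hypothetical, and this page does not state whether the authors applied exactly this procedure.

```python
def pfer_rejections(p_values, pfer=5.0):
    """Return indices of tests rejected at a per-family error rate of `pfer`.

    Each test is rejected when p <= pfer / m, where m is the number of tests.
    Under the null each valid p-value falls below the cutoff with probability
    at most pfer / m, so the expected number of false positives is at most pfer.
    """
    m = len(p_values)
    cutoff = pfer / m
    return [i for i, p in enumerate(p_values) if p <= cutoff]


# Toy example with 4 tests and a PFER of 0.2 (cutoff 0.05).
print(pfer_rejections([0.0001, 0.2, 0.04, 0.9], pfer=0.2))
```

With 10,000 genes and PFER set to 5, for instance, this rule would call a gene significant when its p-value is at most 0.0005.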
