A unified mixed effects model for gene set analysis of time course microarray experiments

Lily Wang¹, Xi Chen, Russell D Wolfinger, Jeffrey L Franklin, Robert J Coffey, Bing Zhang

PMID: 19954419
PMCID: PMC2861317
DOI: 10.2202/1544-6115.1484

Stat Appl Genet Mol Biol. 2009;8(1):Article 47. doi: 10.2202/1544-6115.1484. Epub 2009 Nov 7.

Affiliation

¹ Vanderbilt University, USA. lily.wang@vanderbilt.edu

Abstract

Methods for gene set analysis test for coordinated changes of a group of genes involved in the same biological process or molecular pathway. Higher statistical power is gained for gene set analysis by combining weak signals from a number of individual genes in each group. Although many gene set analysis methods have been proposed for microarray experiments with two groups, few can be applied to time course experiments. We propose a unified statistical model for analyzing time course experiments at the gene set level using random coefficient models, which fall into the more general class of mixed effects models. These models include a systematic component that models the mean trajectory for the group of genes, and a random component (the random coefficients) that models how each gene's trajectory varies about the mean trajectory. We show that the proposed model (1) outperforms currently available methods at discriminating gene sets differentially changed over time from null gene sets; (2) provides more stable results that are less affected by sampling variations; (3) models dependency among genes adequately and preserves type I error rate; and (4) allows for gene ranking based on predicted values of the random effects. We describe simulation studies using gene expression data with "real life" correlations and we demonstrate the proposed random coefficient model using a mouse colon development time course dataset. The agreement between results of the proposed random coefficient model and the previous reports for this proof-of-concept trial further validates this methodology, which provides a unified statistical model for systems analysis of microarray experiments with complex experimental designs when re-sampling based methods are difficult to apply.
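As a rough illustration of the structure described in the abstract, a generic random coefficient model for gene $i$ in a gene set, measured at time $t_j$, can be written as below. This is a textbook form and not the paper's Model 1, which is not reproduced here. The linear time trend and the symbols $\beta_0$, $\beta_1$, $b_{0i}$, $b_{1i}$, $G$ and $\sigma^2$ are assumptions made for this sketch.

$$
y_{ij} = \underbrace{\beta_0 + \beta_1 t_j}_{\text{mean trajectory of the gene set}} + \underbrace{b_{0i} + b_{1i} t_j}_{\text{deviation of gene } i} + \varepsilon_{ij},
\qquad
\begin{pmatrix} b_{0i} \\ b_{1i} \end{pmatrix} \sim N(0, G),
\quad
\varepsilon_{ij} \sim N(0, \sigma^2)
$$

Under this sketch, a test of $\beta_1 = 0$ asks whether the average trajectory of the gene set changes over time, and the predicted values of $b_{0i}$ and $b_{1i}$ indicate how far each gene departs from that average, which is the kind of quantity the abstract mentions for gene ranking.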

Figures

**Figure 1**
An illustration of the computation of the design matrix for the random effects {*r_l*; l = 1, …, p} in Model 1. This gene set has 3 genes (variables) and the dataset has 12 samples (observations). Covariance Matrix = estimated gene-gene covariance matrix ∑̂. Under “Eigenvectors”, Prin 1 = the estimated first eigenvector α̂₁ of ∑̂, and λ̂₁ = 0.09802458 is the estimated first eigenvalue of ∑̂. The column of the design matrix corresponding to r₁ is then $\sqrt{\hat{\lambda}_{1}}\,\hat{\alpha}_{1} \approx \sqrt{0.098}\,\hat{\alpha}_{1}$. Its entries vary by gene, which is why the random effects carry the sub-index i in Model 1.
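The computation described in this caption can be sketched with NumPy as follows. This is a minimal sketch under stated assumptions, not the authors' code: the data are randomly generated stand-ins for a 12-sample, 3-gene set, and the function name `random_effect_column` is hypothetical.

```python
import numpy as np


def random_effect_column(expr, k=0):
    """Return (eigenvalue, sqrt(eigenvalue) * eigenvector) for component k.

    expr: array of shape (samples, genes) for one gene set.
    k:    0 for the largest eigenvalue, 1 for the second largest, etc.
    """
    # Estimated gene-gene covariance matrix (genes are the columns).
    cov = np.cov(expr, rowvar=False)
    # eigh is for symmetric matrices and returns eigenvalues in ascending order.
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1]  # re-order to descending
    lam = eigvals[order[k]]
    alpha = eigvecs[:, order[k]]       # unit-length eigenvector
    # Guard against tiny negative eigenvalues caused by rounding.
    return lam, np.sqrt(max(lam, 0.0)) * alpha


# Hypothetical data: 12 samples (rows) by 3 genes (columns).
rng = np.random.default_rng(0)
expr = rng.normal(size=(12, 3))

lam1, col1 = random_effect_column(expr)
print("first eigenvalue:", lam1)
print("design-matrix entries for r_1, one per gene:", col1)
```

The result holds one entry per gene. In a stacked (long-format) dataset, each observation from gene i would presumably carry the i-th entry, which matches the caption's remark that the entries vary by gene. The sign of an eigenvector is arbitrary, so the entries may come out with flipped sign across software packages.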

**Figure 2**
ROC Curves for Testing the Central Null Hypothesis *H_0C: the average gene expression of a gene group is not differentially expressed over time*. The receiver operating characteristic (ROC) curves show the trade-off between sensitivity and 1-specificity as the significance cutoff is varied. Among all models, the random coefficient model MMevct had the best sensitivity across all levels of specificity, and the model rstudent performed comparably. Fisher’s exact test lacked sensitivity, while *globalANCOVA* lacked specificity.
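The trade-off described in this caption can be traced by sweeping a significance cutoff over a list of gene-set p-values. The sketch below is illustrative only: the p-values and truth labels are made up, and the function name `roc_points` is hypothetical rather than taken from the paper.

```python
def roc_points(p_values, is_non_null):
    """Return a list of (1 - specificity, sensitivity) pairs.

    p_values:    one p-value per gene set.
    is_non_null: True where the gene set truly changes over time.
    A gene set is called significant when its p-value is <= the cutoff.
    """
    n_pos = sum(is_non_null)
    n_neg = len(is_non_null) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one non-null and one null gene set")

    points = [(0.0, 0.0)]  # cutoff below every p-value: nothing is called
    for cutoff in sorted(set(p_values)):
        tp = sum(1 for p, t in zip(p_values, is_non_null) if t and p <= cutoff)
        fp = sum(1 for p, t in zip(p_values, is_non_null) if not t and p <= cutoff)
        points.append((fp / n_neg, tp / n_pos))
    return points


# Made-up example: three non-null and three null gene sets.
p_values = [0.001, 0.02, 0.03, 0.20, 0.40, 0.70]
is_non_null = [True, True, False, True, False, False]

points = roc_points(p_values, is_non_null)
for fpr, tpr in points:
    print(f"1-specificity={fpr:.2f}  sensitivity={tpr:.2f}")
```

A method whose curve lies higher at every value of 1-specificity detects more of the truly changed gene sets for the same false positive rate, which is the sense in which the caption compares the methods.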
