BMC Bioinformatics. 2014 Jun 28;15:226.
doi: 10.1186/1471-2105-15-226.

Meta-analysis based on weighted ordered P-values for genomic data with heterogeneity

Yihan Li et al. BMC Bioinformatics. 2014.

Abstract

Background: Meta-analysis has become increasingly popular in recent years, especially in genomic data analysis, owing to the rapid growth of available data and of studies that target the same questions. Many methods have been developed, including classical ones such as Fisher's combined probability test and Stouffer's Z-test. However, not all meta-analyses have the same goal. Some aim to combine information to find signals in at least one of the studies, while others aim to find signals that are consistent across the studies. Many classical meta-analysis methods were developed with the former goal in mind, but the latter goal is far more practical for genomic data analysis.
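
For readers unfamiliar with the two classical procedures named above, the following is a minimal sketch of their standard textbook forms, not code from the paper. The function names `fisher_combined` and `stouffer_z` are hypothetical, the Stouffer version is the unweighted one-sided variant, and both assume independent p-values strictly between 0 and 1.

```python
import math
from statistics import NormalDist


def fisher_combined(pvalues):
    """Fisher's combined probability test: returns (statistic, combined p-value)."""
    k = len(pvalues)
    stat = -2.0 * sum(math.log(p) for p in pvalues)
    # Under the null, stat follows a chi-square distribution with 2k degrees
    # of freedom. For even degrees of freedom the survival function is
    # exp(-x/2) * sum_{j=0}^{k-1} (x/2)^j / j!, so no external library is needed.
    half = stat / 2.0
    term, total = 1.0, 1.0
    for j in range(1, k):
        term *= half / j
        total += term
    return stat, math.exp(-half) * total


def stouffer_z(pvalues):
    """Unweighted one-sided Stouffer Z-test: returns (z, combined p-value)."""
    nd = NormalDist()
    # Convert each p-value to a z-score, then average on the sqrt(k) scale.
    z = sum(nd.inv_cdf(1.0 - p) for p in pvalues) / math.sqrt(len(pvalues))
    return z, 1.0 - nd.cdf(z)
```

Calling `fisher_combined([1e-10, 0.9, 0.9, 0.9, 0.9])` returns a very small combined p-value even though only one study shows any signal, which illustrates why such tests suit the "at least one study" goal rather than the "consistent signal" goal.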

Results: In this paper, we propose a class of meta-analysis methods based on summaries of weighted ordered p-values (WOP) that aim at detecting significance in a majority of studies. We consider weighted versions of classical procedures such as Fisher's method and Stouffer's method, where the weight for each p-value is based on its order among the studies. In particular, we consider weights based on the binomial distribution, where the median of the p-values is weighted highest and the outlying p-values are down-weighted. We investigate the properties of our methods and demonstrate their strengths through simulation studies, comparing them to existing procedures. In addition, we illustrate the application of the proposed methodology with several meta-analyses of gene expression data.
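
The idea of a binomial-weighted Fisher statistic on ordered p-values can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation. The weights are taken as the B(K-1, 0.5) probability mass function, which matches the B(8, 0.5) reference scheme described for K=9 in Figure 1. The null distribution is approximated here by Monte Carlo simulation of independent uniform p-values, a stand-in technique chosen for simplicity; the abstract does not say how the paper obtains it. All function names are hypothetical.

```python
import math
import random


def binomial_weights(k):
    """Weights for the k ordered p-values from the B(k-1, 0.5) pmf (assumed form)."""
    return [math.comb(k - 1, i) * 0.5 ** (k - 1) for i in range(k)]


def weighted_ordered_fisher(pvalues, weights):
    """Fisher-type statistic with weights attached to the sorted p-values."""
    ordered = sorted(pvalues)  # smallest p-value first
    return -2.0 * sum(w * math.log(p) for w, p in zip(weights, ordered))


def monte_carlo_pvalue(pvalues, weights, n_sim=20000, seed=0):
    """Approximate the null p-value by simulating independent uniform p-values."""
    rng = random.Random(seed)
    k = len(pvalues)
    observed = weighted_ordered_fisher(pvalues, weights)
    exceed = 0
    for _ in range(n_sim):
        # 1 - random() lies in (0, 1], so log() never receives zero.
        null_p = [1.0 - rng.random() for _ in range(k)]
        if weighted_ordered_fisher(null_p, weights) >= observed:
            exceed += 1
    # Add-one correction keeps the estimate strictly positive.
    return (exceed + 1) / (n_sim + 1)
```

With `w = binomial_weights(9)`, the middle (fifth) ordered p-value receives the largest weight and the two extremes the smallest, so a single very small p-value among otherwise large ones contributes little to the statistic, while nine moderately small p-values produce a large one.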

Conclusions: Our proposed weighted ordered p-value (WOP) methods performed better than existing methods for testing the hypothesis that there is signal in the majority of studies. They also appeared to be much more robust in applications than the rth ordered p-value (rOP) method (Song and Tseng, Ann. Appl. Stat. 2014, 8(2):777-800). With the flexibility to incorporate different p-value combination methods and different weighting schemes, the WOP methods have great potential for detecting consistent signals in meta-analysis with heterogeneity.


Figures

Figure 1
Illustration of weighting schemes. Intuition behind the three weighting schemes wb1, wb2 and wb3 for m<rK. The plot reflects an example of K=9. The weighting scheme wb for testing HS5 is plotted as a reference. The other three weighting schemes are for testing HS7. From the plot we can easily see that wb1 is a direct shift of wb, which is based on the distribution of B(8,0.5). wb2 is based on the distribution of B(12,0.5), and wb3 is based on the distribution of B(4,0.5).
Figure 2
Comparison of power. Power comparison for the different methods. The x-axis denotes the 8 categories of genes, categorized by the number of studies in which the genes are differentially expressed. There are 1650 genes in category 0 (no differential expression in any study). The remaining categories contain 50 genes each. For each category, the proportion of genes found significant within that category is plotted for each method.
Figure 3
ROC curves. ROC curves for the different methods. Rejections of genes differentially expressed in fewer than 4 studies are considered false positives. Rejections of genes differentially expressed in 4 or more studies are considered true positives.
Figure 4
Venn diagram for stem cell studies. Venn diagram for the probesets found significant by the binomial weighted Fisher’s method, the half-binomial weighted Fisher’s method and the rOP method.
Figure 5
P-value patterns for different methods. Pattern of the original ordered p-values from the 9 studies for probesets detected by one of the three methods only. The x-axis is the order of the p-values from the 9 studies. The y-axis is the p-values. The plot includes a random subset of 20 probesets that are detected exclusively by each of the three methods.
Figure 6
Venn diagram for MDD studies. Venn diagram for the genes found significant in the meta-analysis of the MDD studies by the rOP method based on r=m, the rOP method based on selected r, and the half-binomial weighted Fisher’s method. In this case, m=5 and the selected r=7.

References

    1. Song C, Tseng GC. Hypothesis setting and order statistic for robust genomic meta-analysis. Ann Appl Stat. 2014;8:777–800.
    2. Fisher RA. Statistical methods for research workers. Edinburgh: Oliver and Boyd; 1925.
    3. Stouffer SA, Suchman EA, Devinney LC, Star SA, Williams RM. The American soldier: adjustment during army life. Princeton, NJ: Princeton University Press; 1949.
    4. Choi JK, Yu U, Kim S, Yoo OJ. Combining multiple microarray studies and modeling interstudy variation. Bioinformatics. 2003;19:84–90.
    5. Wilkinson B. A statistical consideration in psychological research. Psychol Bull. 1951;48:156–158.
