Biometrics. 2021 Jun;77(2):622-633.
doi: 10.1111/biom.13314. Epub 2020 Jun 28.

Two-group Poisson-Dirichlet mixtures for multiple testing

Francesco Denti et al. Biometrics. 2021 Jun.

Abstract

The simultaneous testing of multiple hypotheses is common to the analysis of high-dimensional data sets. The two-group model, first proposed by Efron, identifies significant comparisons by allocating observations to a mixture of an empirical null and an alternative distribution. In the Bayesian nonparametrics literature, many approaches have suggested using mixtures of Dirichlet Processes in the two-group model framework. Here, we investigate employing mixtures of two-parameter Poisson-Dirichlet Processes instead, and show how they provide a more flexible and effective tool for large-scale hypothesis testing. Our model further employs nonlocal prior densities to allow separation between the two mixture components. We obtain a closed-form expression for the exchangeable partition probability function of the two-group model, which leads to a straightforward Markov Chain Monte Carlo implementation. We compare the performance of our method for large-scale inference in a simulation study and illustrate its use on both a prostate cancer data set and a case-control microbiome study of the gastrointestinal tracts in children from underdeveloped countries who have been recently diagnosed with moderate-to-severe diarrhea.

Keywords: Bayesian nonparametrics; Poisson-Dirichlet process; microbiome analysis; multiple testing; two-group model.
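The two-group model mentioned in the abstract treats each z-score as a draw from a mixture of a null and an alternative density, and flags a comparison when its posterior probability of belonging to the alternative is high. The sketch below illustrates only that allocation step. It is not the authors' Poisson-Dirichlet mixture: the standard normal null, the wider normal alternative, and the names `posterior_nonnull`, `p1`, and `alt_sd` are all assumptions made for illustration.

```python
import math


def normal_pdf(z, mean=0.0, sd=1.0):
    """Density of a normal distribution evaluated at z."""
    u = (z - mean) / sd
    return math.exp(-0.5 * u * u) / (sd * math.sqrt(2.0 * math.pi))


def posterior_nonnull(z, p1=0.1, alt_sd=3.0):
    """P(gamma_i = 1 | z_i) under a toy two-group mixture.

    Assumptions (not from the paper): the null is N(0, 1), the
    alternative is N(0, alt_sd^2), and the prior probability of the
    alternative, p1, is known.
    """
    f0 = normal_pdf(z)              # null density
    f1 = normal_pdf(z, sd=alt_sd)   # alternative density
    weighted_alt = p1 * f1
    return weighted_alt / ((1.0 - p1) * f0 + weighted_alt)


if __name__ == "__main__":
    for z in (0.0, 2.0, 5.0):
        print(z, round(posterior_nonnull(z), 4))
```

Under these assumed components, z-scores near zero receive a posterior probability below the prior weight `p1`, while extreme z-scores approach one. The paper's model replaces the fixed densities with nonparametric mixtures and nonlocal priors.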

Figures

FIGURE 1
Microbiome data case study: Histogram of 535 z-scores obtained from the case term (β_1) in the Negative-Binomial generalized linear mixed effects model. We superimpose the posterior probabilities of the events {γ_i = 1 | z} and the threshold corresponding to a Bayesian FDR of 1%.
FIGURE 2
Prostate data set: Histogram of 6033 z-scores obtained from a two-group comparison. We superimpose the posterior probabilities of the events {γ_i = 1 | z} and the threshold corresponding to a Bayesian FDR of 20%.
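Both captions refer to a threshold on the posterior probabilities chosen to meet a target Bayesian FDR. One common way to obtain such a threshold is to rank the posterior probabilities and flag the largest set whose average posterior null probability stays at or below the target. The sketch below follows that rule. Whether the authors use exactly this rule is an assumption, and the function name `bayesian_fdr_threshold` and the example probabilities are hypothetical.

```python
def bayesian_fdr_threshold(post_probs, alpha):
    """Return (threshold, n_flagged) for a target Bayesian FDR alpha.

    post_probs holds P(gamma_i = 1 | z) for each comparison. The rule
    flags the k largest probabilities, with k as large as possible
    while the mean of (1 - p) over the flagged set is <= alpha.
    Returns (None, 0) when no comparison can be flagged. With tied
    probabilities, flag the top n_flagged ranks rather than every
    value >= threshold.
    """
    ranked = sorted(post_probs, reverse=True)
    expected_false = 0.0   # running sum of posterior null probabilities
    threshold, n_flagged = None, 0
    for k, p in enumerate(ranked, start=1):
        expected_false += 1.0 - p
        if expected_false / k <= alpha:
            threshold, n_flagged = p, k
        else:
            # The running mean is non-decreasing in k, so stop here.
            break
    return threshold, n_flagged


if __name__ == "__main__":
    probs = [0.99, 0.10, 0.95, 0.60]   # made-up values for illustration
    print(bayesian_fdr_threshold(probs, alpha=0.05))
```

Because the probabilities are sorted in decreasing order, the running average of `1 - p` can only grow, which is why the loop can stop at the first violation.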
