Biometrika. 2021 Sep;108(3):575-590.
doi: 10.1093/biomet/asaa086. Epub 2020 Oct 14.

Hypotheses on a tree: new error rates and testing strategies

Marina Bogomolov et al. Biometrika. 2021 Sep.

Abstract

We introduce a multiple testing procedure that controls global error rates at multiple levels of resolution. Conceptually, we frame this problem as the selection of hypotheses that are organized hierarchically in a tree structure. We describe a fast algorithm and prove that it controls relevant error rates given certain assumptions on the dependence between the p-values. Through simulations, we demonstrate that the proposed procedure provides the desired guarantees under a range of dependency structures and that it has the potential to gain power over alternative methods. Finally, we apply the method to studies on the genetic regulation of gene expression across multiple tissues and on the relation between the gut microbiome and colorectal cancer.

Keywords: False discovery rate; Hierarchical testing; Multiple testing; Selective inference.

Figures

Fig. 1.
Hierarchical structure of hypotheses in a four-level tree. Circles represent true null hypotheses, while squares denote false nulls. Children of the same parent constitute a family of hypotheses. To give an example of the sequential order of testing, nodes corresponding to tested hypotheses are unfilled, while grey nodes indicate hypotheses that are not tested. A red border distinguishes rejected hypotheses. Tested families are enclosed within dashed borders, with some labelled as ij to illustrate the notation.
Fig. 2.
Illustration of the bottom-up calculation of the proposed error rate for level 4, sfdr4, using the same configuration of hypotheses as in Fig. 1. The error measure 𝒠j(4) is defined for rejected hypotheses and indicated by the red number in the node corresponding to each rejection. The hypotheses not distinguished by red borders are not rejected and so do not receive any error measure. If the rejections are nodes at the level of interest, which is level 4 in this illustration, the error measure is 1 for an incorrect rejection and 0 otherwise. For a node at a higher level, the error measure is the average of the error measures assigned to its children if it has one or more rejected child hypotheses and is 0 otherwise.
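The bottom-up rule in the Fig. 2 caption can be sketched in a few lines of Python. This is only an illustration of the caption's wording, not the authors' implementation: the `Node` class, its field names and the function `error_measure` are hypothetical, and the sketch assumes the tree is truncated at the level of interest and that the average is taken over the rejected children only (the caption notes that non-rejected hypotheses receive no error measure).

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """Hypothetical tree node; field names are illustrative assumptions."""
    level: int                      # depth of the hypothesis in the tree
    rejected: bool                  # whether the hypothesis was rejected
    true_null: bool = False         # whether the null hypothesis is actually true
    children: List["Node"] = field(default_factory=list)


def error_measure(node: Node, target_level: int) -> Optional[float]:
    """Error measure of a node for the given level of interest.

    Returns None for non-rejected nodes, which receive no error measure.
    """
    if not node.rejected:
        return None
    if node.level == target_level:
        # At the level of interest: 1 for an incorrect rejection, 0 otherwise.
        return 1.0 if node.true_null else 0.0
    # Higher level: average over rejected children (assumption), 0 if none.
    measures = [error_measure(c, target_level) for c in node.children if c.rejected]
    if not measures:
        return 0.0
    return sum(measures) / len(measures)


# Small three-level example with level 3 as the level of interest.
a = Node(level=2, rejected=True, children=[
    Node(level=3, rejected=True, true_null=True),    # incorrect rejection
    Node(level=3, rejected=True, true_null=False),   # correct rejection
    Node(level=3, rejected=False, true_null=True),   # not rejected, ignored
])
b = Node(level=2, rejected=True, children=[
    Node(level=3, rejected=False, true_null=True),
])
c = Node(level=2, rejected=False)
root = Node(level=1, rejected=True, children=[a, b, c])

print(error_measure(a, 3), error_measure(b, 3), error_measure(root, 3))
```

In this toy tree, node `a` has one incorrect and one correct rejected child, node `b` has no rejected children, and the root averages the measures of its two rejected children.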
Fig. 3.
Results for the example. Each point corresponds to the average of 1000 realizations. Dashed horizontal lines indicate the target values for the error rates. The methods under comparison are the Benjamini–Hochberg procedure (orange diamonds), the Benjamini–Bogomolov method (red squares), the nonhierarchical version of the p-filter (pink circles), the hierarchical version of the p-filter (purple circles) and TreeBH (blue triangles).
Fig. 4.
Taxonomic tree of selections obtained using the TreeBH procedure. Additional discoveries of TreeBH that were not found with the Benjamini–Hochberg procedure are marked in red.

References

    1. Benjamini Y & Bogomolov M (2014). Selective inference on multiple families of hypotheses. J. R. Statist. Soc. B 76, 297–318.
    2. Benjamini Y & Heller R (2007). False discovery rates for spatial signals. J. Am. Statist. Assoc. 102, 1272–81.
    3. Benjamini Y & Hochberg Y (1995). Controlling the false discovery rate: A practical and powerful approach to multiple testing. J. R. Statist. Soc. B 57, 289–300.
    4. Benjamini Y & Yekutieli D (2001). The control of the false discovery rate in multiple testing under dependency. Ann. Statist. 29, 1165–88.
    5. Brzyski D, Peterson CB, Sobczyk P, Candes EJ, Bogdan M & Sabatti C (2017). Controlling the rate of GWAS false discoveries. Genetics 205, 61–75.