Meta-Analysis
Hum Brain Mapp. 2022 Sep;43(13):3987-3997.
doi: 10.1002/hbm.25898. Epub 2022 May 10.

Evaluation of thresholding methods for activation likelihood estimation meta-analysis via large-scale simulations

Lennart Frahm et al. Hum Brain Mapp. 2022 Sep.

Abstract

In recent neuroimaging studies, threshold-free cluster enhancement (TFCE) gained popularity as a sophisticated thresholding method for statistical inference. It was shown to feature higher sensitivity than the frequently used approach of controlling the cluster-level family-wise error (cFWE) and it does not require setting a cluster-forming threshold at voxel level. Here, we examined the applicability of TFCE to a widely used method for coordinate-based neuroimaging meta-analysis, Activation Likelihood Estimation (ALE), by means of large-scale simulations. We created over 200,000 artificial meta-analysis datasets by independently varying the total number of experiments included and the amount of spatial convergence across experiments. Next, we applied ALE to all datasets and compared the performance of TFCE to both voxel-level and cluster-level FWE correction approaches. All three multiple-comparison correction methods yielded valid results, with only about 5% of the significant clusters being based on spurious convergence, which corresponds to the nominal level the methods were controlling for. On average, TFCE's sensitivity was comparable to that of cFWE correction, but it was slightly worse for a subset of parameter combinations, even after TFCE parameter optimization. cFWE yielded the largest significant clusters, closely followed by TFCE, while voxel-level FWE correction yielded substantially smaller clusters, showcasing its high spatial specificity. Given that TFCE does not outperform the standard cFWE correction but is computationally much more expensive, we conclude that employing TFCE for ALE cannot be recommended to the general user.
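The simulation design described above can be sketched in a few lines. This is an illustrative reconstruction, not the authors' code: the foci count, the gray-matter bounding box, the jitter, and the assumption that the 11 convergence levels per dataset size run from 0 to 10 activating experiments are all stand-ins.

```python
import numpy as np

rng = np.random.default_rng(42)

def simulate_dataset(n_total, n_activating, n_foci=8,
                     target=(0.0, 0.0, 0.0), jitter_sd=2.0):
    """Create one artificial meta-analysis dataset (illustrative only).

    Every experiment reports random noise foci; the first n_activating
    experiments additionally place one focus near the target location.
    """
    experiments = []
    for i in range(n_total):
        # Noise foci sampled uniformly from a stand-in gray-matter box.
        foci = rng.uniform(-60.0, 60.0, size=(n_foci, 3))
        if i < n_activating:
            foci[0] = np.asarray(target) + rng.normal(0.0, jitter_sd, size=3)
        experiments.append(foci)
    return experiments

# 31 dataset sizes (15-45 experiments) x 11 assumed convergence levels
# reproduces the 341 parameter combinations reported in the paper.
grid = [(n, k) for n in range(15, 46) for k in range(11)]
```

With 500 iterations per combination, as in the study, this grid yields 341 x 500 = 170,500 simulated meta-analyses.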

Keywords: FWE; family-wise error; multiple comparison correction; neuroimaging meta-analysis; significance thresholding; threshold-free cluster enhancement; cluster extent.


Conflict of interest statement

The authors declare no potential conflict of interest.

Figures

FIGURE 1
Simulation of an experiment. Two independent draws from the filtered BrainMap database were used to determine the sample size and the number of foci reported by the experiment. Next, we sampled the corresponding number of coordinates from a lenient gray‐matter mask. Last, the first coordinate was replaced by the true coordinate multiplied by a displacement factor; this last step was applied only if the experiment was one activating the target location.
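The per-experiment procedure in this caption can be sketched as follows. The two empirical distributions are stand-in ranges here (the study draws both values from the filtered BrainMap database), and the displacement-factor range is likewise an assumption.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-ins for the two empirical BrainMap distributions (assumed ranges).
SAMPLE_SIZES = np.arange(10, 51)   # subjects per experiment
FOCI_COUNTS = np.arange(1, 21)     # reported foci per experiment

def simulate_experiment(activates_target, true_coord, mask_coords):
    """Simulate one experiment following the steps in the caption."""
    n_subjects = int(rng.choice(SAMPLE_SIZES))   # draw 1: sample size
    n_foci = int(rng.choice(FOCI_COUNTS))        # draw 2: number of foci
    # Sample that many coordinates from the gray-matter mask.
    foci = mask_coords[rng.choice(len(mask_coords), size=n_foci)].astype(float)
    if activates_target:
        # Replace the first coordinate with the displaced true coordinate
        # (factor range assumed for illustration).
        foci[0] = np.asarray(true_coord) * rng.uniform(0.9, 1.1)
    return n_subjects, foci
```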
FIGURE 2
Behavior of ALE scores and the corresponding p‐values under the different levels of the two simulation parameters (number of experiments and number of experiments activating the target location) and their 341 combinations. The total number of experiments included in the ALE analysis is color‐coded in a spectral sequence from 15 experiments (purple) to 45 experiments (red). (a) Average ALE score (over 500 iterations) at the ground‐truth location. ALE scores increased linearly as a function of the number of experiments activating the target location, but also with the total number of experiments, due to the increased chance of (positive) interference by noise foci. (b) Average p‐value (over 500 iterations) at the ground‐truth location. p‐values decreased with a higher number of experiments activating the target location but increased with the total number of experiments because of a rightward shift of the null distribution. (c) ALE scores versus p‐values at the ground‐truth location for all 170,500 simulations. The more experiments were included in an ALE analysis, the more convergence (i.e., a higher ALE score) was needed to obtain the same p‐value.
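The behavior in panels (a) and (c) follows directly from how ALE combines per-experiment modeled-activation (MA) values at each voxel: the ALE score is the union probability 1 - prod(1 - MA_i). A minimal single-voxel sketch:

```python
import numpy as np

def ale_score(ma_values):
    """Union of per-experiment modeled-activation (MA) probabilities
    at a single voxel: ALE = 1 - prod(1 - MA_i)."""
    ma = np.asarray(ma_values, dtype=float)
    return 1.0 - np.prod(1.0 - ma)

# Adding an experiment with any nonzero MA value raises the score, which
# is why the null distribution (and hence the p-value for a fixed score)
# shifts as the total number of experiments grows.
print(ale_score([0.2, 0.3]))       # ≈ 0.44
print(ale_score([0.2, 0.3, 0.1]))  # ≈ 0.496
```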
FIGURE 3
(a–c) Sensitivity of ALE when applying different multiple‐comparison correction methods for statistical inference. The number of experiments activating the target location is represented on the x‐axis, while each total number of experiments has its own curve in the graph, following a spectral color sequence (15—purple; 45—red). The curves show the average sensitivity over the 500 iterations of each parameter combination. For all three methods, sensitivity increased in an approximately sigmoid fashion as a function of the number of experiments activating the target location. Additionally, larger datasets required more experiments activating the target location to achieve the same sensitivity. (d) Zooming in on the difference in sensitivity between cFWE correction and TFCE: the differences for individual dataset sizes are displayed in gray and the average over all dataset sizes in red. cFWE correction performed better on average, especially between 4 and 8 experiments activating the target location. For a few dataset sizes, TFCE had a slight sensitivity advantage at 3–4 experiments activating the target location.
FIGURE 4
(a–c) Cluster size of statistically significant areas of convergence that include at least one voxel within a 4‐mm radius around the true location, under the different levels of the two simulation parameters (number of experiments and number of experiments activating the target location) and their 341 combinations. The number of experiments activating the target location was strongly positively correlated with cluster size, while the total number of experiments showed a negative correlation. cFWE correction featured the largest clusters, closely followed by TFCE. The clusters declared significant by vFWE correction were exceedingly small in comparison. (d) Zooming in on the difference in cluster sizes between cFWE and TFCE corrections: the difference became more pronounced with fewer experiments activating the target location, because cFWE correction can only ever declare relatively large clusters significant, while TFCE can potentially yield single significant voxels.
FIGURE 5
The likelihood of additional significant clusters as a function of the number of experiments activating the target location, averaged across the total number of experiments (blue line). As can be seen, all three multiple‐comparison correction methods largely succeeded in controlling the alpha error at the nominal level of .05.
FIGURE 6
(a) Sensitivity of ALE in a large‐scale meta‐analysis setting when applying different multiple‐comparison correction methods for statistical inference. The general trend observed in the main simulations held for large‐scale datasets as well: sensitivity increased as a sigmoid function of the number of experiments activating the target location, and the lower the proportion of experiments activating the target location relative to the total number of experiments, the lower the sensitivity. Lower right: Zooming in on the difference between cFWE and TFCE corrections: even though TFCE performed slightly better at 7 and 8 experiments activating the target location for datasets including 150 studies, cFWE showed higher sensitivity on average. (b) Sensitivity of ALE corrected with TFCE using different levels of the E and H exponents, for a dataset of n = 30 experiments. The standard setting (indicated in red), described in the literature as fixed, is H = 2 and E = 0.5. We used combinations of H = [1.8, 2.0, 2.2] with E = [0.3, 0.5, 0.7] (indicated in gray) to see whether other values would improve performance. Overall, the standard parameter setting performed best or at least on par with the other settings.
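The E and H parameters tuned in panel (b) come from the standard TFCE transform, in which each voxel accumulates e(h)^E * h^H * dh over every threshold h at which it is suprathreshold, where e(h) is the extent of its connected suprathreshold cluster at h. A minimal 1-D sketch (function name, step size, and input are illustrative; real implementations work on 3-D images with 3-D connectivity):

```python
import numpy as np

def tfce_1d(stat, dh=0.1, E=0.5, H=2.0):
    """Threshold-free cluster enhancement for a 1-D statistic map (sketch).

    For each threshold h, every suprathreshold point is credited with
    extent**E * h**H * dh, where extent is the length of the connected
    suprathreshold run containing it. H=2, E=0.5 is the standard setting.
    """
    stat = np.asarray(stat, dtype=float)
    out = np.zeros_like(stat)
    for h in np.arange(dh, stat.max() + dh, dh):
        above = stat >= h
        start = None
        # Scan for connected runs of suprathreshold points; the appended
        # False closes a run that reaches the end of the array.
        for i, flag in enumerate(np.append(above, False)):
            if flag and start is None:
                start = i
            elif not flag and start is not None:
                extent = i - start
                out[start:i] += extent**E * h**H * dh
                start = None
    return out
```

Because high peaks dominate via h**H while broad low-level support still contributes via extent**E, no voxel-level cluster-forming threshold is needed.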
FIGURE 7
Computation time required for a single null permutation of each multiple‐comparison correction method. Times were measured for 50 datasets per dataset size (15–45), totaling 1,550 time points. vFWE and cFWE corrections ran almost equally fast, while TFCE took up to nine times as long as the other two methods.
