J Comput Graph Stat. 2024;33(2):463-476.

doi: 10.1080/10618600.2023.2270720. Epub 2023 Nov 27.

Accurate and Ultra-Efficient p-Value Calculation for Higher Criticism Tests

Wenjia Wang¹, Yusi Fang¹, Chung Chang², George C Tseng¹

PMID: 39211031
PMCID: PMC11350355
DOI: 10.1080/10618600.2023.2270720

Wenjia Wang et al. J Comput Graph Stat. 2024.

Affiliations

¹ Department of Biostatistics, University of Pittsburgh.
² Department of Applied Mathematics, National Sun Yat-sen University.

Abstract

In modern data science, the higher criticism (HC) method is effective for detecting rare and weak signals. Computation, however, has long been an issue when the number of combined p-values ($K$) and/or the number of repeated HC tests ($N$) is large. Several computing methods have been developed, but all have significant shortcomings, especially when a stringent significance level is required. In this paper, we propose an accurate and highly efficient computing strategy for four variations of HC. Specifically, we propose an unbiased cross-entropy-based importance sampling method (${IS}_{CE}$) to benchmark all existing computing methods, and develop a modified SetTest method (MST) that resolves numerical issues of the existing SetTest approach. We further develop an ultra-fast approach (UFI) that combines pre-calculated statistical tables with cubic spline interpolation. Finally, following extensive simulations, we provide a computing strategy that integrates MST, UFI, and other existing methods, together with the R package "HCp", for virtually any $K$ and small p-values ($\sim 10^{-20}$). The method is applied to a COVID-19 disease surveillance example for spatio-temporal outbreak detection from case numbers over 804 days in 3,342 counties in the United States. Results confirm the viability of the computing strategy for large-scale inference. Supplementary materials for this article are available online.

Keywords: analytical approximation; asymptotic rare and weak model; higher criticism; importance sampling; p-value computation.
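
As a rough illustration of why computation becomes the bottleneck, the sketch below computes an HC statistic and a naive Monte Carlo p-value. It assumes the standard Donoho-Jin form of the statistic, $\max_{k_0 \leq i \leq k_1} \sqrt{K}\,(i/K - p_{(i)}) / \sqrt{p_{(i)}(1 - p_{(i)})}$; the abstract does not define the four variations, so this form, the function names, and the `(hits + 1) / (M + 1)` estimator are assumptions for illustration and not the HCp package's API.

```python
import math
import random


def hc_statistic(pvalues, k0=1, k1=None):
    """HC statistic over the ordered p-values with ranks k0..k1 (1-based).

    Assumed standard form; the paper's four variations may differ in detail.
    """
    K = len(pvalues)
    k1 = K if k1 is None else k1
    if not 1 <= k0 <= k1 <= K:
        raise ValueError("need 1 <= k0 <= k1 <= K")
    ordered = sorted(pvalues)
    best = -math.inf
    for i in range(k0, k1 + 1):
        p = ordered[i - 1]
        if not 0.0 < p < 1.0:
            continue  # skip degenerate p-values where the denominator vanishes
        z = math.sqrt(K) * (i / K - p) / math.sqrt(p * (1.0 - p))
        best = max(best, z)
    return best


def naive_mc_pvalue(observed, K, k0=1, k1=None, M=10_000, seed=0):
    """Naive Monte Carlo p-value: simulate K uniform p-values M times.

    The smallest value this can return is 1 / (M + 1), so resolving a
    p-value near 1e-20 this way would need on the order of 1e20 draws.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(M):
        null_p = [rng.random() for _ in range(K)]
        if hc_statistic(null_p, k0, k1) >= observed:
            hits += 1
    return (hits + 1) / (M + 1)
```

Each replicate costs a sort of $K$ values, and the whole simulation is repeated for each of the $N$ tests, which is the cost that analytic, importance sampling, and table-based approaches aim to avoid.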

Figures

**Fig. 1**
(A) Left panel: for $T_{HC}^{R_{T}}$ of arbitrary truncation, the MST is recommended when $K \leq 1000$; when $K > 1000$, a hybrid is recommended that uses the MST for targeted p-values $\geq 10^{-3}$ and Li-Siegmund for p-values $< 10^{-3}$. Right panel: for $T_{HC}^{R_{TM}}$ of arbitrary truncation, the MST is recommended when $K \leq 100$; when $K > 100$, a hybrid is recommended that uses naive Monte Carlo for targeted p-values $\geq 10^{-3}$ and Li-Siegmund for p-values $< 10^{-3}$. (B) In our "HCp" R package, statistical tables are pre-calculated for $K \leq 2000$ and the following HC tests, for which the UFI method is applicable and preferred: (i) $T_{HC}^{R_{F}}$ and (ii) $T_{HC}^{R_{T}}$ with $k_{0} = 1, k_{1} = \lfloor K/2 \rfloor$, for target p-values $> 10^{-14}$ (left panel); and (iii) $T_{HC}^{R_{M}}$ and (iv) $T_{HC}^{R_{TM}}$ with $k_{0} = 1, k_{1} = \lfloor K/2 \rfloor$, for target p-values $> 10^{-12}$ (right panel).
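
The recommendations in this caption can be read as a small decision rule. The sketch below is one possible encoding, not code from the HCp package: the function name, the family labels `"HC"` (for $R_F$ / $R_T$) and `"MHC"` (for $R_M$ / $R_{TM}$), and the `tabulated_range` flag are hypothetical. It also assumes that the $K \leq 2000$ table limit applies to both panels of (B), and that a tabulated test falling outside the table range falls back to the panel (A) rules.

```python
def recommend_method(family, K, target_p, tabulated_range=True):
    """Pick a p-value method following the Fig. 1 caption (illustrative only).

    family          -- "HC" for R_F / R_T, "MHC" for R_M / R_TM (hypothetical labels)
    K               -- number of combined p-values
    target_p        -- order of magnitude of the p-value to be computed
    tabulated_range -- True if the test is one of the four tabulated cases
                       (full range, or k0 = 1 and k1 = floor(K / 2))
    """
    if family not in ("HC", "MHC"):
        raise ValueError("family must be 'HC' or 'MHC'")

    # Panel (B): pre-calculated tables, where UFI is applicable and preferred.
    table_floor = 1e-14 if family == "HC" else 1e-12
    if tabulated_range and K <= 2000 and target_p > table_floor:
        return "UFI"

    # Panel (A): arbitrary truncation (assumed fallback for everything else).
    mst_limit = 1000 if family == "HC" else 100
    if K <= mst_limit:
        return "MST"
    if target_p < 1e-3:
        return "Li-Siegmund"
    return "MST" if family == "HC" else "naive Monte Carlo"
```

For example, `recommend_method("MHC", 500, 1e-2, tabulated_range=False)` returns `"naive Monte Carlo"`, while the same test with `target_p=1e-6` returns `"Li-Siegmund"`.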

**Fig. 2**
(A) illustrates the accuracy of the Barnett-Lin, SetTest, and MST methods in computing small p-values of $T_{HC}^{R_{F}}$ with $K = 500$, benchmarked by the ${IS}_{CE}$ method (methods prefixed with * in the legend are our proposed methods). The x-axis is the $\log$ of the $T_{HC}^{R_{F}}$ statistic, and the y-axis is the corresponding $-\log_{10}$ p-value. Similarly, (B) evaluates the accuracy of the SetTest and MST methods, benchmarked by the ${IS}_{CE}$ method, for case (ii) $T_{HC}^{R_{T}}$ with $K = 500$. (C) and (D) show the accuracy of the MST, benchmarked by the ${IS}_{CE}$ method, for $T_{HC}^{R_{M}}$ and case (iv) $T_{HC}^{R_{TM}}$, respectively, with $K = 100$. The sample size for ${IS}_{CE}$ is $M = 10^{4}$.

**Fig. 3**
(A) and (C) illustrate the performance of the Li-Siegmund, MST, and naive Monte Carlo methods with sample size $M = 10^{6}$ in estimating large p-values for $T_{HC}^{R_{M}}$ and case (iv) $T_{HC}^{R_{TM}}$, respectively, with $K = 100$. The error bar of the naive Monte Carlo estimate represents the root mean square error on the original scale over 50 replications. The y-axis is the p-value on the original scale. (B) and (D) illustrate the performance of the Li-Siegmund, MST, and ${IS}_{CE}$ methods with sample size $M = 10^{6}$ in estimating small p-values for $T_{HC}^{R_{M}}$ and case (iv) $T_{HC}^{R_{TM}}$, respectively, with $K = 100$. The error bar of the ${IS}_{CE}$ estimate is the root mean square error on the $\log_{10}$ scale over 50 replications, and the y-axis represents the $-\log_{10}$ p-value.

**Fig. 4**
(A) compares the time (in seconds) needed to compute the p-values of 100 different $T_{HC}^{R_{F}}$ tests with the Barnett-Lin, SetTest, Li-Siegmund, ${IS}_{CE}$, MST, and UFI methods for varying $K$. The formula of the fitted polynomial curve for each method is labeled in the figure. (B) removes the Barnett-Lin method and compares the other five methods. (C) illustrates the agreement between the UFI p-value estimates and the analytic truth from the MST method for $T_{HC}^{R_{F}}$ at various $K$. The black straight line $y = x$ is the reference on which the UFI estimates equal the analytic truth exactly.
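
The table-plus-interpolation idea behind UFI can be sketched on a toy case. The example below is not the HCp tables: it tabulates the closed-form null tail for $K = 1$, where the standard HC statistic reduces to $\sqrt{(1 - p)/p}$ and so $P(T \geq t) = 1/(1 + t^{2})$, and then interpolates $-\log_{10}$ p-value against the log statistic. The grid, the function names, and the natural boundary conditions are assumptions; the source states only that cubic spline interpolation is used.

```python
import bisect
import math


def natural_cubic_spline(xs, ys):
    """Return a function evaluating the natural cubic spline through (xs, ys)."""
    n = len(xs) - 1
    h = [xs[i + 1] - xs[i] for i in range(n)]
    # Tridiagonal system for the second derivatives m[0..n]; m[0] = m[n] = 0.
    lower, upper, rhs = [0.0] * (n + 1), [0.0] * (n + 1), [0.0] * (n + 1)
    diag = [1.0] * (n + 1)
    for i in range(1, n):
        lower[i], upper[i] = h[i - 1], h[i]
        diag[i] = 2.0 * (h[i - 1] + h[i])
        rhs[i] = 6.0 * ((ys[i + 1] - ys[i]) / h[i] - (ys[i] - ys[i - 1]) / h[i - 1])
    # Thomas algorithm: forward elimination, then back substitution.
    for i in range(1, n + 1):
        w = lower[i] / diag[i - 1]
        diag[i] -= w * upper[i - 1]
        rhs[i] -= w * rhs[i - 1]
    m = [0.0] * (n + 1)
    m[n] = rhs[n] / diag[n]
    for i in range(n - 1, -1, -1):
        m[i] = (rhs[i] - upper[i] * m[i + 1]) / diag[i]

    def evaluate(x):
        if not xs[0] <= x <= xs[-1]:
            raise ValueError("x is outside the tabulated range")
        i = min(max(bisect.bisect_right(xs, x) - 1, 0), n - 1)
        a, b = xs[i + 1] - x, x - xs[i]
        return (m[i] * a ** 3 + m[i + 1] * b ** 3) / (6.0 * h[i]) \
            + (ys[i] / h[i] - m[i] * h[i] / 6.0) * a \
            + (ys[i + 1] / h[i] - m[i + 1] * h[i] / 6.0) * b

    return evaluate


def build_toy_table(x_min=-4.0, x_max=16.0, step=0.25):
    """Toy table for K = 1: x = log(t), y = -log10 P(T >= t) = log10(1 + t^2)."""
    n = int(round((x_max - x_min) / step))
    xs = [x_min + i * step for i in range(n + 1)]
    ys = [math.log10(1.0 + math.exp(2.0 * x)) for x in xs]
    return xs, ys


def table_pvalue(t, spline):
    """Look up a p-value by interpolating -log10(p) at log(t)."""
    return 10.0 ** (-spline(math.log(t)))
```

Once the spline is built, each lookup is a bisection plus a handful of arithmetic operations, independent of how expensive the tabulated values were to compute. Interpolating on the $-\log_{10}$ scale keeps the relative error of very small p-values under control.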

**Fig. 5**
The maps show the significance level ($-\log_{10}$ p-value) of COVID-19 outbreak testing by the UFI method for each of the 3,342 counties in the US in three time periods: 03/22/2021-04/20/2021 (top); 11/17/2021-12/16/2021 (middle); 02/15/2022-03/16/2022 (bottom). More counties show significant COVID-19 outbreaks at the end of 2021 and from mid-February to mid-March 2022.

References

1. Barnett I, Mukherjee R, and Lin X. The generalized higher criticism for testing SNP-Set effects in genetic association studies. Journal of the American Statistical Association, 112, 06 2016. doi: 10.1080/01621459.2016.1192039.
2. Barnett IJ and Lin X. Analytical p-value calculation for the higher criticism test in finite-d problems. Biometrika, 101(4):964–970, 08 2014. ISSN 0006-3444. doi: 10.1093/biomet/asu033.
3. Berk R and Jones D. Goodness-of-fit test statistics that dominate the Kolmogorov statistics. Probability Theory and Related Fields, 47(1):47–59, Jan. 1979. ISSN 0178-8051. doi: 10.1007/BF00533250.
4. Cai TT and Wu Y. Optimal detection of sparse mixtures against a given null distribution. IEEE Transactions on Information Theory, 60(4):2217–2232, 2014. doi: 10.1109/TIT.2014.2304295.
5. Cayón L, Jin J, and Treaster A. Higher Criticism statistic: detecting and identifying non-Gaussianity in the WMAP first-year data. Monthly Notices of the Royal Astronomical Society, 362(3):826–832, 09 2005. ISSN 0035-8711. doi: 10.1111/j.1365-2966.2005.09277.x.

Grants and funding

R01 LM014142/LM/NLM NIH HHS/United States
