USP: an independence test that improves on Pearson's chi-squared and the G-test
- PMID: 35153605
- PMCID: PMC8652272
- DOI: 10.1098/rspa.2021.0549
Abstract
We present the U-statistic permutation (USP) test of independence in the context of discrete data displayed in a contingency table. Either Pearson's χ²-test of independence or the G-test is typically used for this task, but we argue that these tests have serious deficiencies, both in their inability to control the size of the test and in their power properties. By contrast, the USP test is guaranteed to control the size of the test at the nominal level for all sample sizes, has no issues with small (or zero) cell counts, and is able to detect distributions that violate independence in only a minimal way. The test statistic is derived from a U-statistic estimator of a natural population measure of dependence, and we prove that this is the unique minimum variance unbiased estimator of this population quantity. The practical utility of the USP test is demonstrated both on simulated data, where its power can be dramatically greater than that of Pearson's test, the G-test and Fisher's exact test, and on real data. The USP test is implemented in the R package USP.
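The size guarantee described in the abstract comes from the permutation step, which can be illustrated with a minimal sketch. The code below is not the USP test itself: the abstract does not give the U-statistic, so a simple plug-in statistic, the sum over cells of (p̂_ij − p̂_i+ p̂_+j)², is used as a stand-in. The function names `dependence_stat` and `permutation_pvalue`, the default of 999 permutations, and the seed are all assumptions made for illustration; the R package USP should be used for the actual test.

```python
import random


def dependence_stat(x, y):
    """Plug-in estimate of sum_ij (p_ij - p_i+ * p_+j)^2.

    A stand-in statistic for illustration only; it is NOT the
    U-statistic of the USP test, which the abstract does not spell out.
    """
    n = len(x)
    joint, row, col = {}, {}, {}
    for a, b in zip(x, y):
        joint[(a, b)] = joint.get((a, b), 0) + 1
        row[a] = row.get(a, 0) + 1
        col[b] = col.get(b, 0) + 1
    # Accumulate in exact integer arithmetic so that equal tables always
    # give equal statistics; divide by n^4 once at the end.
    total = sum((n * joint.get((a, b), 0) - row[a] * col[b]) ** 2
                for a in row for b in col)
    return total / n ** 4


def permutation_pvalue(x, y, n_perm=999, seed=0):
    """Permutation p-value for independence of paired labels x and y.

    Shuffling y breaks any dependence while keeping both margins fixed.
    The (1 + count) / (n_perm + 1) form never returns exactly zero.
    """
    rng = random.Random(seed)
    observed = dependence_stat(x, y)
    y_perm = list(y)
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(y_perm)
        if dependence_stat(x, y_perm) >= observed:
            exceed += 1
    return (1 + exceed) / (n_perm + 1)
```

Each observation is passed as a pair of category labels rather than as a table of counts, for example `permutation_pvalue(rows, cols)` with two equal-length lists. Because the reference distribution is built from the data's own margins, zero or small cell counts need no special handling in this sketch.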
Keywords:
Fisher’s exact test; G-test; Pearson’s χ²-test
© 2021 The Authors.