PLoS Comput Biol. 2021 Feb 26;17(2):e1008728.

doi: 10.1371/journal.pcbi.1008728. eCollection 2021 Feb.

Estimating the cumulative incidence of SARS-CoV-2 with imperfect serological tests: Exploiting cutoff-free approaches

Judith A Bouman¹, Julien Riou², Sebastian Bonhoeffer¹, Roland R Regoes¹

Affiliations

¹ Institute of Integrative Biology, ETH Zurich, Zurich, Switzerland.
² Institute of Social and Preventive Medicine (ISPM), University of Bern, Bern, Switzerland.

PMID: 33635863
PMCID: PMC7946301
DOI: 10.1371/journal.pcbi.1008728

Judith A Bouman et al. PLoS Comput Biol. 2021.

Abstract

Large-scale serological testing in the population is essential to determine the true extent of the current SARS-CoV-2 pandemic. Serological tests measure antibody responses against pathogens and use predefined cutoff levels to dichotomize the quantitative test measures into seropositives and seronegatives, which serve as a proxy for past infection. With the imperfect assays that are currently available to test for past SARS-CoV-2 infection, the fraction of seropositive individuals in serosurveys is a biased estimator of the cumulative incidence and is usually corrected to account for the sensitivity and specificity. Here we use an inference method, referred to as the mixture-model approach, for the estimation of the cumulative incidence. It does not require defining cutoffs because it integrates the quantitative test measures directly into the statistical inference procedure. We confirm that the mixture model outperforms the methods based on cutoffs, leading to less bias and error in estimates of the cumulative incidence. We illustrate how the mixture model can be used to optimize the design of serosurveys with imperfect serological tests. We also provide guidance on the number of control and case sera that are required to quantify the test's ambiguity sufficiently to enable the reliable estimation of the cumulative incidence. Lastly, we show how this approach can be used to estimate the cumulative incidence of classes of infections with an unknown distribution of quantitative test measures. This is a very promising application of the mixture-model approach that could identify the elusive fraction of asymptomatic SARS-CoV-2 infections. An R package implementing the inference methods used in this paper is provided. Our study advocates using serological tests without cutoffs, especially if they are used to determine parameters characterizing populations rather than individuals. This approach circumvents some of the shortcomings of cutoff-based methods at exactly the low cumulative incidence levels and test accuracies that we are currently facing in SARS-CoV-2 serosurveys.
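To make the contrast between the two families of estimators concrete, the following Python sketch compares a cutoff-based estimate with the standard sensitivity/specificity (Rogan-Gladen) correction against a cutoff-free mixture-model estimate. It is an illustration only and not the authors' R package: the normal distributions for control and case sera, their parameters, the cutoff of 2.0, and all function names are assumptions chosen for this example, and the mixture likelihood is maximized with a simple EM iteration rather than whatever optimizer the paper uses.

```python
import math
import random

def normal_pdf(x, mu, sigma):
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def normal_cdf(x, mu, sigma):
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def rogan_gladen(measures, cutoff, sensitivity, specificity):
    """Cutoff-based estimate: correct the seropositive fraction for test error."""
    p_obs = sum(x > cutoff for x in measures) / len(measures)
    est = (p_obs + specificity - 1.0) / (sensitivity + specificity - 1.0)
    return min(1.0, max(0.0, est))  # clip to the valid range [0, 1]

def mixture_estimate(measures, ctrl, case, n_iter=200):
    """Cutoff-free estimate: maximize sum(log(pi*f_case + (1-pi)*f_ctrl)) over pi."""
    f0 = [normal_pdf(x, *ctrl) for x in measures]  # control density at each measure
    f1 = [normal_pdf(x, *case) for x in measures]  # case density at each measure
    pi = 0.5
    for _ in range(n_iter):
        # E-step: probability that each serum comes from a past infection
        resp = [pi * b / (pi * b + (1.0 - pi) * a) for a, b in zip(f0, f1)]
        # M-step: the updated cumulative incidence is the mean responsibility
        pi = sum(resp) / len(resp)
    return pi

# Hypothetical test: unit-variance normals whose means differ by 2.77,
# which corresponds to an AUC-ROC of roughly 0.975.
rng = random.Random(1)
ctrl, case = (0.0, 1.0), (2.77, 1.0)
true_pi, n = 0.08, 10_000
survey = [rng.gauss(*case) if rng.random() < true_pi else rng.gauss(*ctrl)
          for _ in range(n)]

cutoff = 2.0  # arbitrary illustrative cutoff
sens = 1.0 - normal_cdf(cutoff, *case)
spec = normal_cdf(cutoff, *ctrl)
rg_est = rogan_gladen(survey, cutoff, sens, spec)
mix_est = mixture_estimate(survey, ctrl, case)
print(f"Rogan-Gladen: {rg_est:.4f}  mixture model: {mix_est:.4f}  truth: {true_pi}")
```

In this sketch the control and case densities are taken as known. In practice they would have to be inferred from control and case sera, which is the calibration step the abstract refers to.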

Conflict of interest statement

The authors have declared that no competing interests exist.

Figures

**Fig 1. Point-estimates of cumulative incidence using the cutoff-based methods and the mixture model.**
Each violin represents 50 *in silico* serosurveys conducted with cohorts of 10,000 virtual individuals and the points represent the median values. (A) Point-estimates of the cumulative incidence. The dashed line indicates the true cumulative incidence we assumed in the simulations. Please note that the scale of the y-axis differs between the sub-figures. (B) Size of the 95% uncertainty intervals. For the Rogan-Gladen and mixture-model estimates, the uncertainty intervals are the 95% confidence intervals, which we calculated with the bootstrap method (see Methods). For the Bayesian estimators, the uncertainty intervals are the 95% credible intervals.

**Fig 2. Estimated fold increases in cumulative incidence for the cutoff-based methods and the mixture model.**
In the simulated serosurveys, we assumed the cumulative incidence to increase from 1.5% to 15%, resulting in a true fold increase of 10 (dashed line). The violins show the distribution of 50 *in silico* serosurveys for both cumulative incidence levels conducted with cohorts of 10,000 individuals and a test with an AUC-ROC value of 0.975. The dots indicate the median value.

**Fig 3. Statistical power of the mixture model.**
In all simulations, the number of control and case data is fixed to 5,000 each and the true cumulative incidence level is 8%. (A) Statistical power versus the number of individuals in the serosurvey for varying levels of test accuracy (AUC-ROC). The power is calculated as the fraction of simulated serosurveys that resulted in a cumulative incidence estimate that is within 25% of the true cumulative incidence and for which the true cumulative incidence level lies within 2 standard deviations of the estimated value. Each point in the graph represents the result of 3,000 *in silico* serosurveys. (B) The minimal number of virtual individuals necessary to obtain a statistical power of 0.9 over a range of AUC-ROC values.
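The power criterion described in the Fig 3 caption can be written down in a few lines. The sketch below is one possible reading, assuming that "within 25%" means a relative deviation from the true value and that each simulated serosurvey supplies its own standard deviation (for instance from a bootstrap). The function name and the toy numbers are hypothetical.

```python
def statistical_power(estimates, std_devs, true_value, rel_tol=0.25, n_sd=2.0):
    """Fraction of simulated serosurveys that satisfy both power criteria."""
    hits = 0
    for est, sd in zip(estimates, std_devs):
        error = abs(est - true_value)
        close_enough = error <= rel_tol * true_value  # estimate within 25% of truth
        covered = error <= n_sd * sd                  # truth within 2 SD of estimate
        hits += close_enough and covered
    return hits / len(estimates)

# Toy input: four hypothetical serosurveys with a true cumulative incidence of 8%.
estimates = [0.080, 0.095, 0.110, 0.070]
std_devs = [0.005, 0.005, 0.020, 0.010]
power = statistical_power(estimates, std_devs, true_value=0.08)
print(power)
```

Here the second survey fails because the truth lies more than 2 standard deviations from the estimate, and the third fails because the estimate deviates by more than 25%, so two of the four surveys count towards the power.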

**Fig 4. Effect of varying the number of control and case sera used to calibrate the serological test.**
(A) An example of the true distribution (solid lines) of the control (grey) and case (orange) sera, the data simulated from those distributions (histograms) and the inferred densities (dashed lines) used in the inference of the cumulative incidence. Here, 150 control and case sera have been simulated and the AUC-ROC value of the test is equal to 0.975. (B) Point estimates of cumulative incidence for various numbers of control and case sera used to calibrate the serological test and three AUC-ROC values. Each violin shows the distribution of the estimated cumulative incidence of 50 *in silico* serosurveys conducted with cohorts of 10,000 virtual individuals. The red line shows the true cumulative incidence we assumed in the simulated serosurveys (8%). (C) Size of the 95% confidence intervals of the estimated cumulative incidences. (D) Statistical power versus the number of control and case sera used in the validation data for varying levels of test accuracy (AUC-ROC). Each point in the graph represents the result of 3,000 *in silico* serosurveys. (E) The minimal number of virtual individuals necessary to obtain a statistical power of 0.9 over a range of number of control and case sera in the validation data.

**Fig 5. Conceptual figure on how a discrepancy between the test validation and serosurvey data can be detected.**
(A) Histograms of simulated validation data from controls and severe cases. (B) Histograms of simulated validation data from controls and severe and asymptomatic cases. (C) Histogram of simulated serosurvey data when all infections in a population are severe. (D) Histogram of simulated serosurvey data when one third of all cases is asymptomatic and two thirds severe.

**Fig 6. Estimates of the cumulative incidence in a population where individuals have been uninfected, as well as asymptomatically and severely infected.**
The x-axes represent the AUC-ROC value between the asymptomatic and severe case distribution. The AUC-ROC value between the control and the severe case distributions is 1. Each violin represents the result of 50 simulated serosurveys with 10,000 individuals per serosurvey. The true total cumulative incidence of severe and asymptomatic infections is 10%, of which 20% are asymptomatic. (A) Cyan violins show estimates of the total cumulative incidence based on an inferred case distribution containing only severe case sera, whereas purple violins show estimates where the case distribution contains both asymptomatic and severe case sera. (B) The estimated cumulative incidence of the mild (light purple) and the severe (dark purple) cases, where the case sera distribution is only based on severe cases, but the likelihood equation also estimates the shape of the asymptomatic cases and their relative prevalence.

**Fig 7. Conceptual diagram of the distribution of the quantitative test measures for control and case sera.**
(A) Hypothetical probability density of quantitative test measures of control sera and three possible case sera distributions. (B) ROC-curves corresponding to the distribution of quantitative test measures of the control sera and each of the possible distributions for the case sera. (C) Visualization of the ‘maximal Youden’ and ‘high specificity’ cutoffs. (D) Visualization of the ‘maximal Youden’ and ‘high specificity’ cutoffs in the ROC curves.
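The quantities named in the Fig 7 caption can be computed empirically from validation sera. The sketch below assumes simulated normal control and case measures and uses hypothetical function names. The "maximal Youden" cutoff maximizes sensitivity + specificity - 1, while the "high specificity" cutoff is taken here as the smallest control value reaching a 99% specificity target, which is an assumption of this example and not necessarily the definition used in the paper.

```python
import math
import random

def sens_spec(controls, cases, cutoff):
    """Sensitivity and specificity when measures above `cutoff` count as positive."""
    sens = sum(x > cutoff for x in cases) / len(cases)
    spec = sum(x <= cutoff for x in controls) / len(controls)
    return sens, spec

def youden(controls, cases, cutoff):
    sens, spec = sens_spec(controls, cases, cutoff)
    return sens + spec - 1.0

def max_youden_cutoff(controls, cases):
    """Observed value that maximizes the Youden index."""
    candidates = sorted(set(controls) | set(cases))
    return max(candidates, key=lambda c: youden(controls, cases, c))

def high_specificity_cutoff(controls, target=0.99):
    """Smallest observed control value at which specificity reaches `target`."""
    ordered = sorted(controls)
    k = math.ceil(target * len(ordered))  # controls that must lie at or below the cutoff
    return ordered[k - 1]

def auc_roc(controls, cases):
    """AUC-ROC as the probability that a case measure exceeds a control measure."""
    wins = sum((c > k) + 0.5 * (c == k) for c in cases for k in controls)
    return wins / (len(cases) * len(controls))

# Hypothetical validation data: 150 control and 150 case sera.
rng = random.Random(7)
controls = [rng.gauss(0.0, 1.0) for _ in range(150)]
cases = [rng.gauss(2.77, 1.0) for _ in range(150)]

c_youden = max_youden_cutoff(controls, cases)
c_high = high_specificity_cutoff(controls)
auc = auc_roc(controls, cases)
print(f"AUC-ROC: {auc:.3f}")
print("maximal Youden:", c_youden, sens_spec(controls, cases, c_youden))
print("high specificity:", c_high, sens_spec(controls, cases, c_high))
```

Either cutoff discards the information in how far a measure lies from the threshold, which is the information the mixture-model approach retains.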

Cited by

Correcting for Antibody Waning in Cumulative Incidence Estimation From Sequential Serosurveys.
Kadelka S, Bouman JA, Ashcroft P, Regoes RR. Am J Epidemiol. 2024 May 7;193(5):777-786. doi: 10.1093/aje/kwad226. PMID: 38012125.
A Bayesian approach to estimating COVID-19 incidence and infection fatality rates.
Slater JJ, Bansal A, Campbell H, Rosenthal JS, Gustafson P, Brown PE. Biostatistics. 2024 Apr 15;25(2):354-384. doi: 10.1093/biostatistics/kxad003. PMID: 36881693.
Estimating SARS-CoV-2 infection probabilities with serological data and a Bayesian mixture model.
Glemain B, de Lamballerie X, Zins M, Severi G, Touvier M, Deleuze JF; SAPRIS-SERO study group; Lapidus N, Carrat F. Sci Rep. 2024 Apr 25;14(1):9503. doi: 10.1038/s41598-024-60060-3. PMID: 38664455.
A Mixture Model for Estimating SARS-CoV-2 Seroprevalence in Chennai, India.
Hitchings MDT, Patel EU, Khan R, Srikrishnan AK, Anderson M, Kumar KS, Wesolowski AP, Iqbal SH, Rodgers MA, Mehta SH, Cloherty G, Cummings DAT, Solomon SS. Am J Epidemiol. 2023 Sep 1;192(9):1552-1561. doi: 10.1093/aje/kwad103. PMID: 37084085.
Estimating cutoff values for diagnostic tests to achieve target specificity using extreme value theory.
Pugh S, Fosdick BK, Nehring M, Gallichotte EN, VandeWoude S, Wilson A. BMC Med Res Methodol. 2024 Feb 8;24(1):30. doi: 10.1186/s12874-023-02139-5. PMID: 38331732.
