Ann Epidemiol. 2023 Oct:86:49-56.e3.
doi: 10.1016/j.annepidem.2023.06.023. Epub 2023 Jul 7.

A flexible matching strategy for matched nested case-control studies

Andrew Ratanatharathorn et al. Ann Epidemiol. 2023 Oct.

Abstract

Purpose: Individual matching in case-control studies improves statistical efficiency over random selection of controls. However, it can introduce selection bias when cases are excluded for lack of appropriate controls, or residual confounding when less strict matching criteria are used. We introduce flex matching, an algorithm that selects controls for cases over multiple rounds of control selection with successively relaxed matching criteria.
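The multi-round idea can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `flex_match`, the caliper values, the one-control-per-case ratio, and sampling without replacement are all assumptions made for illustration, and risk-set (time-at-risk) sampling is ignored for brevity. The final random fallback follows the description in the Figure 1 caption.

```python
import random


def flex_match(cases, controls, calipers, seed=0):
    """Pair each case with one control, relaxing the criteria round by round.

    cases, controls: lists of dicts with an "id" key plus matching variables.
    calipers: one dict per round mapping variable -> maximum absolute
              difference allowed, ordered strictest first (hypothetical values).
    Returns {case_id: (control_id, label)} where label names the round used.
    """
    rng = random.Random(seed)
    available = list(controls)   # assumption: controls are used at most once
    unmatched = list(cases)
    matches = {}

    for round_no, caliper in enumerate(calipers, start=1):
        still_unmatched = []
        for case in unmatched:
            eligible = [
                c for c in available
                if all(abs(c[var] - case[var]) <= tol for var, tol in caliper.items())
            ]
            if eligible:
                chosen = rng.choice(eligible)
                available.remove(chosen)
                matches[case["id"]] = (chosen["id"], f"round {round_no}")
            else:
                still_unmatched.append(case)
        unmatched = still_unmatched

    # Cases still unmatched after the last round receive a random control,
    # so no case is dropped from the study.
    for case in unmatched:
        if not available:
            break
        chosen = rng.choice(available)
        available.remove(chosen)
        matches[case["id"]] = (chosen["id"], "random")
    return matches


# Toy data; the variables and tolerances are illustrative only.
cases = [
    {"id": "c1", "age": 60, "ln_psa": 1.0},
    {"id": "c2", "age": 70, "ln_psa": 2.0},
    {"id": "c3", "age": 50, "ln_psa": 0.0},
]
controls = [
    {"id": "k1", "age": 60, "ln_psa": 1.05},
    {"id": "k2", "age": 73, "ln_psa": 2.4},
    {"id": "k3", "age": 90, "ln_psa": 5.0},
]
calipers = [{"age": 1, "ln_psa": 0.1}, {"age": 5, "ln_psa": 0.5}]

result = flex_match(cases, controls, calipers)
print(result)
```

In this toy example, c1 is matched under the strict criteria, c2 only after relaxation, and c3 falls through to random selection. Recording the round in which each pair was formed makes it possible to check later how many pairs relied on relaxed or random matching.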

Methods: We simulated exposure-disease relationships in multiple cohort data sets with a range of confounding scenarios and conducted 16,800,000 nested case-control studies, comparing random selection of controls, strict matching, and flex matching. We computed average bias and statistical efficiency in estimates of exposure-disease relationships under each matching strategy.
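As a rough illustration of the summary step, the sketch below averages estimates over replicate studies. It assumes each replicate yields a β estimate and its standard error, and that the true β is known from the simulation. The function name `summarize_replicates` and the numbers are hypothetical, and the authors' exact definitions of bias and efficiency may differ.

```python
from statistics import mean, stdev


def summarize_replicates(estimates, std_errors, true_beta):
    """Summarize replicate case-control studies run on one simulated cohort.

    avg_bias:     mean estimated beta minus the simulated (true) beta.
    avg_se:       mean reported standard error (smaller = more efficient).
    empirical_sd: spread of the estimates across replicates.
    """
    return {
        "avg_bias": mean(estimates) - true_beta,
        "avg_se": mean(std_errors),
        "empirical_sd": stdev(estimates),
    }


# Hypothetical replicate results for a single matching strategy.
summary = summarize_replicates(
    estimates=[0.40, 0.50, 0.60],
    std_errors=[0.10, 0.12, 0.14],
    true_beta=0.50,
)
print(summary)
```

Computing the same summary for each matching strategy allows bias and standard errors to be compared side by side.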

Results: On average, flex matching produced the least biased estimates of exposure-disease associations with the smallest standard errors. Strict matching algorithms that excluded cases for whom matched controls could not be identified produced biased estimates with larger standard errors. Estimates from studies with random assignment of controls were relatively unbiased, but the standard errors were larger than from studies using flex matching.

Conclusions: Flex matching should be considered for case-control designs, especially for biomarker studies where matching on technical artifacts is necessary and maximizing efficiency is a priority.

Keywords: Bias; Confounding; Efficiency; Matching; Nested case-control studies.


Conflict of interest statement

Declaration of Competing Interest: The authors declare the following financial interests/personal relationships, which may be considered potential competing interests: Andrew G. Rundle reports financial support was provided by the National Institute of Environmental Health Sciences and the National Cancer Institute. Benjamin A. Rybicki reports financial support was provided by the National Institute of Environmental Health Sciences. Stephen J. Mooney reports financial support was provided by the National Institute of Child Health and Human Development and the EHE International.

Figures

Figure 1.
Iterative flex matching algorithm for matching cases and controls over X rounds with relaxed matching criteria. Investigators begin with their preferred matching criteria, and remaining unmatched cases are paired to controls with relaxed matching criteria over X rounds. Finally, for cases who remain without matched controls after X rounds of matching, controls are randomly selected.
Figure 2.
Flex matching algorithm used to create 14 sets of 50 nested case-control studies from each simulated cohort. The first set of 50 case-control studies per cohort used the strict matching criteria, and the remaining cases then had controls randomly assigned to them. Each subsequent set of case-control studies applied an increasing number of rounds of matching criteria relaxation to identify matchable controls before the remaining cases had controls randomly assigned to them.
Figure 3.
β-coefficients of exposure averaged over 50 case-control studies of 500 simulated exposures, with varying strengths of association of the exposure with age and ln(PSA).
Figure 4.
Standard errors for the β-coefficient of exposure averaged over 50 case-control studies of 500 simulated exposures, with varying strengths of association of the exposure with age and ln(PSA).
Figure 5.
Box plot of estimated β-coefficients from 50 case-control studies conducted for 500 simulated exposures with an odds ratio of age on exposure of 1.01 and an odds ratio of ln(PSA) on exposure of 1.5.
Figure 6.
Box plot of estimated standard errors for the β-coefficients from 50 case-control studies conducted for 500 simulated exposures with an odds ratio of age on exposure of 1.01 and an odds ratio of ln(PSA) on exposure of 1.5.


