BMC Med Res Methodol. 2023 Dec 7;23(1):287.
doi: 10.1186/s12874-023-02090-5.

On the use of multiple imputation to address data missing by design as well as unintended missing data in case-cohort studies with a binary endpoint

Melissa Middleton et al. BMC Med Res Methodol.

Abstract

Background: Case-cohort studies are conducted within cohort studies, with the defining feature that collection of exposure data is limited to a subset of the cohort, leading to a large proportion of missing data by design. Standard analysis uses inverse probability weighting (IPW) to address this intended missing data, but little research has been conducted into how best to perform analysis when there is also unintended missingness. Multiple imputation (MI) has become a default standard for handling unintended missingness and is typically used in combination with IPW, which handles the intended missingness due to the case-cohort sampling. Alternatively, MI could be used to handle both the intended and unintended missingness. While the performance of an MI-only approach has been investigated in the context of a case-cohort study with a time-to-event outcome, it is unclear how this approach performs with a binary outcome.
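To make the role of IPW for the intended missingness concrete, the following is a minimal sketch and not the authors' analysis. It assumes a hypothetical cohort with a binary exposure and a true risk ratio of 2, retains exposure data for all cases plus a random 10% subcohort, and weights each retained non-case by the inverse of the subcohort sampling fraction. All function names, parameter values, and the simple weighted risk ratio estimator are illustrative assumptions; unintended missingness and MI are not covered here.

```python
import random


def simulate_cohort(n, seed=0):
    """Generate a hypothetical cohort of (exposure, outcome) pairs."""
    rng = random.Random(seed)
    cohort = []
    for _ in range(n):
        x = 1 if rng.random() < 0.3 else 0   # assumed exposure prevalence
        risk = 0.10 if x else 0.05           # assumed true risk ratio of 2
        y = 1 if rng.random() < risk else 0
        cohort.append((x, y))
    return cohort


def case_cohort_sample(cohort, subcohort_fraction, seed=1):
    """Keep exposure data for all cases and a random subcohort of non-cases.

    Returns (exposure, outcome, weight) triples, where the weight is the
    inverse of the probability that exposure was collected.
    """
    rng = random.Random(seed)
    sample = []
    for x, y in cohort:
        in_subcohort = rng.random() < subcohort_fraction
        if y == 1:
            sample.append((x, y, 1.0))  # cases are selected with probability 1
        elif in_subcohort:
            sample.append((x, y, 1.0 / subcohort_fraction))
    return sample


def weighted_risk_ratio(sample):
    """Risk ratio (exposed vs unexposed) from weighted counts."""
    totals = {0: 0.0, 1: 0.0}
    events = {0: 0.0, 1: 0.0}
    for x, y, w in sample:
        totals[x] += w
        events[x] += w * y
    return (events[1] / totals[1]) / (events[0] / totals[0])


cohort = simulate_cohort(200_000)
sample = case_cohort_sample(cohort, subcohort_fraction=0.1)

rr_full = weighted_risk_ratio([(x, y, 1.0) for x, y in cohort])      # full cohort
rr_ipw = weighted_risk_ratio(sample)                                 # IPW-weighted
rr_naive = weighted_risk_ratio([(x, y, 1.0) for x, y, _ in sample])  # unweighted

print(f"full cohort: {rr_full:.2f}, IPW: {rr_ipw:.2f}, unweighted: {rr_naive:.2f}")
```

In this sketch the unweighted estimate is pulled toward the null because cases are over-represented among participants with exposure data, whereas the weighted estimate should land close to the full-cohort value.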

Methods: We conducted a simulation study to assess and compare the performance of approaches using only MI, only IPW, and a combination of MI and IPW, for handling intended and unintended missingness in the case-cohort setting. We also applied the approaches to a case study.

Results: Our results show that the combined approach is approximately unbiased for estimation of the exposure effect when the sample size is large and is the least biased approach when the sample size is small, while the MI-only and IPW-only approaches exhibit larger biases in both sample size settings.

Conclusions: These findings suggest that a combined MI/IPW approach should be preferred to handle intended and unintended missing data in case-cohort studies with binary outcomes.

Keywords: Case-cohort study; Inverse probability weighting; Missing data; Multiple imputation; Simulation study.

Conflict of interest statement

The authors declare no competing interests.

Figures

Fig. 1. Missingness directed acyclic graph (m-DAG) depicting the assumed causal relationships between generated variables and their missingness indicators. Dashed lines represent associations present under the dependent missing data mechanism, but not the independent missing data mechanism.
Fig. 2. Relative bias (%) in the estimated coefficient for the target parameter across the 26 simulated scenarios.
Fig. 3. Empirical standard error for the target parameter for each of the 26 simulated scenarios.
Fig. 4. Coverage probability of the 95% confidence interval for each of the 26 simulated scenarios.
Fig. 5. Estimated risk ratio and 95% confidence interval for the adjusted association between food allergy and vitamin D insufficiency estimated using the case study data.