Stat Methods Med Res. 2022 Oct;31(10):1860-1880.
doi: 10.1177/09622802221102623. Epub 2022 Jun 5.

Multiple imputation for cause-specific Cox models: Assessing methods for estimation and prediction

Edouard F Bonneville et al. Stat Methods Med Res. 2022 Oct.

Abstract

In studies analyzing competing time-to-event outcomes, interest often lies in both estimating the effects of baseline covariates on the cause-specific hazards and predicting cumulative incidence functions. When missing values occur in these baseline covariates, the affected observations may be discarded as part of a complete-case analysis, or the missing values may be multiply imputed. In the latter case, the imputations may be performed either compatibly with a substantive model pre-specified as a cause-specific Cox model [substantive model compatible fully conditional specification (SMC-FCS)], or approximately so [multivariate imputation by chained equations (MICE)]. In a large simulation study, we assessed the performance of these three methods in terms of estimating cause-specific regression coefficients and predicting cumulative incidence functions. Concerning regression coefficients, results provide further support for use of SMC-FCS over MICE, particularly when covariate effects are large and the baseline hazards of the competing events are substantially different. Complete-case analysis also shows adequate performance in settings where missingness is not outcome dependent. With regard to cumulative incidence prediction, SMC-FCS and MICE perform more similarly, as also evidenced in the illustrative analysis of competing outcomes following a hematopoietic stem cell transplantation. The findings are discussed alongside recommendations for practising statisticians.
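
For orientation, the two quantities the abstract refers to can be written in their standard textbook form. This is a sketch under assumed notation, not the paper's own display: k indexes the competing events, X and Z are baseline covariates as in the figure captions, and the coefficient symbol gamma_k is a hypothetical label introduced here.

```latex
% Cause-specific Cox model for event k (assumed notation)
h_k(t \mid X, Z) = h_{k0}(t)\,\exp\left(\beta_k X + \gamma_k Z\right)

% Cumulative incidence function for event k; it depends on the
% cause-specific hazards of all competing events through the
% cumulative hazards H_j
F_k(t \mid X, Z) = \int_0^t h_k(u \mid X, Z)\,
  \exp\Bigl(-\sum_{j} H_j(u \mid X, Z)\Bigr)\,du,
\qquad H_j(u \mid X, Z) = \int_0^u h_j(s \mid X, Z)\,ds
```

Because each cumulative incidence function involves every cause-specific hazard, an imputation model has to respect all of the competing-event models jointly, which is the sense in which imputations can be compatible or only approximately compatible with the substantive model.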

Keywords: Competing risks; Cox model; cause-specific hazards; missing covariates; multiple imputation; substantive model compatible imputation.
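
After multiple imputation, the substantive model is fitted in each imputed dataset and the results are combined. A minimal sketch of that pooling step using Rubin's rules is given below. The function name `pool_rubin` and the input values are hypothetical illustrations, not taken from the paper, and the sketch pools a single log hazard ratio rather than a full coefficient vector.

```python
import math
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool one coefficient across m imputed datasets with Rubin's rules.

    estimates: per-imputation point estimates (e.g. log hazard ratios)
    variances: per-imputation squared standard errors
    Returns the pooled estimate and its pooled standard error.
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two imputations and matching lengths")
    pooled = mean(estimates)            # average of the point estimates
    within = mean(variances)            # average within-imputation variance
    between = variance(estimates)       # between-imputation variance (m - 1 denominator)
    total = within + (1 + 1 / m) * between
    return pooled, math.sqrt(total)


# Hypothetical log hazard ratios from three imputed datasets
log_hr, se = pool_rubin([0.5, 0.6, 0.7], [0.01, 0.01, 0.01])
hazard_ratio = math.exp(log_hr)
```

The between-imputation term is what carries the extra uncertainty due to the missing covariate values; ignoring it and averaging the standard errors alone would understate the pooled standard error.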

Conflict of interest statement

Declaration of conflicting interests: The authors declare that there is no conflict of interest.

Figures

Figure 1.
Bias in β1 for MAR mechanism with continuous X. Each cluster of points corresponds to a scenario defined by the step functions at the bottom of the plot. Each step represents a level of a factor being varied and is read from left to right (e.g. for Hazard shapes, the first step is ‘similar’ while the second is ‘different’). Monte-Carlo standard errors of bias for all scenarios were below 0.008. Mech.: missingness mechanism; MAR: missing at random.
Figure 2.
Bias in β1 for MAR-T mechanism with continuous X. Monte-Carlo standard errors of bias for all scenarios were below 0.008. Refer to Figure 1 for a description of how to read this type of plot. Mech.: missingness mechanism; MAR-T: outcome-dependent missing at random.
Figure 3.
RMSE of 5-year REL and NRM probabilities with {X,Z}={1,1} for MAR with 50% missing values. Monte-Carlo standard errors of RMSE for all scenarios were below 0.002. Refer to Figure 1 for a description of how to read this type of plot. Mech.: missingness mechanism; RMSE: root mean square error; REL: relapse; MAR: missing at random; NRM: non-relapse mortality.
Figure 4.
Forest plot with point estimates and 95% confidence intervals for the cause-specific Cox model for relapse. On the x-axis are the hazard ratios, which are plotted on the log scale, where the confidence intervals are symmetric. Variables and their descriptions can be found in the data dictionary. Per level of factor and for continuous variables, we show the observed counts (n) and the number of relapse events (# Events) in the full dataset.
Figure 5.
Forest plot with point estimates and 95% confidence intervals for the cause-specific Cox model for non-relapse mortality (NRM). On the x-axis are the hazard ratios, which are plotted on the log scale, where the confidence intervals are symmetric. Variables and their descriptions can be found in the data dictionary. Per level of factor and for continuous variables, we show the observed counts (n) and the number of NRM events (# Events) in the full dataset.

