Stat Med. 2018 Nov 10;37(25):3661-3678.
doi: 10.1002/sim.7842. Epub 2018 Jul 16.

Multiple imputation in Cox regression when there are time-varying effects of covariates

Ruth H Keogh et al. Stat Med. 2018.

Abstract

In Cox regression, it is important to test the proportional hazards assumption, and it is sometimes of interest in itself to study time-varying effects (TVEs) of covariates. TVEs can be investigated with log hazard ratios modelled as a function of time. Missing data on covariates are common, and multiple imputation is a popular approach to handling them that avoids the potential bias and efficiency loss resulting from a "complete-case" analysis. Two multiple imputation methods have been proposed for when the substantive model is a Cox proportional hazards regression: an approximate method ("Imputing missing covariate values for the Cox model", Statistics in Medicine (2009), by White and Royston) and a substantive-model-compatible method ("Multiple imputation of covariates by fully conditional specification: accommodating the substantive model", Statistical Methods in Medical Research (2015), by Bartlett et al). At present, neither accommodates TVEs of covariates. We extend them to do so for a general form for the TVEs and give specific details for TVEs modelled using restricted cubic splines. Simulation studies assess the performance of the methods under several underlying shapes for TVEs. Our proposed methods give approximately unbiased TVE estimates for binary covariates with missing data, but for continuous covariates, the substantive-model-compatible method performs better. The methods also give approximately correct type I errors in the test for proportional hazards when there is no TVE and gain power to detect TVEs relative to complete-case analysis. Ignoring TVEs at the imputation stage results in biased TVE estimates, incorrect type I errors, and substantial loss of power in detecting TVEs. We also propose a multivariable TVE model selection algorithm. The methods are illustrated using data from the Rotterdam Breast Cancer Study. R code is provided.
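
To make the idea of a log hazard ratio modelled as a function of time concrete, the following is a minimal Python sketch of a restricted cubic spline TVE. It is not the authors' R code. It assumes the common parameterisation in which the spline is linear before the first knot and after the last knot, applied directly to time; the function names, knot positions, and coefficient values are hypothetical and chosen only for illustration.

```python
def rcs_basis(t, knots):
    """Restricted cubic spline basis at time t.

    Returns [t, s_1(t), ..., s_{k-2}(t)] for k knots. The nonlinear terms
    are zero before the first knot and linear after the last knot.
    """
    k = len(knots)
    if k < 3:
        raise ValueError("need at least 3 knots")
    t_last, t_pen = knots[-1], knots[-2]
    d = t_last - t_pen
    # Scaling by the squared knot range keeps the terms on a scale similar to t.
    scale = (knots[-1] - knots[0]) ** 2

    def pos3(u):
        # Truncated cubic: (u)_+^3
        return max(u, 0.0) ** 3

    basis = [t]
    for tj in knots[:-2]:
        term = (
            pos3(t - tj)
            - pos3(t - t_pen) * (t_last - tj) / d
            + pos3(t - t_last) * (t_pen - tj) / d
        )
        basis.append(term / scale)
    return basis


def log_hazard_ratio(t, coefs, knots):
    """beta(t) = b0 + b1 * t + sum_j b_{j+1} * s_j(t).

    coefs holds k - 1 + 1 values for k knots: an intercept b0 followed by
    one coefficient per basis term. Proportional hazards corresponds to
    every coefficient except b0 being zero.
    """
    b0, rest = coefs[0], coefs[1:]
    return b0 + sum(b * s for b, s in zip(rest, rcs_basis(t, knots)))


# Hypothetical example: 5 knots (years) and made-up coefficients.
knots = [1.0, 2.0, 4.0, 6.0, 9.0]
coefs = [0.8, -0.1, 0.5, -0.9, 0.4]
curve = [(t, log_hazard_ratio(t, coefs, knots)) for t in (0.5, 2.0, 5.0, 10.0)]
```

Under this parameterisation a test of proportional hazards for one covariate amounts to a joint test that all coefficients other than `b0` are zero, which is why the number of knots determines the degrees of freedom of that test.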

Keywords: Cox regression; missing data; multiple imputation; restricted cubic spline; time-varying effect.

Figures

Figure 1
Time‐varying effect (TVE) functions used in simulation studies
Figure 2
Curve‐wise estimates of TVEs for covariate X1 in the setting with binary covariates X1 and X2. The dotted black line indicates the true curve. MI, multiple imputation; SMC, substantive‐model‐compatible; TVE, time‐varying effect [Colour figure can be viewed at http://wileyonlinelibrary.com]
Figure 3
Curve‐wise estimates of TVEs for covariate X1 in the setting with continuous covariates X1 and X2. The dotted black line indicates the true curve. MI, multiple imputation; SMC, substantive‐model‐compatible; TVE, time‐varying effect [Colour figure can be viewed at http://wileyonlinelibrary.com]
Figure 4
Results from additional simulations. The plots show the curve‐wise estimates of TVEs for covariate X1. The dotted black line indicates the true curve. The tables show the percentage of simulations in which the null hypotheses of proportional hazards for X1 and X2 were rejected using joint Wald tests. A, Scenario 4, binary X1 and X2: the probability of missingness in X1 and X2 depends on D; B, Scenario 4, binary X1 and X2: 50% of individuals had the event; C, Scenario 2, continuous X1 and X2: the proportion of individuals missing X1 or X2 was reduced to 20%. MI, multiple imputation; SMC, substantive‐model‐compatible; TVE, time‐varying effect [Colour figure can be viewed at http://wileyonlinelibrary.com]
Figure 5
Results from the Rotterdam Breast Cancer Study. Plots showing estimated log hazard ratios as a function of time from a complete‐data analysis, complete‐case analysis, and MI‐TVE‐SMC analysis. The time‐varying effects for all covariates were modelled using a restricted cubic spline with 5 knots. Shaded areas show 95% confidence intervals. Results are shown up to time 10. MI, multiple imputation; SMC, substantive‐model‐compatible; TVE, time‐varying effect [Colour figure can be viewed at http://wileyonlinelibrary.com]

References

    1. Cox DR. Regression models and life tables. J R Stat Soc Ser B. 1972;34(2):187‐202.
    2. Cox DR. Partial likelihood. Biometrika. 1975;62(2):269‐276.
    3. Rubin DB. Multiple Imputation for Nonresponse in Surveys. Hoboken, NJ: John Wiley & Sons; 1987.
    4. Carpenter JR, Kenward MG. Multiple Imputation and its Application. Chichester, UK: John Wiley & Sons; 2013.
    5. White IR, Royston P. Imputing missing covariate values for the Cox model. Statist Med. 2009;28(15):1982‐1998.
