BMC Med Res Methodol. 2020 May 29;20(1):134.
doi: 10.1186/s12874-020-01018-7.

How are missing data in covariates handled in observational time-to-event studies in oncology? A systematic review

Orlagh U Carroll et al. BMC Med Res Methodol.

Abstract

Background: Missing data in covariates can result in biased estimates and loss of power to detect associations. It can also lead to other challenges in time-to-event analyses including the handling of time-varying effects of covariates, selection of covariates and their flexible modelling. This review aims to describe how researchers approach time-to-event analyses with missing data.

Methods: Medline and Embase were searched for observational time-to-event studies in oncology published from January 2012 to January 2018. The review focused on proportional hazards models or extended Cox models. We investigated the extent and reporting of missing data and how it was addressed in the analysis. Covariate modelling and selection, and assessment of the proportional hazards assumption were also investigated, alongside the treatment of missing data in these procedures.

Results: In total, 148 studies were included. The mean proportion of individuals with missingness in any covariate was 32%. Complete-case analysis was used in 53% of studies and multiple imputation in 22%. In total, 14% of studies stated an assumption concerning missing data and only 34% stated missingness as a limitation. The proportional hazards assumption was checked in 28% of studies, of which 17% did not state the assessment method. Of 144 multivariable models, 58% stated their covariate selection procedure; use of a pre-selected set of covariates was the most popular, followed by stepwise methods and univariable analyses. Of 69 studies that included continuous covariates, 81% did not assess the appropriateness of the functional form.
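To make the two headline quantities concrete, the following is a minimal sketch of how the proportion of individuals with missingness in any covariate, and the resulting complete-case sample, could be computed. The function names and the toy covariate rows are hypothetical and are not taken from the reviewed studies; missing values are assumed to be coded as `None`.

```python
from typing import List, Optional, Sequence

# One individual's covariate values; None marks a missing value (an assumption).
Row = Sequence[Optional[float]]


def proportion_with_any_missing(rows: Sequence[Row]) -> float:
    """Share of individuals with at least one missing covariate."""
    if not rows:
        raise ValueError("rows must not be empty")
    incomplete = sum(1 for row in rows if any(value is None for value in row))
    return incomplete / len(rows)


def complete_cases(rows: Sequence[Row]) -> List[Row]:
    """Keep only individuals with every covariate observed."""
    return [row for row in rows if all(value is not None for value in row)]


# Hypothetical data: (age, stage, smoker) for five individuals.
example_rows = [
    (61.0, 2.0, 1.0),
    (54.0, None, 0.0),
    (70.0, 3.0, None),
    (48.0, 1.0, 1.0),
    (66.0, None, None),
]

print(proportion_with_any_missing(example_rows))
print(len(complete_cases(example_rows)))
```

Because an individual is dropped if any single covariate is missing, the complete-case sample can shrink much faster than the per-covariate missingness rates suggest.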

Conclusion: While guidelines for handling missing data in epidemiological studies are in place, this review indicates that few studies report implementing these recommendations in practice. Although missing data are present in many studies, we found that few state clearly how they were handled or what assumptions were made. Easy-to-implement but potentially biased approaches such as complete-case analysis are most commonly used, even though they rely on strong assumptions and more appropriate methods should often be employed. Authors should be encouraged to follow existing guidelines to address missing data, and higher expectations from journals and editors could help improve practice.
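As an illustration of what multiple imputation adds over complete-case analysis, the sketch below pools a coefficient estimated on several imputed datasets using Rubin's rules. It is a generic example, not the procedure of any reviewed study: the function name `pool_rubin` and the example log hazard ratios are hypothetical, and the imputation and model-fitting steps are assumed to have been done elsewhere.

```python
from statistics import mean, variance
from typing import Sequence, Tuple


def pool_rubin(estimates: Sequence[float],
               variances: Sequence[float]) -> Tuple[float, float]:
    """Pool one coefficient across m imputed datasets with Rubin's rules.

    estimates: the coefficient from each imputed dataset
    variances: the squared standard error from each imputed dataset
    Returns (pooled estimate, total variance).
    """
    m = len(estimates)
    if m < 2 or len(variances) != m:
        raise ValueError("need at least two imputations and matching lengths")
    pooled = mean(estimates)       # average of the per-imputation estimates
    within = mean(variances)       # average within-imputation variance
    between = variance(estimates)  # between-imputation variance (m - 1 divisor)
    total = within + (1 + 1 / m) * between
    return pooled, total


# Hypothetical log hazard ratios and variances from three imputed datasets.
pooled_estimate, total_variance = pool_rubin([0.40, 0.50, 0.60],
                                             [0.04, 0.04, 0.04])
print(pooled_estimate, total_variance)
```

The between-imputation term is what carries the uncertainty due to the missing values; an analysis that treats a single imputed dataset as observed data would omit it.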

Keywords: Epidemiology; Missing data; Multiple imputation; Observational studies; Oncology; Survival; Time-to-event.

Conflict of interest statement

The authors declare that they have no competing interests.

Figures

Fig. 1
Flowchart of the inclusion process for studies into the review [10]
Fig. 2
Breakdown of complete-case (CC) usage. The initial phase refers to those who used complete-case analysis when determining inclusion/exclusion of individuals to the study population
Fig. 3
Breakdown of multiple imputation (MI) usage. Footnotes: (1) 2 studies did not specify the type of multivariate MI model used; similarly, 1 did not for univariate. (2) 1 study ensured the sample size stayed the same for different models. (3) 3 studies did not clearly state that they were using complete-case analysis

References

    1. Rubin DB. Multiple Imputation for Nonresponse in Surveys. United States of America: Wiley; 1987.
    2. Little RJA, Rubin DB. Statistical Analysis with Missing Data, 2nd edn. United States of America: Wiley; 2002.
    3. White IR, Royston P. Imputing missing covariate values for the Cox model. Stat Med. 2009;28(15):1982–98.
    4. Bartlett JW, Seaman SR, White IR, Carpenter JR. Multiple imputation of covariates by fully conditional specification: Accommodating the substantive model. Stat Methods Med Res. 2015;24(4):462–87.
    5. Keogh RH, Morris TP. Multiple imputation in Cox regression when there are time-varying effects of covariates. Stat Med. 2018;37(25):3661–78.
