Am J Ther. 2013 Sep-Oct;20(5):514-9.
doi: 10.1097/MJT.0b013e3181ff7a7b.

Clinical research: a novel approach to regression substitution for handling missing data

Eugene P E M Cleophas et al. Am J Ther. 2013 Sep-Oct.

Abstract

In clinical research, missing data are common. Imputed data are not real data but constructed values that should increase the sensitivity of testing. Regression substitution for the purpose of data imputation has often failed to provide better sensitivity than other methods. The objective of this study was to compare different methods of missing data imputation with regression substitution, taking into account particular quality measures. A real data example with a 105-value file was used. After randomly removing 5 values from the file, mean imputation and hot deck imputation were compared with regression substitution, subject to the following requirements: (1) at least 2 independent variables be present in the equation, (2) no more than 1 datum per patient be missing, (3) no more than 5% of the data be missing, (4) if more than 5% of the data are missing, 5% be randomly chosen for regression substitution and the remainder deleted, (5) only statistically significant variables be present in the regression model, and (6) no random errors be added to the imputed data. The test statistics after regression substitution were much better than those after the other 2 methods, with F-values of 44.1 vs 29.4 and 30.1, and t-values of 7.6 vs 5.6 and 5.7, and 3.0 vs 1.7 and 1.8. We conclude that regression substitution is a very sensitive method for imputing missing data, provided particular quality measures are taken into account.
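The three imputation methods compared above can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the nearest-neighbour rule used for the hot deck donor, and the toy data are all assumptions made for illustration, and the study's 105-value file is not reproduced. The regression step uses 2 independent variables and adds no random error, in line with requirements (1) and (6).

```python
from statistics import mean


def mean_impute(y):
    """Replace each missing outcome (None) with the mean of the observed outcomes."""
    m = mean(v for v in y if v is not None)
    return [m if v is None else v for v in y]


def hot_deck_impute(y, X):
    """Replace each missing outcome with that of the closest complete case.

    Assumption: the donor is the nearest neighbour by squared Euclidean
    distance on the predictors; other donor rules are possible.
    """
    donors = [(x, v) for x, v in zip(X, y) if v is not None]

    def nearest(x):
        return min(donors, key=lambda d: sum((a - b) ** 2 for a, b in zip(d[0], x)))[1]

    return [nearest(x) if v is None else v for x, v in zip(X, y)]


def solve(A, b):
    """Solve the linear system A z = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [a - f * c for a, c in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]


def regression_impute(y, X):
    """Fit least squares on the complete cases and substitute the predictions.

    No random error is added to the imputed values.
    """
    rows = [[1.0] + list(x) for x, v in zip(X, y) if v is not None]
    obs = [v for v in y if v is not None]
    k = len(rows[0])
    # Normal equations: (X'X) beta = X'y
    XtX = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    Xty = [sum(r[i] * v for r, v in zip(rows, obs)) for i in range(k)]
    beta = solve(XtX, Xty)

    def predict(x):
        return beta[0] + sum(b * a for b, a in zip(beta[1:], x))

    return [predict(x) if v is None else v for x, v in zip(X, y)]


# Hypothetical toy data: the outcome follows y = 1 + 2*x1 + 3*x2 exactly,
# and the fourth outcome (true value 18) has been removed.
X = [(1.0, 2.0), (2.0, 1.0), (3.0, 4.0), (4.0, 3.0), (5.0, 6.0), (6.0, 5.0)]
y = [9.0, 8.0, 19.0, None, 29.0, 28.0]

by_mean = mean_impute(y)
by_hot_deck = hot_deck_impute(y, X)
by_regression = regression_impute(y, X)
```

Because the toy outcome is an exact linear function of the two predictors, regression substitution recovers the removed value almost exactly, while the other two methods do not use that relationship. Real clinical data are noisy, so this only illustrates the mechanics and says nothing about the size of the effect reported in the abstract.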
