Stat Med. 2008 Jul 10;27(15):2826-49.
doi: 10.1002/sim.3111.

Imputation-based strategies for clinical trial longitudinal data with nonignorable missing values

Xiaowei Yang et al. Stat Med.

Abstract

Biomedical research is plagued by missing data, especially in clinical trials of medical and behavioral therapies that adopt a longitudinal design. After a literature review on modeling incomplete longitudinal data based on full-likelihood functions, this paper proposes a set of imputation-based strategies for implementing selection, pattern-mixture, and shared-parameter models to handle intermittent missing values and dropouts that are potentially nonignorable according to various criteria. Within the framework of multiple partial imputation, intermittent missing values are first imputed several times; each partially imputed data set is then analyzed to deal with dropouts, with or without further imputation. Depending on the choice of imputation model or measurement model, various strategies can be applied jointly to the same data set to study the effect of a treatment or intervention from several perspectives. For illustration, the strategies were applied to a data set with continuous repeated measures from a smoking cessation clinical trial.
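The first stage of multiple partial imputation described in the abstract can be sketched as follows. This is only an illustration under stated assumptions, not the authors' imputation model: the function names are hypothetical, and the gap-filling rule (linear interpolation between neighbouring observed visits plus Gaussian noise) is a simple stand-in for whatever imputation model is actually chosen. Missing values are represented as None.

```python
import random


def last_observed(series):
    """Index of the last observed visit, or -1 if nothing was observed."""
    for i in range(len(series) - 1, -1, -1):
        if series[i] is not None:
            return i
    return -1


def partially_impute(series, rng, noise_sd=0.1):
    """Fill intermittent gaps only; visits after dropout stay None.

    ASSUMPTION: interpolation plus noise is a placeholder imputation model.
    """
    out = list(series)
    end = last_observed(series)
    observed = [i for i in range(end + 1) if series[i] is not None]
    for i in range(end + 1):
        if series[i] is not None:
            continue
        # A later observed visit always exists because i < end.
        right = next(j for j in observed if j > i)
        earlier = [j for j in observed if j < i]
        if earlier:
            left = earlier[-1]
            w = (i - left) / (right - left)
            centre = (1 - w) * series[left] + w * series[right]
        else:
            # Gap before the first observation: use the next observed value.
            centre = series[right]
        out[i] = centre + rng.gauss(0.0, noise_sd)
    return out


def multiple_partial_imputation(data, m=5, seed=0):
    """Return m partially imputed copies of data (a list of subject series)."""
    rng = random.Random(seed)
    return [[partially_impute(s, rng) for s in data] for _ in range(m)]
```

Each of the m returned data sets still contains None after a subject's dropout time, so a second-stage analysis (for example, one of the selection, pattern-mixture, or shared-parameter models named above) would be run on every copy to handle the dropouts.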

Figures

Figure 1
The average and SD curves for the log-scaled carbon monoxide levels. The plot shows the mean curve of the log-scaled carbon monoxide levels, with the corresponding pointwise standard errors, for each of the four treatment conditions: Control, RP-only, CM-only, and RP+CM (RP = relapse prevention, CM = contingency management). Vertical bars indicate the estimated standard errors of the average carbon monoxide levels. The stars (‘*’) above the x-axis mark the time points (i.e. visit numbers) at which a pointwise ANOVA indicates that the carbon monoxide levels differ significantly (p-value < 0.001). The y-axis shows carbon monoxide levels after the log(1 + x) transform. The x-axis shows the clinic visit number for study participants (1, …, 36; three visits per week).
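As a rough illustration of how the quantities plotted in Figure 1 could be computed, the sketch below applies the log(1 + x) transform and returns a mean and standard error per visit, skipping missing values. The function name and the data layout (one list per subject, None for a missing visit) are assumptions made for this example; the pointwise ANOVA is not reproduced here.

```python
import math
import statistics


def pointwise_mean_se(co_levels):
    """Per-visit (mean, standard error) of log(1 + x) carbon monoxide levels.

    co_levels: list of per-subject lists of equal length; None marks a
    missing visit (hypothetical layout for this sketch).
    """
    n_visits = len(co_levels[0])
    summary = []
    for t in range(n_visits):
        values = [math.log1p(s[t]) for s in co_levels if s[t] is not None]
        if len(values) < 2:
            # A standard error needs at least two observations.
            summary.append((values[0] if values else None, None))
            continue
        mean = statistics.fmean(values)
        se = statistics.stdev(values) / math.sqrt(len(values))
        summary.append((mean, se))
    return summary
```

Running this once per treatment condition would give the four mean curves and their vertical error bars.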
Figure 2
Missingness patterns for the carbon monoxide levels across treatment conditions. For each treatment condition, an image depicts the missingness indicators of carbon monoxide levels for each smoker at each research visit. Dark areas indicate that the corresponding carbon monoxide levels were observed, while white areas indicate that the corresponding data were missing, either intermittently or after dropout. The four treatment conditions are Control, RP-only, CM-only, and RP+CM (RP = relapse prevention, CM = contingency management).
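The distinction between intermittent missingness and dropout that Figure 2 relies on can be made concrete with a small sketch. It assumes one common convention, that every missing visit after a subject's last observed visit counts as dropout and every earlier gap counts as intermittent; the function name and labels are hypothetical.

```python
def classify_visits(series):
    """Label each visit as 'observed', 'intermittent', or 'dropout'.

    series: one subject's measurements in visit order, None = missing.
    ASSUMPTION: dropout means every visit after the last observed one.
    """
    last = max((i for i, v in enumerate(series) if v is not None), default=-1)
    labels = []
    for i, value in enumerate(series):
        if value is not None:
            labels.append("observed")
        elif i < last:
            labels.append("intermittent")
        else:
            labels.append("dropout")
    return labels
```

Under this convention a subject with no observed visits at all is labelled as dropout throughout.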
Figure 3
Mean carbon monoxide levels for completers and early terminators. The 174 smokers are divided into two groups: completers (n1 = 112) and early terminators (n2 = 62). Within each group, the mean curves of carbon monoxide levels are depicted for subjects receiving CM (contingency management) and for subjects receiving no CM.
Figure 4
Plate 1. Pattern-dependent distribution of carbon monoxide levels. Using the software package ‘MPI 2.0’, profiles and mean curves of carbon monoxide levels are drawn within each of the five groups determined by dropout time: dropout at or before week 5, 7, 9, 11, and 12. In the plots, green curves show the mean carbon monoxide levels of subjects who received CM (contingency management), red curves show the mean curves of subjects who did not receive CM, and gray dashed lines depict the profiles of all subjects within each group. The bottom-right plot depicts all the mean profiles corresponding to the five dropout patterns.
